Re: [HACKERS] JSON for PG 9.2
On Tue, Dec 13, 2011 at 1:13 PM, Merlin Moncure mmonc...@gmail.com wrote:
> Mozilla SpiderMonkey seems like a good fit: it compiles to a dependency-free
> .so, has excellent platform support, has a stable C API, and while it's C++
> internally it makes no use of exceptions (in fact, it turns them off in the
> C++ compiler). ISTM to be a suitable foundation for an external module, an
> 'in core' parser, a PL, or anything really.

When I started to think about PL/js, I compared three engines: SpiderMonkey, SquirrelFish, and V8. SpiderMonkey at that time (around 2009) was neither fast nor small in its memory footprint, although the points you raise, as well as its advanced features like JS 1.7 (pure yield!), were attractive. SpiderMonkey was also a little harder to build on arbitrary platforms (including Windows) than V8. SquirrelFish was the fastest of the three, but it is tightly coupled to WebKit and is also hard to build on its own. Dunno how they've changed since then.

Thanks,
--
Hitoshi Harada
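For readers who haven't embedded it, here is a minimal sketch of what driving SpiderMonkey through its C API looks like, written against the 1.8.5-era JSAPI; exact entry points and the class-stub layout vary by release, so treat this as illustrative rather than authoritative:

    #include <string.h>
    #include "jsapi.h"

    /* The global object class every embedding must supply. */
    static JSClass global_class = {
        "global", JSCLASS_GLOBAL_FLAGS,
        JS_PropertyStub, JS_PropertyStub, JS_PropertyStub, JS_StrictPropertyStub,
        JS_EnumerateStub, JS_ResolveStub, JS_ConvertStub, JS_FinalizeStub,
        JSCLASS_NO_OPTIONAL_MEMBERS
    };

    int main(void)
    {
        JSRuntime  *rt = JS_NewRuntime(8L * 1024L * 1024L);
        JSContext  *cx = JS_NewContext(rt, 8192);
        JSObject   *global = JS_NewCompartmentAndGlobalObject(cx, &global_class, NULL);
        jsval       rval;
        const char *src = "JSON.parse('{\"a\":1}').a + 1";

        JS_InitStandardClasses(cx, global);   /* gives us Object, JSON, ... */
        JS_EvaluateScript(cx, global, src, strlen(src), "inline", 1, &rval);

        JS_DestroyContext(cx);
        JS_DestroyRuntime(rt);
        JS_ShutDown();
        return 0;
    }

The appeal for a JSON parser or PL is that the whole thing links against one shared library with no further runtime dependencies.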
Re: [HACKERS] review: CHECK FUNCTION statement
Pavel Stehule wrote:
> so I removed the quiet option and also removed the multiple-check regression
> tests - there is no explicit order of function checking, so the regression
> tests can fail :(

There seems to be a problem with the SET clause of CREATE FUNCTION:

ftest=# CREATE OR REPLACE FUNCTION a(integer) RETURNS integer
        LANGUAGE plpgsql AS 'BEGIN RETURN 2*$1; END';
CREATE FUNCTION
ftest=# CHECK FUNCTION a(integer);
NOTICE: checked function a(integer)
CHECK FUNCTION
ftest=# CREATE OR REPLACE FUNCTION a(integer) RETURNS integer
        LANGUAGE plpgsql SET search_path=public AS 'BEGIN RETURN 2*$1; END';
CREATE FUNCTION
ftest=# CHECK FUNCTION a(integer);
The connection to the server was lost. Attempting reset: Failed. !

Yours,
Laurenz Albe
Re: [HACKERS] review: CHECK FUNCTION statement
Hello

2011/12/15 Albe Laurenz laurenz.a...@wien.gv.at:
> There seems to be a problem with the SET clause of CREATE FUNCTION:
>
> ftest=# CREATE OR REPLACE FUNCTION a(integer) RETURNS integer
>         LANGUAGE plpgsql SET search_path=public AS 'BEGIN RETURN 2*$1; END';
> CREATE FUNCTION
> ftest=# CHECK FUNCTION a(integer);
> The connection to the server was lost. Attempting reset: Failed. !

There was a bug - a missing detoast call. Fixed.

Regards
Pavel

check_function-2011-12-15-1.diff.gz
Description: GNU Zip compressed data
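For the archives, a hedged sketch of what a "missing detoast call" crash of this shape usually is; the names below are the generic syscache idiom, not necessarily the exact lines the attached patch touches. The SET clause of CREATE FUNCTION is stored in pg_proc.proconfig, a varlena column that can come back toasted:

    /* Fetch the SET-clause array from the function's pg_proc row. */
    Datum datum = SysCacheGetAttr(PROCOID, procTup,
                                  Anum_pg_proc_proconfig, &isnull);

    if (!isnull)
    {
        /*
         * DatumGetArrayTypeP() detoasts; casting the raw Datum to
         * ArrayType * would read a toast pointer as an array header
         * and crash, matching the symptom Laurenz reported.
         */
        ArrayType *set_items = DatumGetArrayTypeP(datum);

        /* ... walk set_items and apply each name=value pair ... */
    }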
Re: [HACKERS] include_if_exists in configuration file
On 12/12/2011 04:47 PM, Andrew Dunstan wrote:
> I have briefly looked at the code (but not tried to apply or build it), and
> modulo the naming issue it looks OK to me. Unless there is some other issue
> let's just get it applied.

It looks like almost a no-brainer to me. It isn't very fancy, but it does something useful that fits a couple of common use-cases. The attached update has two changes to address the suggestions I got, which closes everything I knew about with this one:

-It's now include_if_exists
-Files that are skipped are logged now

So current behavior:

$ tail -n 1 postgresql.conf
include 'missing.conf'
$ start
server starting
$ tail $PGLOG
LOG: could not open configuration file "/home/gsmith/pgwork/data/include-exists/missing.conf": No such file or directory
FATAL: configuration file "/home/gsmith/pgwork/data/include-exists/postgresql.conf" contains errors

And new behavior:

$ vi $PGDATA/postgresql.conf
$ tail -n 1 postgresql.conf
include_if_exists 'missing.conf'
$ start
server starting
$ tail $PGLOG
LOG: skipping missing configuration file "/home/gsmith/pgwork/data/include-exists/missing.conf"
LOG: database system was shut down at 2011-12-15 06:48:46 EST
LOG: database system is ready to accept connections

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d1e628f..0cc3296 100644
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** include 'filename'
*** 91,96 ****
--- 91,108 ----
     <para>
      <indexterm>
+      <primary><literal>include_if_exists</></primary>
+      <secondary>in configuration file</secondary>
+     </indexterm>
+     Use the same approach as the <literal>include</> directive, continuing
+     normally if the file does not exist.  A regular <literal>include</>
+     will stop with an error if the referenced file is missing, while
+     <literal>include_if_exists</> does not.  A warning about the missing
+     file will be logged.
+    </para>
+
+    <para>
+     <indexterm>
       <primary>SIGHUP</primary>
      </indexterm>
      The configuration file is reread whenever the main server process receives a

diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index a094c7a..6ba130c 100644
*** a/src/backend/utils/misc/guc-file.l
--- b/src/backend/utils/misc/guc-file.l
*** ProcessConfigFile(GucContext context)
*** 129,135 ****
  	/* Parse the file into a list of option names and values */
  	head = tail = NULL;

! 	if (!ParseConfigFile(ConfigFileName, NULL, 0, elevel, &head, &tail))
  	{
  		/* Syntax error(s) detected in the file, so bail out */
  		error = true;
--- 129,135 ----
  	/* Parse the file into a list of option names and values */
  	head = tail = NULL;

! 	if (!ParseConfigFile(ConfigFileName, NULL, true, 0, elevel, &head, &tail))
  	{
  		/* Syntax error(s) detected in the file, so bail out */
  		error = true;
*** ProcessConfigFile(GucContext context)
*** 363,369 ****
   * and absolute-ifying the path name if necessary.
   */
  bool
! ParseConfigFile(const char *config_file, const char *calling_file,
  				int depth, int elevel,
  				ConfigVariable **head_p,
  				ConfigVariable **tail_p)
--- 363,369 ----
   * and absolute-ifying the path name if necessary.
   */
  bool
! ParseConfigFile(const char *config_file, const char *calling_file, bool strict,
  				int depth, int elevel,
  				ConfigVariable **head_p,
  				ConfigVariable **tail_p)
*** ParseConfigFile(const char *config_file,
*** 414,424 ****
  	fp = AllocateFile(config_file, "r");
  	if (!fp)
  	{
! 		ereport(elevel,
! 				(errcode_for_file_access(),
! 				 errmsg("could not open configuration file \"%s\": %m",
! 						config_file)));
! 		return false;
  	}

  	OK = ParseConfigFp(fp, config_file, depth, elevel, head_p, tail_p);
--- 414,430 ----
  	fp = AllocateFile(config_file, "r");
  	if (!fp)
  	{
! 		if (strict)
! 		{
! 			ereport(elevel,
! 					(errcode_for_file_access(),
! 					 errmsg("could not open configuration file \"%s\": %m",
! 							config_file)));
! 			return false;
! 		}
!
! 		elog(LOG, "skipping missing configuration file \"%s\"", config_file);
! 		return OK;
  	}

  	OK = ParseConfigFp(fp, config_file, depth, elevel, head_p, tail_p);
*** ParseConfigFp(FILE *fp, const char *conf
*** 512,518 ****
  		}

  		/* OK, process the option name and value */
! 		if (guc_name_compare(opt_name, "include") == 0)
  		{
  			/*
  			 * An include directive isn't a variable and should be processed
--- 518,541 ----
  		}

  		/* OK, process the option name and value */
! 		if (guc_name_compare(opt_name, "include_if_exists") == 0)
! 		{
! 			/*
! 			 * An include_if_exists directive isn't a variable and should be
! 			 * processed immediately.
! 			 */
! 			unsigned int save_ConfigFileLineno = ConfigFileLineno;
!
! 			if (!ParseConfigFile(opt_value, config_file, false,
Re: [HACKERS] IDLE in transaction introspection
This patch seems to be closing in on being done, but it's surely going to take at least one more round of review to make sure all the naming and documentation is right. I can work on that again whenever Scott gets another version ready, and Magnus is already poking around with an eye toward possibly committing the result. I think we can mark this one "Returned with Feedback" for this CF, but encourage Scott to resubmit as early as feasible before the next one. With some good luck, we might even close this out before that one even starts.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
Re: [HACKERS] Measuring relation free space
On 11/28/2011 05:40 AM, Greg Smith wrote:
> Ignoring fillfactor seems to have even more downsides as I see it. Certainly
> deserves a doc improvement, as well as fixing the description of the value
> so it's clearly a ratio rather than a true percentage.

So: I'm very clear on what to do here now:

-Make the computation be in units that match its documentation
-Take a look at other index types, as well as TOAST, at least to get the easy ones right
-Fully confirm the extension upgrade logic works as hoped

That's the must-do stuff (a sketch of the underlying per-page scan follows below). Then there are two more features to consider and do something with if sensible:

-Double-check whether there is really a useful role in using pg_freespace here. I don't think the numbers will be as good, but maybe we don't care.
-Once the above is all sorted, add a second UI that samples some pages and extrapolates, ANALYZE-style, rather than hitting everything.

This one leaves as "Returned with Feedback", feeling pretty good it will be whipped into good shape for the last 9.2 CommitFest.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
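For context, the core of such a module is a straightforward per-page scan; a hedged sketch of the heap case (ignoring indexes and TOAST, and not necessarily what the patch does verbatim) looks like this, and the division at the end is exactly the units question above:

    /* Sketch: fraction of a heap relation that is free space. */
    static double
    heap_free_ratio(Relation rel)
    {
        BlockNumber nblocks = RelationGetNumberOfBlocks(rel);
        BlockNumber blkno;
        double      free_space = 0.0;

        for (blkno = 0; blkno < nblocks; blkno++)
        {
            Buffer buf = ReadBuffer(rel, blkno);

            LockBuffer(buf, BUFFER_LOCK_SHARE);
            free_space += PageGetHeapFreeSpace(BufferGetPage(buf));
            LockBuffer(buf, BUFFER_LOCK_UNLOCK);
            ReleaseBuffer(buf);
        }

        if (nblocks == 0)
            return 0.0;
        return free_space / (nblocks * BLCKSZ);   /* a ratio, not a percentage */
    }

The ANALYZE-style variant discussed above would visit a random sample of blocks instead of all of them and extrapolate from the same per-page number.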
Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace
On 12/13/2011 12:29 PM, Julien Tachoires wrote:
> 2011/12/13 Robert Haas robertmh...@gmail.com:
>> On Tue, Dec 13, 2011 at 12:02 PM, Julien Tachoires jul...@gmail.com wrote:
>>> Right, it seems to happen when the destination tablespace is the same as
>>> the database's tbs, because, in this case, the relation's tbs is set to
>>> InvalidOid: src/backend/commands/tablecmds.c line 8342
>>> + rd_rel->reltablespace = (newTableSpace == MyDatabaseTableSpace) ? InvalidOid : newTableSpace;
>>> Why not just assign the newTableSpace value here?
>> When a relation is stored in the default tablespace, we always record that
>> in the system catalogs as InvalidOid. Otherwise, if the database's default
>> tablespace were changed, things would break.
> OK, considering that, I don't see any way to handle the case raised by Jaime :(

So we have a problem here: there's a case that's messy to handle. And there's really a larger issue hanging over this whole patch, which is that it needs a better explanation of what exactly it's going to get used for. Especially if the implementation gets more complicated, we'd want to see a clear reason to use this feature. And that's not really clear. If you can return with an update that perhaps finds a way to work around this OID issue, please re-submit it. And if you can explain some more about where you think this feature is useful, more information on that would be helpful. Since this isn't going to get committed soon, I'm going to mark it "Returned with Feedback" for now.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
Re: [HACKERS] review: CHECK FUNCTION statement
Hello

One small update - a better emulation of the environment for security definer functions.

Regards
Pavel

check_function-2011-12-15-2.diff.gz
Description: GNU Zip compressed data
Re: [HACKERS] Race condition in HEAD, possibly due to PGPROC splitup
On Wed, Dec 14, 2011 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Pavan Deolasee pavan.deola...@gmail.com writes:
>> Looking at CommitTransaction(), it seems quite clear to me that we call
>> ProcArrayEndTransaction() before releasing the locks held by the
>> transaction. So it's quite possible that when GetRunningTransactionLocks
>> goes through the list of currently held locks, the pgxact->xid is already
>> cleared. This seems to be an old bug to me, and not related to the PGXACT
>> work.
>
> Hm. So maybe the correct fix is to deem the lock already released if we get
> zero when we read the xid? It's not clear to me what the requirements for
> GetRunningTransactionLocks actually are, but if it's okay for it to think a
> lock is released slightly ahead of when the rest of the system thinks so,
> that would work.

Attached patch closes both race conditions:

* where xid is zero
* where xid is non-zero yet the WAL record for the commit of xid wins the race ahead of the WAL record for the locks

The patch fixes it in a backwards-compatible way. No version increments. Patches head, 9.1 and 9.0.

Will wait a couple of days for additional testing.

Comments?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

*** a/src/backend/storage/ipc/procarray.c
--- b/src/backend/storage/ipc/procarray.c
*** ProcArrayApplyRecoveryInfo(RunningTransactions running)
*** 499,505 ****
  	 * Remove stale transactions, if any.
  	 */
  	ExpireOldKnownAssignedTransactionIds(running->oldestRunningXid);
! 	StandbyReleaseOldLocks(running->oldestRunningXid);

  	/*
  	 * If our snapshot is already valid, nothing else to do...
--- 499,505 ----
  	 * Remove stale transactions, if any.
  	 */
  	ExpireOldKnownAssignedTransactionIds(running->oldestRunningXid);
! 	StandbyReleaseOldLocks(running->xcnt, running->xids);

  	/*
  	 * If our snapshot is already valid, nothing else to do...
*** ProcArrayApplyRecoveryInfo(RunningTransactions running)
*** 554,565 ****
  	 */

  	/*
- 	 * Release any locks belonging to old transactions that are not running
- 	 * according to the running-xacts record.
- 	 */
- 	StandbyReleaseOldLocks(running->nextXid);
-
- 	/*
  	 * Nobody else is running yet, but take locks anyhow
  	 */
  	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
--- 554,559 ----

*** a/src/backend/storage/ipc/standby.c
--- b/src/backend/storage/ipc/standby.c
*** StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid)
*** 525,531 ****
  	LOCKTAG		locktag;

  	/* Already processed? */
! 	if (TransactionIdDidCommit(xid) || TransactionIdDidAbort(xid))
  		return;

  	elog(trace_recovery(DEBUG4),
--- 525,533 ----
  	LOCKTAG		locktag;

  	/* Already processed? */
! 	if (!TransactionIdIsValid(xid) ||
! 		TransactionIdDidCommit(xid) ||
! 		TransactionIdDidAbort(xid))
  		return;

  	elog(trace_recovery(DEBUG4),
*** StandbyReleaseLockTree(TransactionId xid, int nsubxids, TransactionId *subxids)
*** 607,640 ****
  }

  /*
!  * StandbyReleaseLocksMany
!  *		Release standby locks held by XIDs < removeXid
!  *
!  * If keepPreparedXacts is true, keep prepared transactions even if
!  * they're older than removeXid
   */
! static void
! StandbyReleaseLocksMany(TransactionId removeXid, bool keepPreparedXacts)
  {
  	ListCell   *cell,
  			   *prev,
  			   *next;
  	LOCKTAG		locktag;

- 	/*
- 	 * Release all matching locks.
- 	 */
  	prev = NULL;
  	for (cell = list_head(RecoveryLockList); cell; cell = next)
  	{
  		xl_standby_lock *lock = (xl_standby_lock *) lfirst(cell);

  		next = lnext(cell);

  		if (!TransactionIdIsValid(removeXid) || TransactionIdPrecedes(lock->xid, removeXid))
  		{
- 			if (keepPreparedXacts && StandbyTransactionIdIsPrepared(lock->xid))
- 				continue;
  			elog(trace_recovery(DEBUG4),
  				 "releasing recovery lock: xid %u db %u rel %u",
  				 lock->xid, lock->dbOid, lock->relOid);
--- 609,691 ----
  }

  /*
!  * Called at end of recovery and when we see a shutdown checkpoint.
   */
! void
! StandbyReleaseAllLocks(void)
! {
! 	ListCell   *cell,
! 			   *prev,
! 			   *next;
! 	LOCKTAG		locktag;
!
! 	elog(trace_recovery(DEBUG2), "release all standby locks");
!
! 	prev = NULL;
! 	for (cell = list_head(RecoveryLockList); cell; cell = next)
! 	{
! 		xl_standby_lock *lock = (xl_standby_lock *) lfirst(cell);
!
! 		next = lnext(cell);
!
! 		elog(trace_recovery(DEBUG4),
! 			 "releasing recovery lock: xid %u db %u rel %u",
! 			 lock->xid, lock->dbOid, lock->relOid);
! 		SET_LOCKTAG_RELATION(locktag, lock->dbOid, lock->relOid);
! 		if (!LockRelease(&locktag, AccessExclusiveLock, true))
! 			elog(LOG,
! 				 "RecoveryLockList contains entry for lock no longer recorded by lock manager: xid %u database %u relation %u",
! 				 lock->xid, lock->dbOid, lock->relOid);
! 		RecoveryLockList = list_delete_cell(RecoveryLockList, cell, prev);
! 		pfree(lock);
! 	}
! }
!
! /*
!  * StandbyReleaseOldLocks
!  *		Release standby locks held by XIDs that aren't
[HACKERS] why do we need to create a tuplestore for each fetch?
Hi everyone,

I found this several days ago when I tried to debug a fetch from a cursor. I have already sent a mail to this list, but no one replied... Maybe this is a very simple problem; please help me, thanks a lot...

Here is the example:

create table t (a int);
insert into t values (1),(3),(5),(7),(9);
insert into t select a+1 from t;
begin;
declare c cursor for select * from t order by a;
fetch 3 in c;
fetch 3 in c;
fetch 3 in c;

In 'PortalRun', a fetch statement is treated with PORTAL_UTIL_SELECT, and then a tuplestore is created in 'FillPortalStore' in the fetch statement's portal. In 'FillPortalStore', all results are stored in that tuplestore. Then we go back to 'PortalRun'; next, 'PortalRunSelect' sends the results to the client...

My question is: why do we need to create that tuplestore as intermediate storage? Why not just send the results to the client directly?

Thank you very much.

--
GaoZengqi
pgf...@gmail.com
zengqi...@gmail.com
Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace
Excerpts from Julien Tachoires's message of mar dic 13 14:29:56 -0300 2011:
> 2011/12/13 Robert Haas robertmh...@gmail.com:
>> On Tue, Dec 13, 2011 at 12:02 PM, Julien Tachoires jul...@gmail.com wrote:
>>> Right, it seems to happen when the destination tablespace is the same as
>>> the database's tbs, because, in this case, the relation's tbs is set to
>>> InvalidOid: src/backend/commands/tablecmds.c line 8342
>>> + rd_rel->reltablespace = (newTableSpace == MyDatabaseTableSpace) ? InvalidOid : newTableSpace;
>>> Why not just assign the newTableSpace value here?
>> When a relation is stored in the default tablespace, we always record that
>> in the system catalogs as InvalidOid. Otherwise, if the database's default
>> tablespace were changed, things would break.
> OK, considering that, I don't see any way to handle the case raised by Jaime :(

Uhm, surely you could compare the original toast tablespace to the heap tablespace, and if they differ, handle it appropriately when creating the new toast table? Just pass down the toast tablespace into AlterTableCreateToastTable, instead of having it assume that rel->rd_rel->relnamespace is sufficient. This should be done in all cases where a toast tablespace is created, which shouldn't be more than a handful of them.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
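Concretely, that suggestion amounts to widening the toast-table creation entry point by one parameter; a hedged sketch (the released signature takes only the relation OID and reloptions, so the extra Oid argument here is the proposed addition, not existing API):

    /* Proposed shape: callers pass the toast tablespace explicitly. */
    extern void AlterTableCreateToastTable(Oid relOid, Oid toastTableSpaceOid,
                                           Datum reloptions);

    /*
     * A caller that wants the toast table to follow the heap would pass
     * the heap's (possibly InvalidOid) tablespace; one that wants to
     * preserve a separately chosen toast tablespace passes that instead.
     */
    AlterTableCreateToastTable(RelationGetRelid(rel),
                               rel->rd_rel->reltablespace,
                               reloptions);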
[HACKERS] Moving more work outside WALInsertLock
I've been looking at various ways to make WALInsertLock less of a bottleneck on multi-CPU servers. The key is going to be to separate the two things that are done while holding the WALInsertLock: a) allocating the required space in the WAL, and b) calculating the CRC of the record header and copying the data to the WAL page. a) needs to be serialized, but b) could be done in parallel.

I've been experimenting with different approaches to do that, but one thing is common among all of them: you need to know the total amount of WAL space needed for the record, including backup blocks, before you take the lock. So, here's a patch to move things around in XLogInsert() a bit, to accomplish that.

This patch doesn't seem to have any performance or scalability impact. I must admit I expected it to give a tiny gain in scalability by shortening the time WALInsertLock is held by a few instructions, but I can't measure any. But IMO it makes the code more readable, so this is worthwhile for that reason alone.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*** XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata)
*** 690,695 ****
--- 690,698 ----
  	uint32		freespace;
  	int			curridx;
  	XLogRecData *rdt;
+ 	XLogRecData *rdt_cur;
+ 	XLogRecData *rdt_bkp_first;
+ 	XLogRecData *rdt_bkp_last;
  	Buffer		dtbuf[XLR_MAX_BKP_BLOCKS];
  	bool		dtbuf_bkp[XLR_MAX_BKP_BLOCKS];
  	BkpBlock	dtbuf_xlg[XLR_MAX_BKP_BLOCKS];
*** XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata)
*** 704,709 ****
--- 707,713 ----
  	bool		updrqst;
  	bool		doPageWrites;
  	bool		isLogSwitch = (rmid == RM_XLOG_ID && info == XLOG_SWITCH);
+ 	uint8		info_final;

  	/* cross-check on whether we should be here or not */
  	if (!XLogInsertAllowed())
*** XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata)
*** 727,738 ****
  	}

  	/*
! 	 * Here we scan the rdata chain, determine which buffers must be backed
! 	 * up, and compute the CRC values for the data.  Note that the record
! 	 * header isn't added into the CRC initially since we don't know the final
! 	 * length or info bits quite yet.  Thus, the CRC will represent the CRC of
! 	 * the whole record in the order rdata, then backup blocks, then record
! 	 * header.
  	 *
  	 * We may have to loop back to here if a race condition is detected below.
  	 * We could prevent the race by doing all this work while holding the
--- 731,738 ----
  	}

  	/*
! 	 * Here we scan the rdata chain to determine which buffers must be backed
! 	 * up.
  	 *
  	 * We may have to loop back to here if a race condition is detected below.
  	 * We could prevent the race by doing all this work while holding the
*** XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata)
*** 746,751 ****
--- 746,752 ----
  	 * over the chain later.
  	 */
  begin:;
+ 	info_final = info;
  	for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++)
  	{
  		dtbuf[i] = InvalidBuffer;
*** 760,766 ****
  	 */
  	doPageWrites = fullPageWrites || Insert->forcePageWrites;

- 	INIT_CRC32(rdata_crc);
  	len = 0;
  	for (rdt = rdata;;)
  	{
--- 761,766 ----
*** 768,774 ****
  		{
  			/* Simple data, just include it */
  			len += rdt->len;
- 			COMP_CRC32(rdata_crc, rdt->data, rdt->len);
  		}
  		else
  		{
--- 768,773 ----
*** 779,790 ****
  			{
  				/* Buffer already referenced by earlier chain item */
  				if (dtbuf_bkp[i])
  					rdt->data = NULL;
  				else if (rdt->data)
- 				{
  					len += rdt->len;
- 					COMP_CRC32(rdata_crc, rdt->data, rdt->len);
- 				}
  				break;
  			}
  			if (dtbuf[i] == InvalidBuffer)
--- 778,789 ----
  			{
  				/* Buffer already referenced by earlier chain item */
  				if (dtbuf_bkp[i])
+ 				{
  					rdt->data = NULL;
+ 					rdt->len = 0;
+ 				}
  				else if (rdt->data)
  					len += rdt->len;
  				break;
  			}
  			if (dtbuf[i] == InvalidBuffer)
*** 796,807 ****
  			{
  				dtbuf_bkp[i] = true;
  				rdt->data = NULL;
  			}
  			else if (rdt->data)
- 			{
  				len += rdt->len;
- 				COMP_CRC32(rdata_crc, rdt->data, rdt->len);
- 			}
  			break;
  		}
  	}
--- 795,804 ----
  			{
  				dtbuf_bkp[i] = true;
  				rdt->data = NULL;
+ 				rdt->len = 0;
  			}
  			else if (rdt->data)
  				len += rdt->len;
  			break;
  		}
  	}
*** 816,854 ****
  	}

  	/*
! 	 * Now add the backup block headers and data into the CRC
  	 */
  	for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++)
  	{
! 		if (dtbuf_bkp[i])
  		{
! 			BkpBlock   *bkpb = &(dtbuf_xlg[i]);
! 			char	   *page;
!
! 			COMP_CRC32(rdata_crc,
! 					   (char *) bkpb,
! 					   sizeof(BkpBlock));
! 			page = (char *) BufferGetBlock(dtbuf[i]);
! 			if (bkpb->hole_length == 0)
! 			{
! 				COMP_CRC32(rdata_crc,
! 						   page,
Re: [HACKERS] Race condition in HEAD, possibly due to PGPROC splitup
On Thu, Dec 15, 2011 at 8:19 AM, Simon Riggs si...@2ndquadrant.com wrote:
> Comments?

I think there is a small bug here:

+	TransactionId xid = pgxact->xid;
+
+	/*
+	 * Don't record locks for transactions if we know they have already
+	 * issued their WAL record for commit but not yet released lock.
+	 * It is still possible that we see locks held by already complete
+	 * transactions, if they haven't yet zeroed their xids.
+	 */
+	if (!TransactionIdIsValid(xid))
+		continue;

 	accessExclusiveLocks[index].xid = pgxact->xid;
 	accessExclusiveLocks[index].dbOid = lock->tag.locktag_field1;

...because you're fetching pgxact->xid, testing whether it's valid, and then fetching it again. It could change in the meantime. You probably want to change the assignment to read:

	accessExclusiveLocks[index].xid = xid;

Also, we should probably change this pointer to be declared volatile:

	PGXACT *pgxact = ProcGlobal->allPgXact[proc->pgprocno];

Otherwise, I think the compiler might get cute and decide to fetch the xid twice anyway, effectively undoing our attempt to pull it into a local variable.

I think this comment could be clarified in some way, to make it more clear that we had a bug at one point, and it was fixed - the "from a time when they were possible" language wouldn't be entirely clear to me after the fact:

+	 * Zero xids should no longer be possible, but we may be replaying WAL
+	 * from a time when they were possible.

It would probably make sense to break out of this loop if a match is found:

!	for (i = 0; i < nxids; i++)
!	{
!		if (lock->xid == xids[i])
!			found = true;
!	}

I'm not sure I fully understand the rest of this, but those are the only things I noticed on a read-through.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
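Putting those two points together, the fixed-up fragment would look something like this (a sketch of the shape Robert suggests, not the committed code):

    /*
     * Read the xid exactly once, through a volatile pointer so the
     * compiler cannot re-fetch it behind our back.
     */
    volatile PGXACT *pgxact = ProcGlobal->allPgXact[proc->pgprocno];
    TransactionId    xid = pgxact->xid;

    /*
     * Don't record locks for transactions that have already written
     * their commit record but not yet cleared their xid.
     */
    if (!TransactionIdIsValid(xid))
        continue;

    accessExclusiveLocks[index].xid = xid;   /* the local copy, not pgxact->xid */
    accessExclusiveLocks[index].dbOid = lock->tag.locktag_field1;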
Re: [HACKERS] Moving more work outside WALInsertLock
On Thursday, December 15, 2011 02:51:33 PM Heikki Linnakangas wrote:
> I've been looking at various ways to make WALInsertLock less of a bottleneck
> on multi-CPU servers. [...] I've been experimenting with different
> approaches to do that, but one thing is common among all of them: you need
> to know the total amount of WAL space needed for the record, including
> backup blocks, before you take the lock. So, here's a patch to move things
> around in XLogInsert() a bit, to accomplish that.
>
> This patch doesn't seem to have any performance or scalability impact. [...]
> But IMO it makes the code more readable, so this is worthwhile for that
> reason alone.

That's great! I did (or at least tried) something similar when I was playing around with another crc32 implementation (which I plan to finish sometime). My changes were totally whacky, but I got rather big improvements when changing the CRC computation from incremental to one big swoop.

I started to hack up an API which buffered xlog data in a statically sized buffer in each backend and only submitted it every now and then. Never got that to actually work correctly in more than the simplest cases though ;). In many cases we're taking the WAL insert lock way too often during a single insert... (you obviously know that...)

Andres
Re: [HACKERS] Command Triggers
On Wed, Dec 14, 2011 at 5:44 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Robert Haas robertmh...@gmail.com writes:
>>> I can get behind this argument: force all stuff through ProcessUtility
>>> for regularity, and not necessarily in the first patch for this feature.
>> That seems like a pretty heavy dependency on an implementation detail that
>> we might want to change at some point.
> Given ProcessUtility_hook, how much of an implementation detail rather than
> a public API are we talking about?

I think that a hook and an SQL command are not on the same level. Hooks are not documented; SQL commands are. You may, of course, disagree.

But the basic point is that it seems pretty weird to say, on the one hand, these are command triggers. They fire when you execute a command. So the CREATE TABLE trigger will fire when someone says CREATE TABLE. But then we say, oh, well if you use CREATE SCHEMA foo CREATE TABLE blah ... we will fire the trigger anyway. Now it's not a command trigger any more; it's a trigger that fires when you perform a certain operation - e.g. create a table. Unless, of course, you create the table using CREATE TABLE AS SELECT or SELECT .. INTO. Then it doesn't fire. Unless we decide to make those utility commands, which I think was just recently under discussion, in which case it will suddenly start firing for those operations. So now something that is, right now, essentially an implementation detail which we can rearrange as we like turns into a user-visible behavior change. And what if some day we want to change it back?

I think it would be a much better idea to decree from the beginning that we're trapping the *operation* of creating a table, rather than the *command* CREATE TABLE. Then, the behavior is clear and well-defined from day one, and it doesn't depend on how we happen to implement things internally right at the moment. If there's some way of creating a table without firing the trigger, it's a bug. If there's some way of firing the trigger without attempting to create a table, it's a bug. That might require some more thought about what information to pass to the trigger function (e.g. if it's a SELECT .. INTO, you're not going to have pregenerated SQL that starts with the words CREATE TABLE), but the fact that it gives a much clearer definition to the core feature seems like a big plus in my book.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Wed, Dec 14, 2011 at 8:56 PM, Josh Berkus j...@agliodbs.com wrote:
> So for streaming replication, will I need to have a standby.enabled file, or
> will there be a parameter in postgresql.conf (or included files) which
> controls whether or not that server is a standby, available? In the best of
> all possible worlds, I'd really like to have a GUC which 100% controls
> whether or not the current server is a standby.

I think that would be a bad idea, because then how would pg_ctl promote work? It'd have to permanently change the value of that GUC, which means rewriting a postgresql.conf file with an arbitrary forest of comments and include files, and we can't do that now or, probably, ever. At least not reliably enough for this kind of use case.

I think what Greg is going for is this:

1. Get rid of recovery.conf - error out if it is seen
2. For each parameter that was previously a recovery.conf parameter, make it a GUC
3. For the parameter that was "does recovery.conf exist?", replace it with "does standby.enabled exist?"

IMHO, as Greg says, that's as much change as we can cope with in one release. (A sketch of the resulting workflow follows below.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
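Operationally, under that proposal the workflow would presumably look something like this - hedged, since standby.enabled is the proposed trigger-file name and moving primary_conninfo into postgresql.conf is part of the proposal, not current behavior:

    $ echo "primary_conninfo = 'host=master port=5432'" >> $PGDATA/postgresql.conf
    $ touch $PGDATA/standby.enabled    # presence alone makes the server start as a standby
    $ pg_ctl start
    ...
    $ pg_ctl promote                   # ends recovery without rewriting postgresql.conf

The point of step 3 above is exactly that promotion then only has to remove a file, never edit a configuration file in place.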
Re: [HACKERS] Race condition in HEAD, possibly due to PGPROC splitup
On Thu, Dec 15, 2011 at 2:13 PM, Robert Haas robertmh...@gmail.com wrote:
> On Thu, Dec 15, 2011 at 8:19 AM, Simon Riggs si...@2ndquadrant.com wrote:
>> Comments?
>
> I think there is a small bug here:
>
> +	TransactionId xid = pgxact->xid;
> +
> +	/*
> +	 * Don't record locks for transactions if we know they have already
> +	 * issued their WAL record for commit but not yet released lock.
> +	 * It is still possible that we see locks held by already complete
> +	 * transactions, if they haven't yet zeroed their xids.
> +	 */
> +	if (!TransactionIdIsValid(xid))
> +		continue;
>
>  	accessExclusiveLocks[index].xid = pgxact->xid;
>  	accessExclusiveLocks[index].dbOid = lock->tag.locktag_field1;
>
> ...because you're fetching pgxact->xid, testing whether it's valid, and
> then fetching it again. It could change in the meantime. You probably want
> to change the assignment to read:
>
> 	accessExclusiveLocks[index].xid = xid;
>
> Also, we should probably change this pointer to be declared volatile:
>
> 	PGXACT *pgxact = ProcGlobal->allPgXact[proc->pgprocno];
>
> Otherwise, I think the compiler might get cute and decide to fetch the xid
> twice anyway, effectively undoing our attempt to pull it into a local
> variable.
>
> I think this comment could be clarified in some way, to make it more clear
> that we had a bug at one point, and it was fixed - the "from a time when
> they were possible" language wouldn't be entirely clear to me after the
> fact. It would probably make sense to break out of this loop if a match is
> found:
>
> !	for (i = 0; i < nxids; i++)
> !	{
> !		if (lock->xid == xids[i])
> !			found = true;
> !	}
>
> I'm not sure I fully understand the rest of this, but those are the only
> things I noticed on a read-through.

Thanks, useful changes.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Command Triggers
On Thursday, December 15, 2011 03:36:25 PM Robert Haas wrote:
> I think that a hook and an SQL command are not on the same level. Hooks are
> not documented; SQL commands are. You may, of course, disagree.

I am not disagreeing.

> But the basic point is that it seems pretty weird to say, on the one hand,
> these are command triggers. They fire when you execute a command. [...]
> Unless we decide to make those utility commands, which I think was just
> recently under discussion, in which case it will suddenly start firing for
> those operations.

Command triggers were the initial reason for me doing that conversion.

> So now something that is, right now, essentially an implementation detail
> which we can rearrange as we like turns into a user-visible behavior
> change. And what if some day we want to change it back? I think it would be
> a much better idea to decree from the beginning that we're trapping the
> *operation* of creating a table, rather than the *command* CREATE TABLE.
> Then, the behavior is clear and well-defined from day one, and it doesn't
> depend on how we happen to implement things internally right at the moment.

I am with you there.

> If there's some way of creating a table without firing the trigger, it's a
> bug. If there's some way of firing the trigger without attempting to create
> a table, it's a bug.

Again.

> That might require some more thought about what information to pass to the
> trigger function (e.g. if it's a SELECT .. INTO, you're not going to have
> pregenerated SQL that starts with the words CREATE TABLE), but the fact
> that it gives a much clearer definition to the core feature seems like a
> big plus in my book.

Not sure what you're getting at here? The command tag in Dim's patch is already independent of the form of the actual statement that was run.

A big +1 on the general direction of this email. I dislike reducing the scope of command triggers to top-level commands run by the actual user, because IMO that would reduce the possible scope of the patch rather much. On the other hand, I think delivering a complete patch covering just about anything triggered anywhere is not really realistic. Not sure what the best way to continue is.

Andres
Re: [HACKERS] splitting plpython into smaller parts
How do people feel about naming the files (as proposed)

! OBJS = plpython.o plpython_io.o plpython_procedure.o plpython_exec.o \
!	plpython_plpy.o plpython_spi.o plpython_result.o plpython_cursor.o \
!	plpython_plan.o plpython_subtransaction.o plpython_functions.o \
!	plpython_elog.o

vs. say

! OBJS = main.o io.o procedure.o exec.o plpy.o spi.o result.o cursor.o \
!	plan.o subtransaction.o functions.o elog.o

?
Re: [HACKERS] splitting plpython into smaller parts
Excerpts from Peter Eisentraut's message of jue dic 15 12:00:13 -0300 2011:
> How do people feel about naming the files (as proposed)
>
> ! OBJS = plpython.o plpython_io.o plpython_procedure.o plpython_exec.o \
> !	plpython_plpy.o plpython_spi.o plpython_result.o plpython_cursor.o \
> !	plpython_plan.o plpython_subtransaction.o plpython_functions.o \
> !	plpython_elog.o
>
> vs. say
>
> ! OBJS = main.o io.o procedure.o exec.o plpy.o spi.o result.o cursor.o \
> !	plan.o subtransaction.o functions.o elog.o
>
> ?

I find the extra prefix unnecessary and ugly; if we had to have a prefix, I'd choose a shorter one (maybe "py" instead of "plpython_").

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] splitting plpython into smaller parts
Alvaro Herrera alvhe...@commandprompt.com writes:
> Excerpts from Peter Eisentraut's message of jue dic 15 12:00:13 -0300 2011:
>> How do people feel about naming the files (as proposed)
>>
>> ! OBJS = plpython.o plpython_io.o plpython_procedure.o plpython_exec.o \
>> !	plpython_plpy.o plpython_spi.o plpython_result.o plpython_cursor.o \
>> !	plpython_plan.o plpython_subtransaction.o plpython_functions.o \
>> !	plpython_elog.o
>>
>> vs. say
>>
>> ! OBJS = main.o io.o procedure.o exec.o plpy.o spi.o result.o cursor.o \
>> !	plan.o subtransaction.o functions.o elog.o
>
> I find the extra prefix unnecessary and ugly; if we had to have a prefix,
> I'd choose a shorter one (maybe "py" instead of "plpython_").

+1 for a prefix, mainly because the shorter names duplicate some names already in use elsewhere in our tree. But I agree with Alvaro that "py" would be sufficient.

regards, tom lane
Re: [HACKERS] Moving more work outside WALInsertLock
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
> I've been experimenting with different approaches to do that, but one thing
> is common among all of them: you need to know the total amount of WAL space
> needed for the record, including backup blocks, before you take the lock.
> So, here's a patch to move things around in XLogInsert() a bit, to
> accomplish that.

This patch may or may not be useful, but this description of it is utter nonsense, because we already do compute that before taking the lock. Please try again to explain what you're doing?

regards, tom lane
Re: [HACKERS] Command Triggers
Hi,

Thank you Robert for this continued review, I think we're making good progress in a community discussion that needs to happen!

Robert Haas robertmh...@gmail.com writes:
>> Given ProcessUtility_hook, how much of an implementation detail rather
>> than a public API are we talking about?
> I think that a hook and an SQL command are not on the same level. Hooks are
> not documented; SQL commands are. You may, of course, disagree.

Mmmm, yes. I think that we should document hooks, and I'm going to propose soon a pg_extension_module() SRF that lists all currently loaded modules and which extension implements them, and I've begun thinking about what it would take to be able to list which hooks each module implements. That would require a hook registration API, and it would allow much easier security auditing of a setup you don't know beforehand. So I think that hooks not being documented is an implementation detail here :)

> But the basic point is that it seems pretty weird to say, on the one hand,
> these are command triggers. They fire when you execute a command. So the
> CREATE TABLE trigger will fire when someone says CREATE TABLE. But then we
> say, oh, well if you use CREATE SCHEMA foo CREATE TABLE blah ... we will
> fire the trigger anyway. Now it's not a

Yes, this CREATE SCHEMA … CREATE TABLE case is weird, and unique, so it's not that difficult to document, I think. And if we fix things by having all subcommands go through ProcessUtility and command triggers, then it's easy to document as the new rule.

My documentation idea would then be to explain what a command is and what a subcommand is, then the current rule (command triggers do not support subcommands), and then the exception to that rule (CREATE SCHEMA subcommands do, in fact, support command triggers). If we decide that we should in fact support (nested) subcommands in command triggers, then we will be in a position to prepare a patch implementing that, with a documentation change saying that the rule is now that command triggers do in fact support subcommands.

When the command trigger is called on a subcommand, I think you will want both the main command string and the subcommand string, and we have to research whether what you need isn't a whole stack of commands, because I'm not sure you can only have one-level-deep nesting here. That would be done with a special API for the trigger functions, and with a specific syntax, because it seems to me that having trigger code that is only ever called on a top-level command is a good thing to have:

  create trigger foo after command CREATE TABLE …
  create trigger foo after top level command CREATE TABLE …
  create trigger foo after nested command CREATE TABLE …

Either we add support for that kind of syntax now (and not implementing it in 9.3 would seem weird as hell), or we instruct pg_dump and pg_upgrade to switch from the current syntax to the new one (adding the "top level" keywords) when we do implement the feature down the road.

> command trigger any more; it's a trigger that fires when you perform a
> certain operation - e.g. create a table. Unless, of course, you create the
> table using CREATE TABLE AS SELECT or SELECT .. INTO. Then it doesn't fire.
> Unless we decide to make those utility commands, which I think was just
> recently under discussion, in which case it will suddenly start firing for
> those operations. So now something

Andres has been sending a complete patch to fix that old oddity of the parser. And again, that's a very narrow exception to the usual state of things, and as such, easy to document in the list of things that don't fall into the general rule.

Even when fixed, though, you have 3 different command tags to deal with, and we identify command triggers by command tag, so a command trigger on CREATE TABLE will not be called when doing SELECT INTO, nor when doing CREATE TABLE AS.

Also, I think that the command triggers in 9.2 will not address each and every command that PostgreSQL offers (think "ALTER OPERATOR FAMILY"), so we will need to maintain an exhaustive list of supported commands, identified by command tags.

> that is, right now, essentially an implementation detail which we can
> rearrange as we like turns into a user-visible behavior change. And what if
> some day we want to change it back?

Yes, we need to make a decision about that now. Do we want any "operation" to go through ProcessUtility so that hooks and command triggers can get called? I think it's a good idea in the long run, and Alvaro seems to be thinking it is too. That entails changing the backend code to build a Statement and then call ProcessUtility on it rather than calling the internal functions directly, and it also means some more DDL is subject to being logged in the server logs. Removing those "cross-modules" #includes might be a good thing in the long run, though.

And we don't have to do that unless we want to make subcommands available to ProcessUtility_hook and
Re: [HACKERS] Command Triggers
On Thursday, December 15, 2011 04:53:15 PM Dimitri Fontaine wrote:
> Mmmm, yes. I think that we should document hooks, and I'm going to propose
> soon a pg_extension_module() SRF that lists all currently loaded modules
> and which extension implements them, and I've begun thinking about what it
> would take to be able to list which hooks each module implements.

I don't really see that as possible/realistic.

> Yes, this CREATE SCHEMA … CREATE TABLE case is weird, and unique, so it's
> not that difficult to document, I think. And if we fix things by having all
> subcommands go through ProcessUtility and command triggers, then it's easy
> to document as the new rule.

I don't think I ever saw that one in real-world DDL scripts ;)

> When the command trigger is called on a subcommand, I think you will want
> both the main command string and the subcommand string, and we have to
> research whether what you need isn't a whole stack of commands, because I'm
> not sure you can only have one-level-deep nesting here.

I don't think you need that. If needed you will have to build the data structure in $pl.

> That would be done with a special API for the trigger functions, and with a
> specific syntax, because it seems to me that having trigger code that is
> only ever called on a top-level command is a good thing to have:
>
>   create trigger foo after command CREATE TABLE …
>   create trigger foo after top level command CREATE TABLE …
>   create trigger foo after nested command CREATE TABLE …
>
> Either we add support for that kind of syntax now (and not implementing it
> in 9.3 would seem weird as hell), or we instruct pg_dump and pg_upgrade to
> switch from the current syntax to the new one (adding the "top level"
> keywords) when we do implement the feature down the road.

I personally think there should only be one variant which is always called, with a parameter that indicates whether it's a subcommand or not. Why would you ever only want top-level commands?

> Yes, we need to make a decision about that now. Do we want any "operation"
> to go through ProcessUtility so that hooks and command triggers can get
> called?

I personally think that's a good idea for most stuff. I don't see that for ALTER TABLE subcommands and such, though.

> I think it's a good idea in the long run, and Alvaro seems to be thinking
> it is too. That entails changing the backend code to build a Statement and
> then call ProcessUtility on it rather than calling the internal functions
> directly, and it also means some more DDL is subject to being logged in the
> server logs. Removing those "cross-modules" #includes might be a good thing
> in the long run, though.

Uhm. I don't think building strings is the way to go here. I think building *Stmt nodes is better.

Andres
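A hedged sketch of what "building *Stmt nodes" means in practice: instead of calling the internal DDL function directly, an internal caller would construct the parse node and push it through ProcessUtility so hooks and command triggers see it. The table name here is hypothetical and the ProcessUtility signature is the 9.1-era six-argument form, so treat this as a shape rather than a drop-in:

    /* Build a CreateStmt node for an internally generated table... */
    CreateStmt *stmt = makeNode(CreateStmt);

    stmt->relation = makeRangeVar(NULL, "internal_table", -1);  /* hypothetical */
    stmt->tableElts = NIL;              /* column definitions would go here */

    /*
     * ...and route it through ProcessUtility instead of calling
     * DefineRelation() directly, so command triggers fire.
     */
    ProcessUtility((Node *) stmt,
                   "generated statement",   /* query string, for logging */
                   NULL,                    /* params */
                   false,                   /* not a top-level command */
                   None_Receiver,
                   NULL);                   /* completionTag */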
Re: [HACKERS] Moving more work outside WALInsertLock
On Thu, Dec 15, 2011 at 7:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
>> I've been experimenting with different approaches to do that, but one
>> thing is common among all of them: you need to know the total amount of
>> WAL space needed for the record, including backup blocks, before you take
>> the lock. So, here's a patch to move things around in XLogInsert() a bit,
>> to accomplish that.
>
> This patch may or may not be useful, but this description of it is utter
> nonsense, because we already do compute that before taking the lock.
> Please try again to explain what you're doing?

Currently the CRC of all the data minus the header is computed outside the lock, but then the header's contribution is added and the CRC is finalized inside the lock.

Cheers,
Jeff
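In outline, the ordering Jeff describes looks like this - a sketch of the flow, not the exact xlog.c code, and the header-CRC arithmetic assumes the old record layout with xl_crc as the first field:

    /* Outside WALInsertLock: CRC everything except the record header. */
    INIT_CRC32(rdata_crc);
    for (rdt = rdata; rdt != NULL; rdt = rdt->next)
        COMP_CRC32(rdata_crc, rdt->data, rdt->len);

    /*
     * Inside WALInsertLock: only now is the prev-link known, so the
     * header (which contains it) is folded in and the CRC finalized here.
     */
    record->xl_prev = Insert->PrevRecord;
    COMP_CRC32(rdata_crc, (char *) record + sizeof(pg_crc32),
               SizeOfXLogRecord - sizeof(pg_crc32));
    FIN_CRC32(rdata_crc);
    record->xl_crc = rdata_crc;

That final header fold is what cannot move out of the lock as long as the prev-link is part of the CRC, which is the point Tom makes next.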
Re: [HACKERS] Command Triggers
Andres Freund and...@anarazel.de writes:
>>   create trigger foo after command CREATE TABLE …
>>   create trigger foo after top level command CREATE TABLE …
>>   create trigger foo after nested command CREATE TABLE …
>>
>> Either we add support for that kind of syntax now (and not implementing it
>> in 9.3 would seem weird as hell), or we instruct pg_dump and pg_upgrade to
>> switch from the current syntax to the new one (adding the "top level"
>> keywords) when we do implement the feature down the road.
> I personally think there should only be one variant which is always called,
> with a parameter that indicates whether it's a subcommand or not. Why would
> you ever only want top-level commands?

Because the command "create table foo(id primary key)" could then fire your index and constraint command triggers twice, and if all you do is store the command string in a table for auditing purposes, maybe you just don't care.

>> Yes, we need to make a decision about that now. Do we want any "operation"
>> to go through ProcessUtility so that hooks and command triggers can get
>> called?
> I personally think that's a good idea for most stuff. I don't see that for
> ALTER TABLE subcommands and such, though.

By subcommand, I mean any operation that a main command does for you and that you could have done manually with another command, such as serial and sequences, primary key and its ALTER TABLE form, embedded checks or NOT NULL and its ALTER TABLE form, etc. I don't remember ALTER TABLE implementing facilities that in turn call another command for you?

>> I think it's a good idea in the long run, and Alvaro seems to be thinking
>> it is too. That entails changing the backend code to build a Statement and
>> then call ProcessUtility on it rather than calling the internal functions
>> directly, and it also means some more DDL is subject to being logged in
>> the server logs. Removing those "cross-modules" #includes might be a good
>> thing in the long run, though.
> Uhm. I don't think building strings is the way to go here. I think building
> *Stmt nodes is better.

Agreed, I meant exactly that.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Moving more work outside WALInsertLock
Jeff Janes jeff.ja...@gmail.com writes:
> On Thu, Dec 15, 2011 at 7:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> This patch may or may not be useful, but this description of it is utter
>> nonsense, because we already do compute that before taking the lock.
>> Please try again to explain what you're doing?
> Currently the CRC of all the data minus the header is computed outside the
> lock, but then the header's contribution is added and the CRC is finalized
> inside the lock.

Quite. AFAICS that is not optional, unless you are proposing to remove the prev_link from the scope of the CRC, which is not exactly a penalty-free change.

regards, tom lane
[HACKERS] pgstat wait timeout
Hi,

I am having a scenario where I get consistent warnings in the pg_log folder:

2011-12-11 00:00:03 JST WARNING: pgstat wait timeout
2011-12-11 00:00:14 JST WARNING: pgstat wait timeout
2011-12-11 00:00:24 JST WARNING: pgstat wait timeout
2011-12-11 00:00:31 JST WARNING: pgstat wait timeout
2011-12-11 00:00:44 JST WARNING: pgstat wait timeout
2011-12-11 00:00:52 JST WARNING: pgstat wait timeout
2011-12-11 00:01:03 JST WARNING: pgstat wait timeout
2011-12-11 00:01:11 JST WARNING: pgstat wait timeout
.
.
.

This is impacting database performance. The issue persists even when I use the database minimally.

I have tried fine-tuning the autovacuum parameters:

Changed the parameter autovacuum_vacuum_cost_delay to 40ms ::: issue reproduced 4 hours after this change.
Changed the parameter autovacuum_max_workers to 20 ::: I got the warning message twice, at 17:20:12 and 17:20:20. After this, no warnings for 5 hours.

Then I tried:

Changed the parameter autovacuum_vacuum_cost_delay to 60ms and autovacuum_max_workers to 10 ::: I got the warning message twice, at 17:20:12 and 17:20:20. After this, no warnings for 5 hours.

Any ideas?
Re: [HACKERS] IDLE in transaction introspection
On Thu, Dec 15, 2011 at 13:18, Greg Smith g...@2ndquadrant.com wrote:
> This patch seems to be closing in on being done, but it's surely going to
> take at least one more round of review to make sure all the naming and
> documentation is right. I can work on that again whenever Scott gets
> another version ready, and Magnus is already poking around with an eye
> toward possibly committing the result. I think we can mark this one
> "Returned with Feedback" for this CF, but encourage Scott to resubmit as
> early as feasible before the next one. With some good luck, we might even
> close this out before that one even starts.

My hope was to get a quick update from Scott and then apply it. But feel free to set it to returned. Scott - don't delay until the next CF; just post it when you're ready and I'll pick it up between the two CFs when I find a moment.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] pgstat wait timeout
On 15 December 2011, 17:55, pratikchirania wrote:
> I am having a scenario where I get consistent warnings in the pg_log folder:
>
> 2011-12-11 00:00:03 JST WARNING: pgstat wait timeout
> [...]
>
> This is impacting database performance.

It's rather a sign that the I/O is overloaded, although in some cases it may actually be the cause.

> The issue persists even when I use the database minimally.

Yes, because the file is written periodically - twice a second, IIRC. If the file is large, this may be an issue. What is the size of pgstat.stat (it should be in data/global)?

> I have tried fine-tuning the autovacuum parameters:

Autovacuum has nothing to do with this.

> Any ideas?

Move the file to a RAM drive - there's even a config parameter, 'stats_temp_directory', to do that. See http://www.postgresql.org/docs/9.1/static/runtime-config-statistics.html

Tomas
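For reference, the setting in question is a single postgresql.conf line; the path here is just an example, and any RAM-backed filesystem writable by the postgres user works:

    # postgresql.conf
    # Keep the frequently rewritten statistics temp file off the main disks.
    stats_temp_directory = '/var/run/pgstat_tmp'   # e.g. a tmpfs mount

Changing it requires only a reload, not a restart, and the permanent copy of the statistics file is still written to the data directory at shutdown.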
Re: [HACKERS] pgstat wait timeout
On Thu, Dec 15, 2011 at 18:13, Tomas Vondra t...@fuzzy.cz wrote: On 15 Prosinec 2011, 17:55, pratikchirania wrote: Hi, I have a scenario where I get consistent warnings in the pg_log folder:

2011-12-11 00:00:03 JST WARNING: pgstat wait timeout
2011-12-11 00:00:14 JST WARNING: pgstat wait timeout
2011-12-11 00:00:24 JST WARNING: pgstat wait timeout
2011-12-11 00:00:31 JST WARNING: pgstat wait timeout
2011-12-11 00:00:44 JST WARNING: pgstat wait timeout
2011-12-11 00:00:52 JST WARNING: pgstat wait timeout
2011-12-11 00:01:03 JST WARNING: pgstat wait timeout
2011-12-11 00:01:11 JST WARNING: pgstat wait timeout

This is impacting database performance.

It's more a sign that the I/O is overloaded, although in some cases it may actually be the cause of the overload.

The issue persists even when I use the database minimally.

Yes, because the file is written periodically - twice a second, IIRC. If the file is large, this may be an issue. What is the size of pgstat.stat (it should be in data/global)?

That was only true prior to 8.4. As of 8.4 it's only written when necessary, which is usually a lot less than twice a second.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgstat wait timeout
On 15 Prosinec 2011, 18:19, Magnus Hagander wrote: On Thu, Dec 15, 2011 at 18:13, Tomas Vondra t...@fuzzy.cz wrote: On 15 Prosinec 2011, 17:55, pratikchirania wrote: Hi, I have a scenario where I get consistent warnings in the pg_log folder:

2011-12-11 00:00:03 JST WARNING: pgstat wait timeout
2011-12-11 00:00:14 JST WARNING: pgstat wait timeout
2011-12-11 00:00:24 JST WARNING: pgstat wait timeout
2011-12-11 00:00:31 JST WARNING: pgstat wait timeout
2011-12-11 00:00:44 JST WARNING: pgstat wait timeout
2011-12-11 00:00:52 JST WARNING: pgstat wait timeout
2011-12-11 00:01:03 JST WARNING: pgstat wait timeout
2011-12-11 00:01:11 JST WARNING: pgstat wait timeout

This is impacting database performance.

It's more a sign that the I/O is overloaded, although in some cases it may actually be the cause of the overload.

The issue persists even when I use the database minimally.

Yes, because the file is written periodically - twice a second, IIRC. If the file is large, this may be an issue. What is the size of pgstat.stat (it should be in data/global)?

That was only true prior to 8.4. As of 8.4 it's only written when necessary, which is usually a lot less than twice a second.

Thanks for the correction. Nevertheless, it would be useful to know the size of the file and the I/O usage. Pratik, can you post the size of the pgstat.stat file, along with a few lines of iostat -x 1 output collected while the pgstat wait timeout warnings appear?

Tomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Tue, Dec 13, 2011 at 8:15 AM, Joey Adams joeyadams3.14...@gmail.com wrote: Issues we've encountered include:

* Should JSON be stored as binary or as text?

Text

* How do we deal with Unicode escapes and characters if the server or client encoding is not UTF-8? Some (common!) character encodings have code points that don't map to Unicode. Also, the charset conversion modules do not provide fast entry points for converting individual characters; each conversion involves a funcapi call.

Make JSON datatypes only selectable if the client encoding is UTF-8.

Let's JFDI.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgstat wait timeout
Size of pgstat.stat file: 86KB

I did not understand the second part. Where do I get the iostat -x 1 output? (It's not present in any file in the pg_log folder.)

I am using PostgreSQL 9.0.1.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/pgstat-wait-timeout-tp5078125p5078391.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Moving more work outside WALInsertLock
On 15.12.2011 18:48, Tom Lane wrote: Jeff Janes jeff.ja...@gmail.com writes: On Thu, Dec 15, 2011 at 7:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: This patch may or may not be useful, but this description of it is utter nonsense, because we already do compute that before taking the lock. Please try again to explain what you're doing?

Currently the CRC of all the data minus the header is computed outside the lock, but then the header's contribution is added and the CRC is finalized inside the lock.

Quite. AFAICS that is not optional,

Right, my patch did not change that.

unless you are proposing to remove the prev_link from the scope of the CRC, which is not exactly a penalty-free change.

We could CRC the rest of the record header before getting the lock, though, and only include the prev-link while holding the lock. I micro-benchmarked that a little bit, but didn't see much benefit from doing just that. Once you make more drastic changes so that the lock doesn't need to be held while copying the data and calculating the CRC of the record header, so that those things can be done in parallel, it matters even less.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgstat wait timeout
On 15 Prosinec 2011, 19:42, pratikchirania wrote: Size of pgstat.stat file: 86KB

That's pretty small.

I did not understand the second part. Where do I get the iostat -x 1 output? (It's not present in any file in the pg_log folder.)

iostat is not part of PostgreSQL; it's a tool used to display various I/O metrics on Linux (and Unix in general). What OS are you using? It seems the I/O subsystem is so busy it can't write pgstat.stat in time, so a warning is printed. You need to find out why the I/O is so overutilized.

I am using PostgreSQL 9.0.1.

That's way too old. Upgrade to 9.0.6.

Tomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
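For reference, iostat ships in the sysstat package on most Linux distributions; a rough sketch of gathering what Tomas asks for (package manager and device names will vary):

$ yum install sysstat        # or: apt-get install sysstat
$ iostat -x 1 | tee iostat.log

While the warnings are being logged, the %util and await columns for the device holding the data directory are the ones to watch.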
Re: [HACKERS] Moving more work outside WALInsertLock
On 15.12.2011 17:34, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: I've been experimenting with different approaches to do that, but one thing is common among all of them: you need to know the total amount of WAL space needed for the record, including backup blocks, before you take the lock. So, here's a patch to move things around in XLogInsert() a bit, to accomplish that.

This patch may or may not be useful, but this description of it is utter nonsense, because we already do compute that before taking the lock.

Nope. Without this patch, the total length including the backup blocks, write_len, is added up in the loop that creates the rdata entries for backup blocks. That is done while holding the lock (see the code beginning with the comment "Make additional rdata chain entries for the backup blocks"). Admittedly you could sum up the total length quite easily in the earlier loop, before we grab the lock, where we calculate the CRC of the backup blocks ("Now add the backup block headers and data into the CRC"). That would be a smaller patch.

Please try again to explain what you're doing?

Ok: I'm moving the creation of rdata entries for backup blocks outside the critical section, so that it's done before grabbing the lock. I'm also moving the CRC calculation so that it's done after all the rdata entries have been created, including the ones for backup blocks. It's more readable to do it that way, as a separate step, instead of sprinkling the COMP_CRC macros in many places.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
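A rough pseudocode sketch of the reordering Heikki describes; the helper names here are invented for illustration and are not the patch's actual code:

/* build rdata chain entries for backup blocks before the critical section */
build_backup_block_rdata(rdata);
total_len = sum_rdata_chain_lengths(rdata);   /* now known before locking */

/* one CRC pass over the completed chain, still outside the lock */
INIT_CRC32(rdata_crc);
crc_rdata_chain(&rdata_crc, rdata);

LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
/* only the prev-link contribution and the actual insertion remain here */
finish_crc_and_insert(&rdata_crc, rdata, total_len);
LWLockRelease(WALInsertLock);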
Re: [HACKERS] CommitFest 2011-11 Update
On Sat, Dec 10, 2011 at 1:07 PM, Greg Smith g...@2ndquadrant.com wrote: There's a normal sized plate lined up for Robert, since he's been keeping up better; in no particular order:

-More review on the always-a-new-surprise Fix Leaky Views Problem, again
-Extra fun with Tuplesort comparison overhead reduction
-His own avoid taking two snapshots per query and FlexLocks improvements

Greg,

Thanks for working through this. I see I still need to look more at the leaky views stuff. I think that the tuplesort comparison overhead reduction is in Peter G's hands to rebase at the moment; I will look at that again when it reemerges. I think avoid two snapshots per query is pretty much ready to go, though I'm vaguely waiting for any further input from Dimitri. I believe I'm going to punt on FlexLocks for now, for the reasons set out in the email I just sent on that topic, and accordingly am marking that patch returned with feedback.

One more time, to make sure I made my main point: thanks for working on this. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IDLE in transaction introspection
On Thu, Dec 15, 2011 at 12:10 PM, Magnus Hagander mag...@hagander.net wrote: On Thu, Dec 15, 2011 at 13:18, Greg Smith g...@2ndquadrant.com wrote: This patch seems to be closing in on being done, but it's surely going to take at least one more round of review to make sure all the naming and documentation is right. I can work on that again whenever Scott gets another version out, and Magnus is already poking around with an eye toward possibly committing the result. I think we can make this one returned with feedback for this CF, but encourage Scott to resubmit as early as feasible before the next one. With some good luck, we might even close this out before that one starts.

My hope was to get a quick update from Scott and then apply it. Feel free to set it to returned, but Scott - don't delay until the next CF; just post it when you're ready and I'll pick it up in between the two CFs when I find a moment.

Thanks guys, I should have something by this weekend. Sorry for the delay.

--Scott

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Command Triggers
On Thu, Dec 15, 2011 at 05:46:05PM +0100, Dimitri Fontaine wrote: Andres Freund and...@anarazel.de writes: Yes, we need to make a decision about that now. Do we want any "operation" to go through ProcessUtility so that hooks and command triggers can get called? I personally think that's a good idea for most stuff. I don't see that for alter table subcommands and such though.

By subcommand, I mean any operation that a main command does for you and that you could have done manually with another command, such as serial and sequences, primary key and its alter table form, embedded checks or not null and its alter table form, etc. I don't remember that ALTER TABLE implements facilities that in turn call another command for you?

When ALTER TABLE ALTER TYPE changes an indexed column, it updates the index by extracting its SQL definition, dropping it, and running that SQL. (Not sure whether it passes through the code paths to hit your trigger.) We ought not to promise that ALTER TABLE will always do so. By comparison, we could implement ALTER TABLE SET NOT NULL on an inheritance tree as several ALTER TABLE ONLY commands, but we don't implement it that way. Triggering on top-level commands can be quite well-defined; triggers on subcommands (as you have defined the term) risk firing at different times across major versions. That may just be something to document.

I think we'll need a way to tell SPI whether a command should be considered a subcommand or not. A PL/pgSQL function that calls CREATE TABLE had better fire a trigger, but some SPI usage will qualify as subcommands.

nm

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
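A concrete SQL illustration of the subcommand question (object names assumed for the example):

CREATE TABLE t (c int);
CREATE INDEX t_c_idx ON t (c);
ALTER TABLE t ALTER COLUMN c TYPE bigint;
-- ALTER TABLE internally extracts the definition of t_c_idx, drops it, and
-- recreates it. Whether a trigger on CREATE INDEX should fire for that
-- internal rebuild is exactly the question at hand.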
Re: [HACKERS] Command Triggers
On Thu, Dec 15, 2011 at 10:53 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Yes, we need to make a decision about that now. Do we want any “operation” to go through ProcessUtility so that hooks and command triggers can get called?

No. That's 100% backwards. We should first decide what functionality we want, and then decide how we're going to implement it. If we arbitrarily decide to force everything that someone might want to write a trigger on through ProcessUtility_hook, then we're going to end up being unable to add triggers for some things because they can't be conveniently forced through ProcessUtility, or else we're going to end up contorting the code in bizarre ways because we've drawn some line in the sand that ProcessUtility is the place where triggers have to get called.

In doing this project, I think we should pay a lot of attention to the lessons that have been learned developing sepgsql. I can certainly understand if your eyes roll back in your head when you hear that, because that project has been exceedingly long and difficult and shows no sign of reaching its full potential for at least another few release cycles. But I think it's really worth the effort to avoid pounding our heads against the brick wall twice. Two things that leap out at me in this regard are:

(1) It's probably a mistake to assume that we only need one interface. sepgsql has several hooks, and will need more before the dust settles. We have one hook for checking permissions on table names that appear in DML queries, a second for applying security labels just after a new SQL object is created, and a third for adjusting the security context when an sepgsql trusted procedure is invoked. In a similar way, I think it's probably futile to think that we can come up with a one-size-fits-all interface where every command (or operation) trigger can accept the same arguments. CREATE TABLE is going to want to know different stuff than LOCK TABLE or ALTER OPERATOR FAMILY, and trigger writers are going to want a *convenient* API to that information, not just the raw query text.

(2) It's almost certainly a mistake to assume that everything you want to trigger on is a command. For example, somebody might want to get control whenever a column gets added to a table, either at table creation time or later. I don't think most of us would consider CREATE TABLE bob (a int, b int) to be a create-a-table-with-no-columns operation plus two add-a-column-to-a-table operations. But being able to enforce a column naming policy is no less useful than being able to enforce a table naming policy, and there are other things you might want to do there as well (logging, updating metadata, prohibiting use of certain types, etc.).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Command Triggers
Robert Haas robertmh...@gmail.com writes: No. That's 100% backwards. We should first decide what functionality we want, and then decide how we're going to implement it. If we arbitrarily decide to force everything that someone might want to write a trigger on through ProcessUtility_hook, then we're going to end up being unable to add triggers for some things because they can't be conveniently forced through ProcessUtility, or else we're going to end up contorting the code in bizarre ways because we've drawn some line in the sand that ProcessUtility is the place where triggers have to get called.

In theory you're right. In practice, we're talking about utility command triggers that fire on a top-level command. We're now broadening the discussion to what to do with sub-commands, that is, things done by a command as part of its implementation but that you could have done separately with another, finer-grained top-level command. I don't want to implement a general “event” trigger mechanism where anyone can come and help define the set of events, and I think what you're talking about now amounts to doing that. Here again, trying to generalize before we have anything useful is a recipe for failure. I concur that “Process Utility Top-Level Only Command Triggers” is a pretty limited feature in scope, yet that's what I want to obtain here, and I think it's useful enough on its own. If you disagree, please propose a user-level scheme that the work I'm doing can fit into, so that I can adapt my patch and implement a part of your scheme in a future-proof way. I'm ready to do that even though I have no need for what you're talking about, if that's what it takes.

(1) It's probably a mistake to assume that we only need one interface. [... useful sepgsql history ...] trigger can accept the same arguments. CREATE TABLE is going to want to know different stuff than LOCK TABLE or ALTER OPERATOR FAMILY, and trigger writers are going to want a *convenient* API to that information, not just the raw query text.

Are you familiar with the nodeToString() output? That's close to a full-blown XML, JSON or YAML document that you can easily use from a program. Command trigger functions are not given just raw text. When a more complex API along the lines of what you're referencing here was being defined, back at the Cluster Hackers Developer Meeting, it was decided that the nodeToString() output is all you need. Having written code that produces it and code that consumes it, Jan has been very clear that you can hardly do better than that while still being generic enough.

to enforce a column naming policy is no less useful than being able to enforce a table naming policy, and there are other things you might want to do there as well (logging, updating metadata, prohibiting use of certain types, etc.).

Are you familiar with the nodeToString() output? What makes you think that the use cases you propose here are hard to implement in a command trigger when given this parse tree string?

Regards,

--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
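For readers unfamiliar with it, one existing way to see the nodeToString() representation being discussed is the debug_print_parse GUC; the output below is abridged and reformatted for illustration:

SET client_min_messages = log;
SET debug_print_parse = on;
CREATE TABLE bob (a int, b int);
LOG:  parse tree:
DETAIL:  ... {CREATESTMT :relation {RANGEVAR :relname bob ...}
         :tableElts ({COLUMNDEF :colname a ...} {COLUMNDEF :colname b ...}) ...}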
Re: [HACKERS] Measuring relation free space
On Sun, Nov 06, 2011 at 10:20:49PM +0100, Bernd Helmle wrote: --On 6. November 2011 01:08:11 -0200 Greg Smith g...@2ndquadrant.com wrote: Attached patch adds a new function to the pageinspect extension for measuring total free space, in either tables or indexes.

I wonder if that should be done in the pgstattuple module, which outputs some similar numbers.

Indeed, pgstattuple already claims to show precisely the same measure. Its reckoning is right in line for heaps, but the proposed pageinspect function finds more free space in indexes:

[local] test=# SELECT t.free_percent, relation_free_space('pg_proc'),
               i.free_percent, relation_free_space('pg_proc_proname_args_nsp_index')
        FROM pgstattuple('pg_proc') t,
             pgstattuple('pg_proc_proname_args_nsp_index') i;
 free_percent | relation_free_space | free_percent | relation_free_space
--------------+---------------------+--------------+---------------------
         2.53 |           0.0253346 |         8.61 |            0.128041
(1 row)

Is one of those index figures simply wrong, or do they measure two senses of free space, both of which are interesting to DBAs?

Thanks, nm

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 12/15/2011 01:34 PM, Simon Riggs wrote: On Tue, Dec 13, 2011 at 8:15 AM, Joey Adams joeyadams3.14...@gmail.com wrote: Issues we've encountered include: * Should JSON be stored as binary or as text?

Text

Works for me.

* How do we deal with Unicode escapes and characters if the server or client encoding is not UTF-8? Some (common!) character encodings have code points that don't map to Unicode. Also, the charset conversion modules do not provide fast entry points for converting individual characters; each conversion involves a funcapi call.

Make JSON datatypes only selectable if client encoding is UTF-8.

Yuck. Do we have this sort of restriction for any other data type? ISTM that the encoding problem is at least as likely to be the reverse of what's above - i.e. that there's a code point in the stored JSON that's not represented in the client encoding.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
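The reverse direction Andrew describes can already be seen with plain text, assuming a UTF-8 server encoding (exact error wording varies by version):

SET client_encoding = 'LATIN1';
SELECT U&'\4E2D';   -- a CJK code point, perfectly storable in UTF-8
ERROR:  character with byte sequence 0xe4 0xb8 0xad in encoding "UTF8" has no equivalent in encoding "LATIN1"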
Re: [HACKERS] Moving more work outside WALInsertLock
On Thu, Dec 15, 2011 at 7:06 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Please try again to explain what you're doing? Ok: I'm moving the creation of rdata entries for backup blocks outside the critical section, so that it's done before grabbing the lock. I'm also moving the CRC calculation so that it's done after all the rdata entries have been created, including the ones for backup blocks. It's more readable to do it that way, as a separate step, instead of sprinkling the COMP_CRC macros in many places. There's a comment that says we can't undo the linking of the rdata chains, but it looks like a reversible process to me. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Moving more work outside WALInsertLock
On Thu, Dec 15, 2011 at 6:50 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: unless you are proposing to remove the prev_link from the scope of the CRC, which is not exactly a penalty-free change. We could CRC the rest of the record header before getting the lock, though, and only include the prev-link while holding the lock. I micro-benchmarked that a little bit, but didn't see much benefit from doing just that. Once you make more drastic changes so that the lock doesn't need to be held while copying the data and calculating the CRC of the record header, so that those things can be done in parallel, it matters even less.

You missed your cue to discuss leaving the prev link out of the CRC altogether. On its own that sounds dangerous, but it's not. When we need to confirm the prev link we already know what we expect it to be, so CRC-ing it is overkill. That isn't true of any other part of the WAL record, so the prev link is the only thing we can relax, but that's OK because we can CRC check everything else outside of the locked section.

That isn't my idea, but I'm happy to put it on the table since I'm not shy.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RangeVarGetRelid()
On Fri, Dec 9, 2011 at 5:41 PM, Noah Misch n...@leadboat.com wrote: It also seems my last explanation didn't convey the point. Yes, nearly every command has a different set of permissions checks. However, we don't benefit equally from performing each of those checks before acquiring a lock. Consider renameatt(), which checks three things: you must own the relation, the relation must be of a supported relkind, and the relation must not be a typed table. To limit opportunities for denial of service, let's definitely perform the ownership check before taking a lock. The other two checks can wait until we hold that lock. The benefit of checking them early is to avoid making a careless relation owner wait for a lock before discovering the invalidity of his command. That's nice as far as it goes, but let's not proliferate callbacks for such a third-order benefit. I agree, but my point is that so far we have no callbacks that differ only in that detail. I accept that we'd probably want to avoid that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
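A sketch of the check split Noah describes, in the shape of a RangeVarGetRelidExtended() callback; the function name and exact checks here are assumptions for illustration:

static void
RangeVarCallbackForRename(const RangeVar *relation, Oid relId,
                          Oid oldRelId, void *arg)
{
    if (!OidIsValid(relId))
        return;                 /* relation concurrently dropped; nothing to check */

    /* cheap ownership check before any lock is taken, to limit DoS exposure */
    if (!pg_class_ownercheck(relId, GetUserId()))
        aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
                       relation->relname);

    /* relkind and typed-table checks can wait until the lock is held */
}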
Re: [HACKERS] Patch to allow users to kill their own queries
On Tue, Dec 13, 2011 at 5:59 AM, Greg Smith g...@2ndquadrant.com wrote: Same-user cancels, but not termination. Only this, and nothing more.

+1 from me on this approach. I think enough people have clamored for this simple approach, which solves the common case.

There's one obvious and questionable design decision I made to highlight. Right now the only consumers of pg_signal_backend are the cancel and terminate calls. What I did was make pg_signal_backend more permissive, adding the idea that role equivalence = allowed, and therefore granting that to anything else that might call it. And then I put a stricter check on termination. This results in a redundant check of superuser on the termination check, and the potential for mis-using pg_signal_backend. . .

I think that's OK; it should be easy enough to reorganize if more callers to pg_signal_backend() happen to come along.

The only niggling concern I have about this patch (and, I think, all similar ones proposed) is the possible race condition between the permissions checking and the actual call of kill() inside pg_signal_backend(). I fooled around with this by using gdb to attach to Backend #1 and stick a breakpoint right before the call to:

if (kill(-pid, sig))

and then ran a pg_cancel_backend() of a same-role Backend #2. Then, with the permissions checking passed, I exited Backend #2 and used a script to call fork() in a loop until my system PIDs had wrapped around to a few PIDs before the one Backend #2 had held. Then, I opened a few superuser connections until I had one with the same PID which Backend #2 previously had. After I released the breakpoint in gdb on Backend #1, it happily sent a SIGINT to my superuser backend.

I notice that BackendPidGetProc() has the comment:

... Note that it is up to the caller to be sure that the question remains meaningful for long enough for the answer to be used ...

So someone has mulled this over before. It took my script maybe 15-20 minutes to cause a wraparound by running fork() in a loop, and that wraparound would somehow have to occur within the short execution of pg_signal_backend(). I'm not super worried about the patch from a security perspective, just thought I'd point this out.

Josh

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
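A rough sketch of the PID-burning loop Josh describes (Linux assumption; the target PID is a placeholder):

#!/bin/sh
target=4242                 # the PID the exited backend used to hold
while :; do
    /bin/true &             # each background job consumes one PID
    if [ "$!" -eq "$target" ]; then
        echo "PID space wrapped around to $target"
        break
    fi
done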
Re: [HACKERS] includeifexists in configuration file
On 12/15/2011 06:54 AM, Greg Smith wrote: On 12/12/2011 04:47 PM, Andrew Dunstan wrote: I have briefly looked at the code (but not tried to apply or build it), and modulo the naming issue it looks OK to me. Unless there is some other issue let's just get it applied. It looks like almost a no-brainer to me. It isn't very fancy, but it does something people want that fits into a couple of use-cases.

Attached update has two changes to address the suggestions I got, which closes everything I knew about with this one:

-It's now include_if_exists
-Files that are skipped are logged now

So current behavior:

$ tail -n 1 postgresql.conf
include 'missing.conf'
$ start
server starting
$ tail $PGLOG
LOG: could not open configuration file /home/gsmith/pgwork/data/include-exists/missing.conf: No such file or directory
FATAL: configuration file /home/gsmith/pgwork/data/include-exists/postgresql.conf contains errors

And new behavior:

$ vi $PGDATA/postgresql.conf
$ tail -n 1 postgresql.conf
include_if_exists 'missing.conf'
$ start
server starting
$ tail $PGLOG
LOG: skipping missing configuration file /home/gsmith/pgwork/data/include-exists/missing.conf
LOG: database system was shut down at 2011-12-15 06:48:46 EST
LOG: database system is ready to accept connections

Committed. I changed the elog() call to use ereport(): you're not supposed to use elog() for things we expect might well happen and cause log entries - see the bottom of http://www.postgresql.org/docs/current/static/error-message-reporting.html. I've probably been guilty of this in the past; it's a bit too easy to forget.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
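The ereport() form in question looks roughly like this (a sketch, not necessarily the committed hunk):

ereport(LOG,
        (errcode_for_file_access(),
         errmsg("skipping missing configuration file \"%s\"",
                config_file)));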
Re: [HACKERS] Moving more work outside WALInsertLock
Simon Riggs si...@2ndquadrant.com writes: You missed your cue to discuss leaving the prev link out of the CRC altogether. On its own that sounds dangerous, but it's not. When we need to confirm the prev link we already know what we expect it to be, so CRC-ing it is overkill. That isn't true of any other part of the WAL record, so the prev link is the only thing we can relax, but that's OK because we can CRC check everything else outside of the locked section. That isn't my idea, but I'm happy to put it on the table since I'm not shy.

I'm glad it's not your idea, because it's a bad one. A large part of the point of CRC'ing WAL records is to guard against torn-page problems in the WAL files, and doing things like that would give up a significant part of that protection, because there would no longer be any assurance that the body of a WAL record had anything to do with its prev_link. Consider a scenario like this:

* We write a WAL record that starts 8 bytes before a sector boundary, so that the prev_link is in one sector and the rest of the record in the next one(s).

* Time passes, and we recycle that WAL file.

* We write another WAL record that starts 8 bytes before the same sector boundary, so that the prev_link is in one sector and the rest of the record in the next one(s).

* System crashes, after having written out the earlier sector but not the later one(s).

On restart, the replay code will see a prev_link that matches what it expects. If the CRC for the remainder of the record is not dependent on the prev_link, then the remainder of the old record will look good too, and we'll attempt to replay it, n*16MB too late.

Including the prev_link in the CRC adds a significant amount of protection against such problems. We should not remove this protection in the name of shaving a few cycles.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
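Schematically, the protection works because the prev-link participates in the record CRC; the names below are illustrative, not the actual xlog.c code:

INIT_CRC32(rdata_crc);
COMP_CRC32(rdata_crc, rdata->data, rdata->len);     /* rmgr data, backup blocks */
/* the header fields including xl_prev are covered too, so a stale record
 * body from a recycled segment cannot validate against a fresh-looking link */
COMP_CRC32(rdata_crc, (char *) &rechdr->xl_prev, header_bytes_after_crc);
FIN_CRC32(rdata_crc);
rechdr->xl_crc = rdata_crc;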
Re: [HACKERS] Storing hot members of PGPROC out of the band
On Mon, Nov 7, 2011 at 7:28 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Nov 7, 2011 at 6:45 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: While looking at GetSnapshotData(), it occurred to me that when a PGPROC entry does not have its xid set, ie. xid == InvalidTransactionId, it's pointless to check the subxid array for that proc. If a transaction has no xid, none of its subtransactions can have an xid either. That's a trivial optimization, see attached. I tested this with pgbench -S -M prepared -c 500 on the 8-core box, and it seemed to make a 1-2% difference (without the other patch). So, almost in the noise, but it also doesn't cost us anything in terms of readability or otherwise.

Oh, that's a good idea. +1 for doing that now, and then we can keep working on the rest of it.

I spent some time looking at (and benchmarking) this idea today. I rearranged that section of code a little more into what seemed like the optimal order to avoid doing more work than necessary, but I wasn't getting much of a gain out of it even on unlogged tables, and on permanent tables it was looking like a small loss, though I didn't bother letting the benchmarks run for long enough to nail that down because it didn't seem to amount to much one way or the other. I then added one more change that helped quite a lot: I introduced a macro NormalTransactionIdPrecedes() which works like TransactionIdPrecedes() but (a) is only guaranteed to work if the arguments are known to be normal transaction IDs (which it also asserts, for safety) and (b) is a macro rather than a function call. I found three places to use that inside GetSnapshotData(), and the results with that change look fairly promising.

Nate Boley's box, m = master, s = with the attached patch, median of three 5-minute runs at scale factor 100, config same as my other recent tests:

Permanent Tables
m01 tps = 617.734793 (including connections establishing)
s01 tps = 620.330099 (including connections establishing)
m08 tps = 4589.566969 (including connections establishing)
s08 tps = 4545.942588 (including connections establishing)
m16 tps = 7618.842377 (including connections establishing)
s16 tps = 7596.759619 (including connections establishing)
m24 tps = 11770.534809 (including connections establishing)
s24 tps = 11789.424152 (including connections establishing)
m32 tps = 10776.660575 (including connections establishing)
s32 tps = 10643.361817 (including connections establishing)
m80 tps = 11057.353339 (including connections establishing)
s80 tps = 10598.254713 (including connections establishing)

Unlogged Tables
m01 tps = 668.145495 (including connections establishing)
s01 tps = 676.793337 (including connections establishing)
m08 tps = 4715.214745 (including connections establishing)
s08 tps = 4737.833913 (including connections establishing)
m16 tps = 8110.607784 (including connections establishing)
s16 tps = 8192.013414 (including connections establishing)
m24 tps = 14120.753841 (including connections establishing)
s24 tps = 14302.915040 (including connections establishing)
m32 tps = 17886.032656 (including connections establishing)
s32 tps = 18735.319468 (including connections establishing)
m80 tps = 12711.930739 (including connections establishing)
s80 tps = 17592.715471 (including connections establishing)

Now, you might not initially find those numbers all that encouraging. Of course, the unlogged tables numbers are quite good, especially at 32 and 80 clients, where the gains are quite marked.
But the permanent table numbers are at best a wash, and maybe a small loss. Interestingly enough, some recent benchmarking of the FlexLocks patch showed a similar (though more pronounced) effect: http://archives.postgresql.org/pgsql-hackers/2011-12/msg00095.php Now, both the FlexLocks patch and this patch aim to mitigate contention problems around ProcArrayLock. But they do it in completely different ways. When I got a small regression on permanent tables with the FlexLocks patch, I thought the problem was with the patch itself, which is believable, since it was tinkering with the LWLock machinery, a fairly global sort of change that you can well imagine might break something. But it's hard to imagine how that could possibly be the case here, especially given the speedups on unlogged tables, because this patch is dead simple and entirely isolated to GetSnapshotData(). So I have a new theory: on permanent tables, *anything* that reduces ProcArrayLock contention causes an approximately equal increase in WALInsertLock contention (or maybe some other lock), and in some cases that increase in contention elsewhere can cost more than the amount we're saving here. On that theory, I'm inclined to think that's not really a problem. We'll go nuts if we refuse to commit anything until it shows a meaningful win on every imaginable workload, and it seems like this can't really be worse than the status quo; any case where it is must
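The NormalTransactionIdPrecedes() macro described above would look roughly like this (a sketch; the committed definition may differ):

#define NormalTransactionIdPrecedes(id1, id2) \
    (AssertMacro(TransactionIdIsNormal(id1) && TransactionIdIsNormal(id2)), \
     (int32) ((id1) - (id2)) < 0)

For two known-normal XIDs, the modulo-2^32 ordering reduces to one signed 32-bit subtraction, skipping the special-ID handling and the out-of-line function call in TransactionIdPrecedes().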
Re: [HACKERS] includeifexists in configuration file
On 12/15/2011 08:16 PM, Andrew Dunstan wrote: I changed the elog() call to use ereport(): you're not supposed to use elog() for things we expect might well happen and cause log entries - see the bottom of http://www.postgresql.org/docs/current/static/error-message-reporting.html. I've probably been guilty of this in the past; it's a bit too easy to forget.

Quite - I both knew this once and forgot it last night. There was some nagging in the back of my head that I was doing something wrong, but I couldn't place what. Happy this is committed, given that I've suggested relying upon it in the recovery.conf thread.

--
Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch to allow users to kill their own queries
On 12/15/2011 07:36 PM, Josh Kupershmidt wrote: The only niggling concern I have about this patch (and, I think, all similar ones proposed) is the possible race condition between the permissions checking and the actual call of kill() inside pg_signal_backend().

This is a problem with the existing code, though, and the proposed changes don't materially alter it; there's just another quick check in one path through. Right now we check whether someone is a superuser, then whether it's a backend PID, then we send the signal. If you assume someone can run through all the PIDs between those checks and the kill, the system is already broken that way.

I notice that BackendPidGetProc() has the comment: ... Note that it is up to the caller to be sure that the question remains meaningful for long enough for the answer to be used ... So someone has mulled this over before. It took my script maybe 15-20 minutes to cause a wraparound by running fork() in a loop, and that wraparound would somehow have to occur within the short execution of pg_signal_backend().

Right, the possibility you're raising is the obvious flaw with this approach. Currently the only consumers of BackendPidGetProc or IsBackendPid are these cancel/terminate functions. As you measured, running through the PIDs fast enough to outrace any of that code is unlikely. I'm sure a lot of other UNIX apps would break, too, if that ever became untrue. The sort of thing that comment is warning about is that you wouldn't want to do something like grab a PID, vacuum a table, and then assume that PID was still valid.

--
Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Measuring relation free space
On 12/15/2011 04:11 PM, Noah Misch wrote: Is one of those index figures simply wrong, or do they measure two senses of free space, both of which are interesting to DBAs?

I think the bigger one (the one I was aiming to measure) also includes fill-factor space. It should be possible to isolate whether that's true by running the function against a fresh index, or by trying tests with a table where there's no useful fill. I need to add some of those to the test example suite.

While in theory both measures of free space might be interesting to DBAs, I'd prefer the one that reflects the space lost to fill-factor if I'm checking an index. But as Robert Treat was pointing out, even the very rough estimates made by existing user-space tools, without the contrib module features, are helpful enough for a lot of users. So long as it's easy and accuracy is good enough to find bloated indexes, either implementation is probably good enough. Shaking out the alternative implementation ideas was really my goal for this CF.

The major goal of the next revision is to present the options with a measure of their respective accuracy and runtime. If I have to give up just a bit of accuracy to make it much faster, that's probably what most people want as an option. When Jaime and I come back with an update, it really needs to have benchmarks and accuracy numbers for each option. That may be complicated a bit by how much of the table or index is cached, so isolating that out will be a pain.

--
Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
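One way to test the fill-factor hypothesis, sketched with assumed object names (pgstattuple installed; relation_free_space() from the patch under review):

CREATE TABLE ffdemo AS SELECT generate_series(1, 100000) AS k;
CREATE INDEX ffdemo_full ON ffdemo (k) WITH (fillfactor = 100);
CREATE INDEX ffdemo_half ON ffdemo (k) WITH (fillfactor = 50);
SELECT free_percent FROM pgstattuple('ffdemo_full');
SELECT free_percent FROM pgstattuple('ffdemo_half');
-- If relation_free_space() counts fill-factor space and pgstattuple does not,
-- only the former should move sharply between the two indexes.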
Re: [HACKERS] [PATCH] Caching for stable expressions with constant arguments v3
I think what would be helpful here next is a self-contained benchmarking script. You posted an outline of what you did for the original version at http://archives.postgresql.org/message-id/cabrt9rc-1wgxzc_z5mwkdk70fgy2drx3slxzdp4vobkukpz...@mail.gmail.com

It doesn't sound like it would be hard to turn that into a little shell script. I'd rather see one from you, mainly so I don't have to worry about whether I duplicated your procedure correctly. Don't even worry about best of 3; just run it three times, grep out the line that shows the TPS number, and show them all. One of the first questions I have is whether the performance changed between there and your v5. A self-contained and fully automated performance test case would ease review, and give some performance regression testing for other suggested changes to use as a reference.

--
Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
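Something along these lines would do; every name and value here is a placeholder rather than the procedure from the earlier post:

#!/bin/sh
# crude benchmark for caching of stable expressions with constant arguments
createdb cachebench
psql -d cachebench -c "CREATE TABLE t AS SELECT generate_series(1,100000) AS i;"
cat > stable_expr.sql <<'EOF'
SELECT count(*) FROM t WHERE i < extract(epoch FROM now());
EOF
for run in 1 2 3; do
    pgbench -n -f stable_expr.sql -T 60 cachebench | grep "tps ="
done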