Re: [HACKERS] pg_basebackup vs. Windows and tablespaces
On Sun, May 10, 2015 at 6:01 AM, Andrew Dunstan <and...@dunslane.net> wrote:
> This generally looks good, but I have a couple of questions before I
> commit it.
>
> First, why is the new option for the BASE_BACKUP command of the Streaming
> Replication protocol TAR? It seems rather misleading. Shouldn't it be
> something like TABLESPACEMAP?

The reason to keep the new option's name as TAR was that tablespace_map was
generated for that format type, but I agree with you that something like
TABLESPACEMAP suits better, so I have changed it to TABLESPACE_MAP.
Putting '_' in the name makes it somewhat consistent with other names and
with the filename it generates with this new option.

> Second, these lines in xlog.c seem wrong:
>
>     else if ((ch == '\n' || ch == '\r') && prev_ch == '\\')
>         str[i-1] = '\n';
>
> It looks to me like we should be putting ch in the string, not
> arbitrarily transforming \r into \n.

You are right, I have changed it as per your suggestion.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

extend_basebackup_to_include_symlink_v7.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
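The corrected unescaping behavior discussed above can be sketched stand-alone as follows. This is a hedged illustration, not the actual xlog.c code: `unescape_newlines` is a hypothetical helper name, and the point it demonstrates is Robert's fix — store the character actually seen (`ch`), rather than always writing `'\n'` when a `'\r'` was escaped.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch (NOT the real xlog.c code): a backslash escapes a
 * following newline or carriage return inside a tablespace path.  On
 * seeing the escaped character, overwrite the preceding backslash with
 * the character itself -- ch, not a hard-coded '\n'.
 */
static void
unescape_newlines(char *str)
{
	int		i, j;
	char	prev_ch = '\0';

	for (i = 0, j = 0; str[i] != '\0'; i++)
	{
		char	ch = str[i];

		if ((ch == '\n' || ch == '\r') && prev_ch == '\\')
			str[j - 1] = ch;	/* overwrite the backslash with ch itself */
		else
			str[j++] = ch;
		prev_ch = ch;
	}
	str[j] = '\0';
}
```

With the original `str[i-1] = '\n'` coding, the second case below would incorrectly come back with a newline instead of the carriage return that was escaped.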
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/05/11 8:50, Tom Lane wrote:
> Etsuro Fujita <fujita.ets...@lab.ntt.co.jp> writes:
>> [ EvalPlanQual-v6.patch ]
>
> I've started to study this in a little more detail, and I'm not terribly
> happy with some of the API decisions in it.

Thanks for taking the time to review the patch!

> In particular, I find the addition of "void *fdw_state" to ExecRowMark
> to be pretty questionable.  That does not seem like a good place to keep
> random state.  (I realize that WHERE CURRENT OF keeps some state in
> ExecRowMark, but that's a crock not something to emulate.)  ISTM that in
> most scenarios, the state that LockForeignRow/FetchForeignRow would like
> to get at is probably the FDW state associated with the ForeignScanState
> that the tuple came from.  Which this API doesn't help with
> particularly.  I wonder if we should instead add a ScanState* field and
> expect the core code to set that up (ExecOpenScanRelation could do it
> with minor API changes...).

Sorry, I don't understand clearly what you mean, but that (the idea of
expecting the core to set it up) sounds inconsistent with your comment on
the earlier version of the API BeginForeignFetch [1].

> I'm also a bit tempted to pass the TIDs to LockForeignRow and
> FetchForeignRow as Datums not ItemPointers.  We have the Datum format
> available already at the call sites, so this is free as far as the core
> code is concerned, and would only cost another line or so for the FDWs.
> This is by no means sufficient to allow FDWs to use some other type than
> tid for row identifiers; but it would be a down payment on that problem,
> and at least would avoid nailing the rowids-are-tids assumption into yet
> another global API.

That is a good idea.

> Also, as I mentioned, I'd be a whole lot happier if we had a way to test
> this...

Attached is a postgres_fdw patch that I used for the testing.  If you try
it, edit postgresGetForeignRowMarkType as necessary.  I have to confess
that I did the testing only in the normal conditions by the patch.

Sorry for the delay.
I took a vacation until yesterday.

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/14504.1428446...@sss.pgh.pa.us

*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 88,93 **** typedef struct PgFdwRelationInfo
--- 88,95 ----
   *
   * 1) SELECT statement text to be sent to the remote server
   * 2) Integer list of attribute numbers retrieved by the SELECT
+  * 3) SELECT statement text to be sent to the remote server
+  * 4) Integer list of attribute numbers retrieved by the SELECT
   *
   * These items are indexed with the enum FdwScanPrivateIndex, so an item
   * can be fetched with list_nth().  For example, to get the SELECT statement:
***************
*** 98,104 **** enum FdwScanPrivateIndex
  	/* SQL statement to execute remotely (as a String node) */
  	FdwScanPrivateSelectSql,
  	/* Integer list of attribute numbers retrieved by the SELECT */
! 	FdwScanPrivateRetrievedAttrs
  };
  
  /*
--- 100,110 ----
  	/* SQL statement to execute remotely (as a String node) */
  	FdwScanPrivateSelectSql,
  	/* Integer list of attribute numbers retrieved by the SELECT */
! 	FdwScanPrivateRetrievedAttrs,
! 	/* SQL statement to execute remotely (as a String node) */
! 	FdwScanPrivateSelectSql2,
! 	/* Integer list of attribute numbers retrieved by SELECT */
! 	FdwScanPrivateRetrievedAttrs2
  };
  
  /*
***************
*** 186,191 **** typedef struct PgFdwModifyState
--- 192,223 ----
  } PgFdwModifyState;
  
  /*
+  * Execution state for fetching/locking foreign rows.
+  */
+ typedef struct PgFdwFetchState
+ {
+ 	Relation	rel;			/* relcache entry for the foreign table */
+ 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
+ 
+ 	/* for remote query execution */
+ 	PGconn	   *conn;			/* connection for the fetch */
+ 	char	   *p_name;			/* name of prepared statement, if created */
+ 
+ 	/* extracted fdw_private data */
+ 	char	   *query;			/* text of SELECT command */
+ 	List	   *retrieved_attrs;	/* attr numbers retrieved by SELECT */
+ 
+ 	/* info about parameters for prepared statement */
+ 	int			p_nums;			/* number of parameters to transmit */
+ 	FmgrInfo   *p_flinfo;		/* output conversion functions for them */
+ 
+ 	HeapTuple	locked_tuple;
+ 
+ 	/* working memory context */
+ 	MemoryContext temp_cxt;		/* context for per-tuple temporary data */
+ } PgFdwFetchState;
+ 
+ /*
   * Workspace for analyzing a foreign table.
   */
  typedef struct PgFdwAnalyzeState
***************
*** 276,281 **** static TupleTableSlot *postgresExecForeignDelete(EState *estate,
--- 308,320 ----
  static void postgresEndForeignModify(EState *estate,
  					 ResultRelInfo *resultRelInfo);
  static int	postgresIsForeignRelUpdatable(Relation rel);
+ static RowMarkType postgresGetForeignRowMarkType(LockClauseStrength strength);
+ static bool postgresLockForeignRow(EState *estate,
+ 					   ExecRowMark *erm,
+ 					   ItemPointer tupleid);
+ static HeapTuple
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
On Sun, May 10, 2015 at 11:07 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2015-05-10 22:51:33 -0400, Robert Haas wrote:
>>> And there's definitely some things around that pretty much only still
>>> exist because changing them would break too much stuff.
>>
>> Such as what?
>
> Without even thinking about it:
>
> * linitial vs lfirst vs lnext. That thing still induces an impedance
>   mismatch when reading code for me, and I believe a good number of
>   other people.
> * Two 'string buffer' APIs with essentially only minor differences.
> * A whole bunch of libpq APIs. Admittedly that's a bit more exposed than
>   lots of backend only things.
> * The whole V0 calling convention that makes it so much easier to get
>   odd crashes.
>
> Admittedly that's all I could come up without having to think. But I do
> vaguely remember a lot of things we did not do because of bwcompat
> concerns.

I see your point, but I don't think it really detracts from mine.  The
fact that we have a few inconsistently-named list functions is not
preventing any core development project that would otherwise get completed
from getting completed.  Nor is any of that other stuff, except maybe the
libpq API, but that's a lot (not just a bit) more exposed.  Also, I'd
actually be in favor of looking for a way to unify the StringInfo and
PQexpBuffer stuff - and of partially deprecating the V0 calling
convention.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] unaddressable bytes in BRIN
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> What I think this means is that during an index build
> brin_minmax_add_value() called numeric_lt() which detoasted one of its
> input values; later, brin_doinsert() inserts a tuple containing the
> value, but it tries to use more bytes than were allocated.  I haven't
> had time to actually study what is going on here, but wanted to archive
> this publicly.  (Value detoasting evidently plays a role here, but I
> don't know how.)

I went ahead and added this to the 9.5 open items list.

Thanks!

Stephen
Re: [HACKERS] a fast bloat measurement tool (was Re: Measuring relation free space)
Hi Andres.

I've attached an updated patch for pgstatbloat, as well as a patch to
replace two uses of BuildTupleFromCStrings() elsewhere in pgstattuple.

I've made the changes you mentioned in your earlier mail, except that I
have not changed the name pending further suggestions about what would be
the best name.  Also:

At 2015-05-09 15:36:49 +0530, a...@2ndquadrant.com wrote:
> At 2015-05-09 02:20:51 +0200, and...@anarazel.de wrote:
>> I haven't checked, but I'm not sure that it's safe/meaningful to call
>> PageGetHeapFreeSpace() on a new page.
>
> OK, I'll check and fix if necessary.

You're right, PageGetHeapFreeSpace() isn't safe on a new page.  I've added
a guard to that call in the attached patch, but I'm not sure that's the
right thing to do.  Should I copy the orphaned new-page handling from
lazy_scan_heap?  What about for empty pages?  Neither feels like a very
good thing to do, but then neither does skipping the new page silently.
Should I count the space it would have free if it were initialised, but
leave the page alone for VACUUM to deal with?  Or just leave it as it is?
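The "new page" hazard being discussed can be illustrated with a simplified, stand-alone sketch. The types below (`FakePageHeader`, `fake_page_is_new`, `fake_free_space`) are illustrative stand-ins, not the real bufpage.h definitions: an all-zeroes page has `pd_upper == 0`, which is how `PageIsNew()` detects it, and free-space arithmetic on such a page is meaningless, so the caller must guard the call and decide what to report (here, the whole usable page is counted as free, one of the options raised above).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FAKE_BLCKSZ 8192		/* stand-in for BLCKSZ */

/* simplified stand-in for PostgreSQL's PageHeaderData */
typedef struct FakePageHeader
{
	uint16_t	pd_lower;		/* offset to start of free space */
	uint16_t	pd_upper;		/* offset to end of free space; 0 if new */
} FakePageHeader;

/* an all-zeroes (never-initialized) page has pd_upper == 0 */
static int
fake_page_is_new(const FakePageHeader *p)
{
	return p->pd_upper == 0;
}

/* guarded free-space computation: don't do pointer arithmetic on a new page */
static unsigned
fake_free_space(const FakePageHeader *p)
{
	if (fake_page_is_new(p))
		return FAKE_BLCKSZ - sizeof(FakePageHeader);	/* count it all free */
	return (unsigned) (p->pd_upper - p->pd_lower);
}
```

Without the guard, `pd_upper - pd_lower` on a zeroed page is 0, silently under-reporting the reusable space the page represents.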
-- 
Abhijit

From 421267bba8394255feed8f9b9424d25d0be9f979 Mon Sep 17 00:00:00 2001
From: Abhijit Menon-Sen <a...@2ndquadrant.com>
Date: Mon, 11 May 2015 15:54:48 +0530
Subject: Make pgstattuple use heap_form_tuple instead of BuildTupleFromCStrings

---
 contrib/pgstattuple/pgstatindex.c | 43 ++---
 contrib/pgstattuple/pgstattuple.c | 45 ++-
 2 files changed, 37 insertions(+), 51 deletions(-)

diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index a2ea5d7..608f729 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -257,7 +257,8 @@ pgstatindex_impl(Relation rel, FunctionCallInfo fcinfo)
 {
 	TupleDesc	tupleDesc;
 	int			j;
-	char	   *values[10];
+	Datum		values[10];
+	bool		nulls[10];
 	HeapTuple	tuple;
 
 	/* Build a tuple descriptor for our result type */
@@ -265,33 +266,31 @@ pgstatindex_impl(Relation rel, FunctionCallInfo fcinfo)
 		elog(ERROR, "return type must be a row type");
 
 	j = 0;
-	values[j++] = psprintf("%d", indexStat.version);
-	values[j++] = psprintf("%d", indexStat.level);
-	values[j++] = psprintf(INT64_FORMAT,
-						   (indexStat.root_pages +
-							indexStat.leaf_pages +
-							indexStat.internal_pages +
-							indexStat.deleted_pages +
-							indexStat.empty_pages) * BLCKSZ);
-	values[j++] = psprintf("%u", indexStat.root_blkno);
-	values[j++] = psprintf(INT64_FORMAT, indexStat.internal_pages);
-	values[j++] = psprintf(INT64_FORMAT, indexStat.leaf_pages);
-	values[j++] = psprintf(INT64_FORMAT, indexStat.empty_pages);
-	values[j++] = psprintf(INT64_FORMAT, indexStat.deleted_pages);
+	values[j++] = Int32GetDatum(indexStat.version);
+	values[j++] = Int32GetDatum(indexStat.level);
+	values[j++] = Int64GetDatum((indexStat.root_pages +
+								 indexStat.leaf_pages +
+								 indexStat.internal_pages +
+								 indexStat.deleted_pages +
+								 indexStat.empty_pages) * BLCKSZ);
+	values[j++] = Int32GetDatum(indexStat.root_blkno);
+	values[j++] = Int32GetDatum(indexStat.internal_pages);
+	values[j++] = Int32GetDatum(indexStat.leaf_pages);
+	values[j++] = Int32GetDatum(indexStat.empty_pages);
+	values[j++] = Int32GetDatum(indexStat.deleted_pages);
 
 	if (indexStat.max_avail > 0)
-		values[j++] = psprintf("%.2f",
-							   100.0 - (double) indexStat.free_space / (double) indexStat.max_avail * 100.0);
+		values[j++] = Float8GetDatum(100.0 - (double) indexStat.free_space / (double) indexStat.max_avail * 100.0);
 	else
-		values[j++] = pstrdup("NaN");
+		values[j++] = CStringGetDatum("NaN");
 
 	if (indexStat.leaf_pages > 0)
-		values[j++] = psprintf("%.2f",
-							   (double) indexStat.fragments / (double) indexStat.leaf_pages * 100.0);
+		values[j++] = Float8GetDatum((double) indexStat.fragments / (double) indexStat.leaf_pages * 100.0);
 	else
-		values[j++] = pstrdup("NaN");
+		values[j++] = CStringGetDatum("NaN");
 
-	tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc),
-								   values);
+	for (j = 0; j < 10; j++)
+		nulls[j] = false;
+	tuple = heap_form_tuple(tupleDesc, values, nulls);
 
 	result = HeapTupleGetDatum(tuple);
 }

diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index c3a8b1d..552f188 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -85,28 +85,20 @@ static Datum
 build_pgstattuple_type(pgstattuple_type *stat, FunctionCallInfo fcinfo)
 {
 #define NCOLUMNS	9
-#define NCHARS		32
 
 	HeapTuple	tuple;
-	char	   *values[NCOLUMNS];
-	char		values_buf[NCOLUMNS][NCHARS];
+	Datum		values[NCOLUMNS];
+	bool		nulls[NCOLUMNS];
 	int			i;
 	double		tuple_percent;
 	double		dead_tuple_percent;
 	double		free_percent;	/* free/reusable space in % */
 	TupleDesc	tupdesc;
-	AttInMetadata *attinmeta;
 
 	/* Build a tuple descriptor for our result type
Re: [HACKERS] multixacts woes
On Mon, May 11, 2015 at 12:56 AM, Noah Misch <n...@leadboat.com> wrote:
> On Sun, May 10, 2015 at 09:17:58PM -0400, Robert Haas wrote:
>> On Sun, May 10, 2015 at 1:40 PM, Noah Misch <n...@leadboat.com> wrote:
>>> I don't know whether this deserves prompt remediation, but if it
>>> does, I would look no further than the hard-coded 25% figure.  We
>>> permit users to operate close to XID wraparound design limits.  GUC
>>> maximums force an anti-wraparound vacuum at no later than 93.1% of
>>> design capacity.  XID assignment warns at 99.5%, then stops at 99.95%.
>>> PostgreSQL mandates a larger cushion for pg_multixact/offsets, with
>>> anti-wraparound VACUUM by 46.6% and a stop at 50.0%.  Commit
>>> 53bb309d2d5a9432d2602c93ed18e58bd2924e15 introduced the bulkiest
>>> mandatory cushion yet, an anti-wraparound vacuum when
>>> pg_multixact/members is just 25% full.
>>
>> That's certainly one possible approach.  I had discounted it because
>> you can't really get more than a small multiple out of it, but getting
>> 2-3x more room might indeed be enough to help some people quite a bit.
>> Just raising the threshold from 25% to say 40% would buy back a healthy
>> amount.
>
> Right.  It's fair to assume that the new VACUUM burden would be
> discountable at a 90+% threshold, because the installations that could
> possibly find it expensive are precisely those experiencing corruption
> today.  These reports took eighteen months to appear, whereas some
> corruption originating in commit 0ac5ad5 saw reports within three
> months.  Therefore, sites burning pg_multixact/members proportionally
> faster than both pg_multixact/offsets and XIDs must be unusual.  Bottom
> line: if we do need to reduce VACUUM burden caused by the commits you
> cited upthread, we almost certainly don't need more than a 4x
> improvement.

I looked into the approach of adding a GUC called
autovacuum_multixact_freeze_max_members to set the threshold.  I thought
to declare it this way:

 	{
+		{"autovacuum_multixact_freeze_max_members", PGC_POSTMASTER, AUTOVACUUM,
+			gettext_noop("# of multixact members at which autovacuum is forced to prevent multixact member wraparound."),
+			NULL
+		},
+		&autovacuum_multixact_freeze_max_members,
+		20, 1000, 40,
+		NULL, NULL, NULL
+	},

Regrettably, I think that's not going to work, because 40 overflows int.
We will evidently need to denote this GUC in some other units, unless we
want to introduce config_int64.

Given your concerns, and the need to get a fix for this out the door
quickly, what I'm inclined to do for the present is go bump the threshold
from 25% of MaxMultiXact to 50% of MaxMultiXact without changing anything
else.  Your analysis shows that this is more in line with the existing
policy for multixact IDs than what I did, and it will reduce the threat
of frequent wraparound scans.  Now, it will also increase the chances of
somebody hitting the wall before autovacuum can bail them out.  But maybe
not that much.  If we need 75% of the multixact member space to complete
one cycle of anti-wraparound vacuums, we're actually very close to the
point where the system just cannot work.  If that's one big table, we're
done.

Also, if somebody does have a workload where the auto-clamping doesn't
provide them with enough headroom, they can still improve things by
reducing autovacuum_multixact_freeze_max_age to a value less than the
value to which we're auto-clamping it.  If they need an effective value
of less than 10 million they are out of luck, but if that is the case
then there is a good chance that they are hosed anyway - an
anti-wraparound vacuum every 10 million multixacts sounds awfully
painful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
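The threshold arithmetic behind this discussion can be checked with a back-of-the-envelope sketch. This is a hedged illustration with made-up names (`MAX_MEMBERS`, `member_vacuum_threshold`), not PostgreSQL source: pg_multixact/members is addressed by a 32-bit offset, so its capacity is 2^32 member slots, and the hard-coded fraction determines when anti-wraparound vacuum kicks in.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* total member-space capacity: a 32-bit offset addresses 2^32 slots */
#define MAX_MEMBERS ((uint64_t) UINT32_MAX + 1)

/* members consumed before a forced anti-wraparound vacuum, at pct% full */
static uint64_t
member_vacuum_threshold(unsigned pct)
{
	return MAX_MEMBERS * pct / 100;
}
```

Going from 25% to 50% doubles the headroom before a forced vacuum, and the 50% figure (2^31 members) already exceeds `INT_MAX` — which is the same overflow problem that rules out expressing a member-count limit as a plain `int` GUC.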
Re: [HACKERS] LOCK TABLE Permissions
All,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost <sfr...@snowman.net> writes:
>> 	if (lockmode == AccessShareLock)
>> 		aclresult = pg_class_aclcheck(reloid, GetUserId(),
>> 									  ACL_SELECT);
>> +	else if (lockmode == RowExclusiveLock)
>> +		aclresult = pg_class_aclcheck(reloid, GetUserId(),
>> +									  ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
>> 	else
>> 		aclresult = pg_class_aclcheck(reloid, GetUserId(),
>> 									  ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
>
> Perhaps it would be better to refactor with a local variable for the
> aclmask and just one instance of the pg_class_aclcheck call.  Also, I'm
> pretty sure that the documentation work needed is more extensive than
> the actual patch ;-).  Otherwise, I don't see a problem with this.

Now for a blast from the past...  This came up again on IRC recently and
reminded me that I ran into the same issue a couple years back.  Updated
patch includes the refactoring suggested and includes documentation.  Not
going to be back-patched, as discussed with Robert.

Barring objections, I'll push this later today.

Thanks!

Stephen

From d2b0fbc9fd8c0783f870fef3c901f7f8891c6e90 Mon Sep 17 00:00:00 2001
From: Stephen Frost <sfr...@snowman.net>
Date: Mon, 11 May 2015 09:14:49 -0400
Subject: [PATCH] Allow LOCK TABLE .. ROW EXCLUSIVE MODE with INSERT

INSERT acquires RowExclusiveLock during normal operation and therefore
it makes sense to allow LOCK TABLE .. ROW EXCLUSIVE MODE to be executed
by users who have INSERT rights on a table (even if they don't have
UPDATE or DELETE).

Not back-patching this as it's a behavior change which, strictly
speaking, loosens security restrictions.

Per discussion with Tom and Robert (circa 2013).
---
 doc/src/sgml/ref/lock.sgml      |  8 +---
 src/backend/commands/lockcmds.c | 12 
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/lock.sgml b/doc/src/sgml/ref/lock.sgml
index 913afe7..b946eab 100644
--- a/doc/src/sgml/ref/lock.sgml
+++ b/doc/src/sgml/ref/lock.sgml
@@ -161,9 +161,11 @@ LOCK [ TABLE ] [ ONLY ] <replaceable class="PARAMETER">name</replaceable> [ * ]
   <para>
    <literal>LOCK TABLE ... IN ACCESS SHARE MODE</> requires <literal>SELECT</>
-   privileges on the target table.  All other forms of <command>LOCK</>
-   require table-level <literal>UPDATE</>, <literal>DELETE</>, or
-   <literal>TRUNCATE</> privileges.
+   privileges on the target table.  <literal>LOCK TABLE ... IN ROW EXCLUSIVE
+   MODE</> requires <literal>INSERT</>, <literal>UPDATE</>, <literal>DELETE</>,
+   or <literal>TRUNCATE</> privileges on the target table.  All other forms of
+   <command>LOCK</> require table-level <literal>UPDATE</>, <literal>DELETE</>,
+   or <literal>TRUNCATE</> privileges.
   </para>
 
   <para>
diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
index bdec2ff..a167082 100644
--- a/src/backend/commands/lockcmds.c
+++ b/src/backend/commands/lockcmds.c
@@ -169,13 +169,17 @@ static AclResult
 LockTableAclCheck(Oid reloid, LOCKMODE lockmode)
 {
 	AclResult	aclresult;
+	AclMode		aclmask;
 
 	/* Verify adequate privilege */
 	if (lockmode == AccessShareLock)
-		aclresult = pg_class_aclcheck(reloid, GetUserId(),
-									  ACL_SELECT);
+		aclmask = ACL_SELECT;
+	else if (lockmode == RowExclusiveLock)
+		aclmask = ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE;
 	else
-		aclresult = pg_class_aclcheck(reloid, GetUserId(),
-									  ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
+		aclmask = ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE;
+
+	aclresult = pg_class_aclcheck(reloid, GetUserId(), aclmask);
+
 	return aclresult;
 }
-- 
1.9.1
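The shape of the refactoring Tom suggested — compute one ACL bitmask per lock mode, then do a single privilege check — can be sketched stand-alone. The enum and `ACL_*` values below are illustrative stand-ins, not PostgreSQL's actual definitions; the point is the mode-to-mask mapping, including the behavior change that `INSERT` alone now suffices for ROW EXCLUSIVE.

```c
#include <assert.h>

/* illustrative stand-ins for PostgreSQL's lock modes and ACL bits */
typedef enum
{
	SKETCH_ACCESS_SHARE,
	SKETCH_ROW_EXCLUSIVE,
	SKETCH_OTHER
} SketchLockMode;

#define ACL_SELECT		(1 << 0)
#define ACL_INSERT		(1 << 1)
#define ACL_UPDATE		(1 << 2)
#define ACL_DELETE		(1 << 3)
#define ACL_TRUNCATE	(1 << 4)

/* map the requested lock mode to the set of privileges, any one of
 * which is sufficient to take the lock */
static int
required_acl_mask(SketchLockMode mode)
{
	switch (mode)
	{
		case SKETCH_ACCESS_SHARE:
			return ACL_SELECT;
		case SKETCH_ROW_EXCLUSIVE:
			/* the behavior change: INSERT alone now qualifies */
			return ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE;
		default:
			return ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE;
	}
}
```

Folding the mask computation into one variable means there is exactly one `pg_class_aclcheck()` call site, which is what the committed version of the patch does.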
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita <fujita.ets...@lab.ntt.co.jp> writes:
> On 2015/05/11 8:50, Tom Lane wrote:
>> In particular, I find the addition of "void *fdw_state" to ExecRowMark
>> to be pretty questionable.  That does not seem like a good place to
>> keep random state.  (I realize that WHERE CURRENT OF keeps some state
>> in ExecRowMark, but that's a crock not something to emulate.)  ISTM
>> that in most scenarios, the state that LockForeignRow/FetchForeignRow
>> would like to get at is probably the FDW state associated with the
>> ForeignScanState that the tuple came from.  Which this API doesn't
>> help with particularly.  I wonder if we should instead add a
>> ScanState* field and expect the core code to set that up
>> (ExecOpenScanRelation could do it with minor API changes...).
>
> Sorry, I don't understand clearly what you mean, but that (the idea of
> expecting the core to set it up) sounds inconsistent with your comment
> on the earlier version of the API BeginForeignFetch [1].

Well, the other way that it could work would be for the FDW's
BeginForeignScan routine to search for a relevant ExecRowMark entry
(which, per that previous discussion, it'd likely need to do anyway) and
then plug a back-link to the ForeignScanState into the ExecRowMark.  But
I don't see any very good reason to make every FDW that's concerned with
this do that, rather than doing it once in the core code.  I'm also
thinking that having a link to the source scan node there might be
useful someday for regular tables as well as FDWs.

>> Also, as I mentioned, I'd be a whole lot happier if we had a way to
>> test this...
>
> Attached is a postgres_fdw patch that I used for the testing.  If you
> try it, edit postgresGetForeignRowMarkType as necessary.  I have to
> confess that I did the testing only in the normal conditions by the
> patch.

Thanks, this is helpful.  However, it pretty much proves my point about
wanting the back-link --- it seems like init_foreign_fetch_state, for
example, is uselessly repeating a lot of the setup already done for the
scan node itself.
			regards, tom lane
Re: [HACKERS] tzdata and 9.4.2, etc.
On 4 May 2015 at 22:42, David Fetter <da...@fetter.org> wrote:
> Folks,
>
> We are now three tzdata changes behind.  There are bugs in pg_dump
> which create real restore errors for people using PostGIS, one of our
> most popular extensions.
>
> Can we please wrap and ship 9.4.2, etc., and do it soon?

+1

-- 
Thom
Re: [HACKERS] multixacts woes
On Mon, May 11, 2015 at 08:29:05AM -0400, Robert Haas wrote:
> Given your concerns, and the need to get a fix for this out the door
> quickly, what I'm inclined to do for the present is go bump the
> threshold from 25% of MaxMultiXact to 50% of MaxMultiXact without
> changing anything else.

+1

> Your analysis shows that this is more in line with the existing policy
> for multixact IDs than what I did, and it will reduce the threat of
> frequent wraparound scans.  Now, it will also increase the chances of
> somebody hitting the wall before autovacuum can bail them out.  But
> maybe not that much.  If we need 75% of the multixact member space to
> complete one cycle of anti-wraparound vacuums, we're actually very
> close to the point where the system just cannot work.  If that's one
> big table, we're done.

Agreed.
Re: [HACKERS] tzdata and 9.4.2, etc.
On Mon, May 11, 2015 at 10:13 AM, Thom Brown <t...@linux.com> wrote:
>> We are now three tzdata changes behind.  There are bugs in pg_dump
>> which create real restore errors for people using PostGIS, one of our
>> most popular extensions.
>>
>> Can we please wrap and ship 9.4.2, etc., and do it soon?
>
> +1

I'm more concerned about the latest round of multixact bugs than I am
about the tzdata, but I think that's at a point now where we can go
ahead and schedule a release.  And I think we should.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
I wrote:
> Etsuro Fujita <fujita.ets...@lab.ntt.co.jp> writes:
>> On 2015/05/11 8:50, Tom Lane wrote:
>>> I wonder if we should instead add a ScanState* field and expect the
>>> core code to set that up (ExecOpenScanRelation could do it with minor
>>> API changes...).
>>
>> Sorry, I don't understand clearly what you mean, but that (the idea of
>> expecting the core to set it up) sounds inconsistent with your comment
>> on the earlier version of the API BeginForeignFetch [1].
>
> Well, the other way that it could work would be for the FDW's
> BeginForeignScan routine to search for a relevant ExecRowMark entry
> (which, per that previous discussion, it'd likely need to do anyway)
> and then plug a back-link to the ForeignScanState into the ExecRowMark.
> But I don't see any very good reason to make every FDW that's concerned
> with this do that, rather than doing it once in the core code.  I'm
> also thinking that having a link to the source scan node there might be
> useful someday for regular tables as well as FDWs.

And on the third hand ... in view of the custom/foreign join pushdown
stuff that just went in, it would be a mistake to assume that there
necessarily is a distinct source scan node associated with each
ExecRowMark.  The FDW can presumably find all the ExecRowMarks associated
with the rels that a join ForeignScan is scanning, but we can't assume
that ExecOpenScanRelation will be able to set up those links, and the FDW
might not want a simple link to the join scan node anyway.  So it's
probably best to leave it as an unspecified void* instead of trying to
nail down the meaning more precisely.

I still don't much like calling it fdw_state though, because I don't
think it should be documented as only for the use of FDWs.  (Custom
scans might have a use for a pointer field here too, for example.)
Maybe just call it "void *extra" and document it as being for the use of
whatever plan node is sourcing the relation's tuples.
			regards, tom lane
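The design being settled on above — a generic opaque pointer on the row mark rather than an FDW-only field — can be sketched with simplified stand-in types. These are not the real execnodes.h definitions (`SketchRowMark`, `SketchScanState` are hypothetical names); the sketch only shows the ownership convention: whatever plan node sources the relation's tuples, be it a ForeignScan, a CustomScan, or someday a plain scan, installs a back-link to its own state.

```c
#include <assert.h>
#include <stddef.h>

/* stand-in for a plan node's execution state (e.g. ForeignScanState) */
typedef struct SketchScanState
{
	const char *node_type;		/* e.g. "ForeignScan" */
} SketchScanState;

/* stand-in for ExecRowMark, with the generic "extra" field instead of
 * an FDW-specific fdw_state */
typedef struct SketchRowMark
{
	unsigned	rti;			/* range-table index of the relation */
	void	   *extra;			/* for use by the tuple-sourcing node */
} SketchRowMark;

/* the sourcing node plugs a back-link to itself into the row mark, so
 * later row-locking callbacks can reach its already-initialized state */
static void
sketch_set_source(SketchRowMark *erm, SketchScanState *ss)
{
	erm->extra = ss;
}
```

The payoff discussed in the thread is that a LockForeignRow/FetchForeignRow-style callback can follow `extra` back to state the scan node already set up, instead of repeating that setup.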
Re: [HACKERS] Postgres GSSAPI Encryption
Stephen Frost <sfr...@snowman.net> writes:
> Robbie,
>
> * Robbie Harwood (rharw...@redhat.com) wrote:
>> We'd I think also want a new kind of HBA entry (probably something
>> along the lines of `hostgss` to contrast with `hostssl`), but I'm not
>> sure what we'd want to do for the counterpart of `hostnossl`
>> (`hostnogss`?  But then do we need `hostnogssnossl`?  Is this even a
>> useful setting to have?), so that will probably require broader
>> discussion.
>
> Are you intending to support GSSAPI encryption *without* using the
> GSSAPI authentication mechanism?  If not, maybe we can come up with a
> way to have an option to the GSSAPI auth mechanism that behaves the way
> we want for GSSAPI encryption.  Perhaps something like:
>
>     encryption = (none | request | require)
>
> If you need an option for it to still be able to fall-thru to the next
> pg_hba line, that'd be more difficult, but is that really a sensible
> use-case?

That's a good point.  I don't see GSSAPI encryption without GSSAPI
authentication as a particularly compelling use case, so a setting like
that (with default presumably to request for backward compatibility)
seems perfect.

As for fall-through on failure, I don't really know what's reasonable
here.  My understanding of the way it works currently suggests that it
would be *expected* to fall-through based on the way other things behave,
though it's certainly less effort on my part if it does not.

>> Finally, I think there are a couple different ways the protocol could
>> look.  I'd ideally like to opportunistically encrypt all
>> GSSAPI-authenticated connections and fallback to non-encrypted when
>> the other end doesn't support it (possibly dropping the connection if
>> this causes it to not match HBA), but I'm not sure if this will work
>> with the existing code.  A (less-nice) option could be to add a new
>> authentication method for gss-encryption, which feels aesthetically
>> misplaced.  The approach used for SSL today could probably be adopted
>> as a third approach, but I don't really see a gain from doing it this
>> way since encryption happens after authentication (opposite of SSL) in
>> GSSAPI.
>
> I'd definitely like to see us opportunistically encrypt all GSSAPI
> authenticated connections also.  The main case to consider is how to
> support migrating users who are currently using GSSAPI + SSL to GSSAPI
> auth+encryption, and how to support them if they want to continue to
> use GSSAPI just for user auth and use SSL for encryption.

So if we go with the encryption option, we might not need a specific
entry for GSS hosts.  For the SSL encryption/GSS auth case, we'd want it
to work the way it does now where you specify hostssl and then gss as the
method.  Then for the GSS for auth and encryption, one would use a normal
host entry, then specify gss as the method, and then set
encryption=require.

One consequence of this would be that you can do double encryption by
specifying hostssl, then gss with encrypt = require.  I don't think this
is something more desirable than just one or the other, but unless it's
actually a problem (and I don't think it is) to have around, I think the
setup would work nicely.  We could also default encrypt to none when
hostssl is specified if it is a problem.

Thanks!

--Robbie
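Pulling the proposal together, a pg_hba.conf under the syntax sketched in this thread might look like the following. Note this is a hypothetical illustration of the *proposed* (not implemented) `encryption` option; only the hostssl+gss combination works as described today.

```
# SSL for transport encryption, GSSAPI only for user auth (works today):
hostssl  all  all  0.0.0.0/0  gss

# GSSAPI for both auth and encryption (proposed):
host     all  all  0.0.0.0/0  gss  encryption=require

# opportunistic GSSAPI encryption, falling back to cleartext
# (the proposed backward-compatible default):
host     all  all  0.0.0.0/0  gss  encryption=request
```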
Re: [HACKERS] LOCK TABLE Permissions
On Mon, May 11, 2015 at 10:32 PM, Stephen Frost <sfr...@snowman.net> wrote:
> Now for a blast from the past...  This came up again on IRC recently
> and reminded me that I ran into the same issue a couple years back.
> Updated patch includes the refactoring suggested and includes
> documentation.
>
> Barring objections, I'll push this later today.

Small suggestion: a test case in src/test/isolation?

-- 
Michael
Re: [HACKERS] LOCK TABLE Permissions
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Mon, May 11, 2015 at 10:32 PM, Stephen Frost <sfr...@snowman.net> wrote:
>> Now for a blast from the past...  This came up again on IRC recently
>> and reminded me that I ran into the same issue a couple years back.
>> Updated patch includes the refactoring suggested and includes
>> documentation.
>>
>> Barring objections, I'll push this later today.
>
> Small suggestion: a test case in src/test/isolation?

This is entirely a permissions-related change and src/test/isolation is
for testing concurrent behavior, not about testing permissions.  I'm not
saying that we shouldn't have more tests there, but it'd not be
appropriate for this particular patch.

Thanks!

Stephen
Re: [HACKERS] initdb start server recommendation
On Fri, May 1, 2015 at 10:14:13AM -0400, Bruce Momjian wrote:
> Currently initdb outputs suggested text on starting the server:
>
>     Success. You can now start the database server using:
>
>         /u/pgsql/bin/postgres -D /u/pgsql/data
>     or
>         /u/pgsql/bin/pg_ctl -D /u/pgsql/data -l logfile start
>
> I am now thinking pg_ctl should be recommended first.  At the time this
> text was written pg_ctl was new.

I have applied a patch to initdb to only recommend the pg_ctl method.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ Everyone has their own god. +
Re: [HACKERS] a fast bloat measurement tool (was Re: Measuring relation free space)
At 2015-05-11 19:15:47 +0200, and...@anarazel.de wrote: TBH, I'd rather not touch unrelated things right now. We're pretty badly behind... OK. That patch is independent; just ignore it. I don't really care how it's named, as long as it makes clear that it's not an exact measurement. Not having heard any better suggestions, I picked pgstatapprox as a compromise between length and familiarity/consistency with pgstattuple. Should I count the space it would have free if it were initialised, but leave the page alone for VACUUM to deal with? And this is what the attached patch does. I also cleaned up a few things that I didn't like but had left alone to make the code look similar to pgstattuple. In particular, build_tuple() now does nothing but build a tuple from values calculated earlier in pgstatapprox_heap(). Thank you. -- Abhijit P.S. What, if anything, should be done about the complicated and likely not very useful skip-only-min#-blocks logic in lazy_scan_heap?

From 0a99fc61d36e3a3ff4003db95e5c299a5f07a850 Mon Sep 17 00:00:00 2001
From: Abhijit Menon-Sen a...@2ndquadrant.com
Date: Fri, 26 Dec 2014 12:37:13 +0530
Subject: Add pgstatapprox to pgstattuple

---
 contrib/pgstattuple/Makefile                       |   4 +-
 contrib/pgstattuple/pgstatapprox.c                 | 277 +
 contrib/pgstattuple/pgstattuple--1.2--1.3.sql      |  18 ++
 .../{pgstattuple--1.2.sql => pgstattuple--1.3.sql} |  18 +-
 contrib/pgstattuple/pgstattuple.control            |   2 +-
 doc/src/sgml/pgstattuple.sgml                      | 135 ++
 6 files changed, 450 insertions(+), 4 deletions(-)
 create mode 100644 contrib/pgstattuple/pgstatapprox.c
 create mode 100644 contrib/pgstattuple/pgstattuple--1.2--1.3.sql
 rename contrib/pgstattuple/{pgstattuple--1.2.sql => pgstattuple--1.3.sql} (72%)

diff --git a/contrib/pgstattuple/Makefile b/contrib/pgstattuple/Makefile
index 862585c..6083dab 100644
--- a/contrib/pgstattuple/Makefile
+++ b/contrib/pgstattuple/Makefile
@@ -1,10 +1,10 @@
 # contrib/pgstattuple/Makefile
 MODULE_big = pgstattuple
-OBJS = pgstattuple.o pgstatindex.o $(WIN32RES)
+OBJS = pgstattuple.o pgstatindex.o pgstatapprox.o $(WIN32RES)
 EXTENSION = pgstattuple
-DATA = pgstattuple--1.2.sql pgstattuple--1.1--1.2.sql pgstattuple--1.0--1.1.sql pgstattuple--unpackaged--1.0.sql
+DATA = pgstattuple--1.3.sql pgstattuple--1.2--1.3.sql pgstattuple--1.1--1.2.sql pgstattuple--1.0--1.1.sql pgstattuple--unpackaged--1.0.sql
 PGFILEDESC = "pgstattuple - tuple-level statistics"
 REGRESS = pgstattuple

diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
new file mode 100644
index 000..46f5bc0
--- /dev/null
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -0,0 +1,277 @@
+/*
+ * contrib/pgstattuple/pgstatapprox.c
+ *	  Fast bloat estimation functions
+ */
+
+#include "postgres.h"
+
+#include "access/visibilitymap.h"
+#include "access/transam.h"
+#include "access/xact.h"
+#include "access/multixact.h"
+#include "access/htup_details.h"
+#include "catalog/namespace.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/bufmgr.h"
+#include "storage/freespace.h"
+#include "storage/procarray.h"
+#include "storage/lmgr.h"
+#include "utils/builtins.h"
+#include "utils/tqual.h"
+#include "commands/vacuum.h"
+
+PG_FUNCTION_INFO_V1(pgstatapprox);
+
+typedef struct output_type
+{
+	uint64		table_len;
+	uint64		scanned_percent;
+	uint64		tuple_count;
+	uint64		tuple_len;
+	double		tuple_percent;
+	uint64		dead_tuple_count;
+	uint64		dead_tuple_len;
+	double		dead_tuple_percent;
+	uint64		free_space;
+	double		free_percent;
+} output_type;
+
+#define NUM_OUTPUT_COLUMNS 10
+
+/*
+ * This function takes an already open relation and scans its pages,
+ * skipping those that have the corresponding visibility map bit set.
+ * For pages we skip, we find the free space from the free space map
+ * and approximate tuple_len on that basis. For the others, we count
+ * the exact number of dead tuples etc.
+ *
+ * This scan is loosely based on vacuumlazy.c:lazy_scan_heap(), but
+ * we do not try to avoid skipping single pages.
+ */
+static void
+pgstatapprox_heap(Relation rel, output_type *stat)
+{
+	BlockNumber scanned,
+				nblocks,
+				blkno;
+	Buffer		vmbuffer = InvalidBuffer;
+	BufferAccessStrategy bstrategy;
+	TransactionId OldestXmin;
+	uint64		misc_count = 0;
+
+	OldestXmin = GetOldestXmin(rel, true);
+	bstrategy = GetAccessStrategy(BAS_BULKREAD);
+
+	nblocks = RelationGetNumberOfBlocks(rel);
+	scanned = 0;
+
+	for (blkno = 0; blkno < nblocks; blkno++)
+	{
+		Buffer		buf;
+		Page		page;
+		OffsetNumber offnum,
+					maxoff;
+		Size		freespace;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * If the page has only visible tuples, then we can find out the
+		 * free space from the FSM and move on.
+		 */
+		if (visibilitymap_test(rel, blkno, &vmbuffer))
+		{
+			freespace = GetRecordedFreeSpace(rel, blkno);
+			stat->tuple_len += BLCKSZ - freespace;
+			stat->free_space += freespace;
+			continue;
+		}
+
+		buf
[HACKERS] Multixid hindsight design
I'd like to discuss how we should've implemented the infamous 9.3 multixid/row-locking stuff, and perhaps still should in 9.6. Hindsight is always 20/20 - I'll readily admit that I didn't understand the problems until well after the release - so this isn't meant to bash what's been done. Rather, let's think of the future.

The main problem with the infamous multixid changes was that it made pg_multixact a permanent, critical, piece of data. Without it, you cannot decipher whether some rows have been deleted or not. The 9.3 changes uncovered pre-existing issues with vacuuming and wraparound, but the fact that multixids are now critical turned those otherwise relatively harmless bugs into data loss. We have pg_clog, which is a similar critical data structure. That's a pain too - you need VACUUM and you can't easily move tables from one cluster to another for example - but we've learned to live with it. But we certainly don't need any more such data structures. So the lesson here is that having a permanent pg_multixact is not nice, and we should get rid of it. Here's how to do that:

Looking at the tuple header, the CID and CTID fields are only needed when either xmin or xmax is running. Almost: in a HOT-updated tuple, CTID is required even after xmax has committed, but since it's a HOT update, the new tuple is always on the same page so you only need the offsetnumber part. That leaves us with 8 bytes that are always available for storing ephemeral information. By ephemeral, I mean that it is only needed when xmin or xmax is in-progress. After that, e.g. after a shutdown, it's never looked at. Let's add a new SLRU, called Tuple Ephemeral Data (TED). It is addressed by a 64-bit pointer, which means that it never wraps around. That 64-bit pointer is stored in the tuple header, in those 8 ephemeral bytes currently used for CID and CTID.
Whenever a tuple is deleted/updated and locked at the same time, a TED entry is created for it, in the new SLRU, and the pointer to the entry is put on the tuple. In the TED entry, we can use as many bytes as we need to store the ephemeral data. It would include the CID (or possibly both CMIN and CMAX separately, now that we have the space), CTID, and the locking XIDs. The list of locking XIDs could be stored there directly, replacing multixids completely, or we could store a multixid there, and use the current pg_multixact system to decode them. Or we could store the multixact offset in the TED, replacing the multixact offset SLRU, but keep the multixact member SLRU as is. The XMAX stored on the tuple header would always be a real transaction ID, not a multixid. Hence locked-only tuples don't need to be frozen afterwards. The beauty of this would be that the TED entries can be zapped at restart, just like pg_subtrans, and pg_multixact before 9.3. It doesn't need to be WAL-logged, and we are free to change its on-disk layout even in a minor release. Further optimizations are possible. If the TED entry fits in 8 bytes, it can be stored directly in the tuple header. Like today, if a tuple is locked but not deleted/updated, you only need to store the locker XID, and you can store the locking XID directly on the tuple. Or if it's deleted and locked, CTID is not needed, only CID and locker XID, so you can store those direcly on the tuple. Plus some spare bits to indicate what is stored. And if the XMIN is older than global-xmin, you could also steal the XMIN field for storing TED data, making it possible to store 12 bytes directly in the tuple header. Plus some spare bits again to indicate that you've done that. Now, given where we are, how do we get there? Upgrade is a pain, because even if we no longer generate any new multixids, we'll have to be able to decode them after pg_upgrade. 
Perhaps condense pg_multixact into a simpler pg_clog-style bitmap at pg_upgrade, to make it small and simple to read, but it would nevertheless be a fair amount of code just to deal with pg_upgraded databases. I think this is worth doing, even after we've fixed all the acute multixid bugs, because this would be more robust in the long run. It would also remove the need to do anti-wraparound multixid vacuums, and the newly-added tuning knobs related to that. - Heikki
Re: [HACKERS] Multi-xacts and our process problem
On 05/12/2015 12:00 AM, Bruce Momjian wrote: Multi-xacts were made durable in Postgres 9.3 (released 2013-09-09) to allow primary-key-column-only locks. 1.7 years later, we are still dealing with bugs related to this feature. Obviously, something is wrong. There were many 9.3 minor releases containing multi-xacts fixes, and these fixes have extended into 9.4. After the first few bug-fix releases, I questioned whether we needed to revert or rework the feature, but got no positive response. Only in the past few weeks have we got additional people involved. The revert or rework ship had already sailed at that point. I don't think we had much choice than just soldier through the bugs after the release. I think we now know that our inaction didn't serve us well. The question is how can we identify chronic problems and get resources involved sooner. I feel we have been asleep at the wheel to some extent on this. Yeah. I think the problem was that no-one realized that this was a significant change to the on-disk format. It was deceptively backwards-compatible. When it comes to permanent on-disk structures, we should all be more vigilant in the review. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Multi-xacts and our process problem
On 05/11/2015 02:00 PM, Bruce Momjian wrote: I think we now know that our inaction didn't serve us well. The question is how can we identify chronic problems and get resources involved sooner. I feel we have been asleep at the wheel to some extent on this. Here are some options Slow down the release cycle The shortness of the release cycle puts a preference on adding features versus providing a more mature outcome. or Increase the release cycle Moving to a Ubuntu style release cycle would allow us to have a window to scratch the itch but with the eventual (and known) release of something that is LTS. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing I'm offended is basically telling the world you can't control your own emotions, so everyone else should do it for you. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Multi-xacts and our process problem
On Mon, May 11, 2015 at 02:11:48PM -0700, Joshua Drake wrote: On 05/11/2015 02:00 PM, Bruce Momjian wrote: I think we now know that our inaction didn't serve us well. The question is how can we identify chronic problems and get resources involved sooner. I feel we have been asleep at the wheel to some extent on this. Here are some options Slow down the release cycle The shortness of the release cycle puts a preference on adding features versus providing a more mature outcome. or Increase the release cycle Moving to a Ubuntu style release cycle would allow us to have a window to scratch the itch but with the eventual (and known) release of something that is LTS. The releases themselves are not the problem, but rather the volume of bugs and our slowness in getting additional people involved to remove data corruption bugs more quickly and systematically. Our reputation for reliability has been harmed by this inactivity.
[HACKERS] Re: [BUGS] BUG #13148: Unexpected deferred EXCLUDE constraint violation on derived table
On Mon, May 11, 2015 at 9:47 AM, Andres Freund and...@anarazel.de wrote: On 2015-05-10 16:01:53 -0400, Tom Lane wrote: The cause of the problem seems to be that the UPDATE performs a HOT update of the new tuple, leaving in this case a dead tuple at (0,2) that is HOT updated by (0,3). When unique_key_recheck() is invoked for (0,2), it believes, correctly, that it has to perform the recheck anyway ... but it tells check_exclusion_constraint that the check is being performed for (0,2). So the index search inside check_exclusion_constraint finds the live tuple at (0,3) and thinks that is a conflict. Heh, it's curious that this wasn't found up until now. I also wonder if this might be related to the spurious violations Peter G. has been observing... I don't think so. Speculative insertion relies on the assumption that the speculatively inserted tuple isn't MVCC visible to other sessions. I actually prototyped an implementation that avoided the historic unprincipled deadlocks of exclusion constraints (a known limitation since they were added), by making *UPDATE* also do a speculative insertion, and by making even non-UPSERT INSERTs insert speculatively. This almost worked, but when time came to back out of a speculative insertion on an UPDATE due to a conflict from a concurrent session, the implementation couldn't handle it - it was just a mess to try and figure out how that was supposed to work with heap_update(), and so that prototype was scrapped. For the benefit of those not closely involved in the ON CONFLICT project, I should point out that ON CONFLICT will not accept a deferred index as an arbiter index. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Multi-xacts and our process problem
Multi-xacts were made durable in Postgres 9.3 (released 2013-09-09) to allow primary-key-column-only locks. 1.7 years later, we are still dealing with bugs related to this feature. Obviously, something is wrong. There were many 9.3 minor releases containing multi-xacts fixes, and these fixes have extended into 9.4. After the first few bug-fix releases, I questioned whether we needed to revert or rework the feature, but got no positive response. Only in the past few weeks have we got additional people involved. I think we now know that our inaction didn't serve us well. The question is how can we identify chronic problems and get resources involved sooner. I feel we have been asleep at the wheel to some extent on this. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LOCK TABLE Permissions
All, * Stephen Frost (sfr...@snowman.net) wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: if (lockmode == AccessShareLock) aclresult = pg_class_aclcheck(reloid, GetUserId(), ACL_SELECT); + else if (lockmode == RowExclusiveLock) + aclresult = pg_class_aclcheck(reloid, GetUserId(), +ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE); else aclresult = pg_class_aclcheck(reloid, GetUserId(), ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE); Perhaps it would be better to refactor with a local variable for the aclmask and just one instance of the pg_class_aclcheck call. Also, I'm pretty sure that the documentation work needed is more extensive than the actual patch ;-). Otherwise, I don't see a problem with this. Now for a blast from the past... This came up again on IRC recently and reminded me that I ran into the same issue a couple years back. Updated patch includes the refactoring suggested and includes documentation. Not going to be back-patched, as discussed with Robert. Barring objections, I'll push this later today. Done, finally. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] LOCK TABLE Permissions
On Tue, May 12, 2015 at 4:01 AM, Stephen Frost sfr...@snowman.net wrote: * Michael Paquier (michael.paqu...@gmail.com) wrote: On Mon, May 11, 2015 at 10:32 PM, Stephen Frost sfr...@snowman.net wrote: Now for a blast from the past... This came up again on IRC recently and reminded me that I ran into the same issue a couple years back. Updated patch includes the refactoring suggested and includes documentation. Barring objections, I'll push this later today. Small suggestion: a test case in src/test/isolation? This is entirely a permissions-related change and src/test/isolation is for testing concurrent behavior, not about testing permissions. I'm not saying that we shouldn't have more tests there, but it'd not be appropriate for this particular patch. Perhaps. Note that we could have tests of this type though in lock.sql: create role foo login; create table aa (a int); grant insert on aa TO foo; \c postgres foo begin; insert into aa values (1); lock table aa in row exclusive mode; -- should pass Just mentioning it for the sake of the archives, I cannot work on that for now. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On Thu, May 7, 2015 at 4:57 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Thu, May 7, 2015 at 11:02 AM, Stephen Frost sfr...@snowman.net wrote: I realize it's not going to be popular, but I'd love to have 'trust' only allowed if a command-line option is passed to the postmaster or something along those lines. It's really got no business being an option for a network service like PG. I disagree wholeheartedly. There is such a thing as a trusted network. Likely a good topic of conversation to be had in Ottawa. :) I agree that there are trusted networks, but the ones that I work with still expect network services to require authentication and authorization. Perhaps they're not really trusted then, from your perspective. On the other hand, I suppose if you use pg_hba to limit which accounts can be logged into with 'trust' then you might be able to have, say, a read-only user/database that anyone could see. That's a pretty narrow case though and I'd rather we figure out how to address it directly and more specifically (no-password login roles?) than the broad disable-all-authentication trust method. Let's suppose that you have an application server and a DB server running on the same node. That turns out to be too much load, so you move the application server to a separate machine and connect the two machines with a crossover cable, or a VLAN that has nothing else on it. To me, it's quite sane to want connections on that network to proceed without authentication or authorization. If you've got to open up the database more than that then, yes, you need authentication and authorization. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LOCK TABLE Permissions
* Michael Paquier (michael.paqu...@gmail.com) wrote: On Tue, May 12, 2015 at 4:01 AM, Stephen Frost sfr...@snowman.net wrote: * Michael Paquier (michael.paqu...@gmail.com) wrote: On Mon, May 11, 2015 at 10:32 PM, Stephen Frost sfr...@snowman.net wrote: Now for a blast from the past... This came up again on IRC recently and reminded me that I ran into the same issue a couple years back. Updated patch includes the refactoring suggested and includes documentation. Barring objections, I'll push this later today. Small suggestion: a test case in src/test/isolation? This is entirely a permissions-related change and src/test/isolation is for testing concurrent behavior, not about testing permissions. I'm not saying that we shouldn't have more tests there, but it'd not be appropriate for this particular patch. Perhaps. Note that we could have tests of this type though in lock.sql: create role foo login; create table aa (a int); grant insert on aa TO foo; \c postgres foo begin; insert into aa values (1); lock table aa in row exclusive mode; -- should pass Yeah, it might not be bad to have tests for all the different lock types and make sure that the permissions match up. I'd probably put those tests into 'permissions.sql' instead though. Just mentioning it for the sake of the archives, I cannot work on that for now. Ditto. I'm trying to work through the postgres_fdw UPDATE push-down patch now.. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] deparsing utility commands
On 5/8/15 3:29 PM, Alvaro Herrera wrote: Here is a complete version. Barring serious problems, I intend to commit this on Monday. I have reviewed and tested this patch and everything looks good to me. It also looks like you added better coverage for schema DDL, which is a welcome addition. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] multixacts woes
On 05/11/2015 10:24 AM, Josh Berkus wrote: In terms of adding a new GUC in 9.5: can't we take a stab at auto-tuning this instead of adding a new GUC? We already have a bunch of freezing GUCs which fewer than 1% of our user base has any idea how to set. That is a documentation problem, not a user problem. Although I agree that yet another GUC for an obscure feature that should be internally intelligent is likely the wrong direction. JD
Re: [HACKERS] pg_basebackup vs. Windows and tablespaces
On 05/11/2015 02:02 AM, Amit Kapila wrote: On Sun, May 10, 2015 at 6:01 AM, Andrew Dunstan and...@dunslane.net wrote: This generally looks good, but I have a couple of questions before I commit it. First, why is the new option for the BASE_BACKUP command of the Streaming Replication protocol TAR? It seems rather misleading. Shouldn't it be something like TABLESPACEMAP? The reason to keep the new option's name as TAR was that tablespace_map was generated for that format type, but I agree with you that something like TABLESPACEMAP suits better, so I have changed it to TABLESPACE_MAP. Putting '_' in the name makes it somewhat consistent with other names and with the filename it generates with this new option. Second, these lines in xlog.c seem wrong: else if ((ch == '\n' || ch == '\r') && prev_ch == '\\') str[i-1] = '\n'; It looks to me like we should be putting ch in the string, not arbitrarily transforming \r into \n. You are right, I have changed it as per your suggestion. OK, I have cleaned this up a bit - I had already started so I didn't take your latest patch but instead applied relevant changes to my changeset. Here is my latest version. In testing I notice that now pg_basebackup -F t leaves it completely up to the user on all platforms to create the relevant links in pg_tblspc/. It includes the tablespace_map file in base.tar, but that's really just informational. I think we need to add something to the pg_basebackup docs about that, at the very least (and it will also need to be a release note item.) cheers andrew diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index e25e0d0..def43a2 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -836,8 +836,11 @@ SELECT pg_start_backup('label'); <function>pg_start_backup</> creates a <firstterm>backup label</> file, called <filename>backup_label</>, in the cluster directory with information about your backup, including the start time and label - string.
The file is critical to the integrity of the backup, should - you need to restore from it. + string. The function also creates a <firstterm>tablespace map</> file, + called <filename>tablespace_map</>, in the cluster directory with + information about tablespace symbolic links in <filename>pg_tblspc/</> + if one or more such link is present. Both files are critical to the + integrity of the backup, should you need to restore from it. </para> <para> @@ -965,17 +968,20 @@ SELECT pg_stop_backup(); <para> It's also worth noting that the <function>pg_start_backup</> function -makes a file named <filename>backup_label</> in the database cluster -directory, which is removed by <function>pg_stop_backup</>. -This file will of course be archived as a part of your backup dump file. -The backup label file includes the label string you gave to -<function>pg_start_backup</>, as well as the time at which -<function>pg_start_backup</> was run, and the name of the starting WAL -file. In case of confusion it is therefore possible to look inside a -backup dump file and determine exactly which backup session the dump file -came from. However, this file is not merely for your information; its -presence and contents are critical to the proper operation of the system's -recovery process. +makes files named <filename>backup_label</> and +<filename>tablespace_map</> in the database cluster directory, +which are removed by <function>pg_stop_backup</>. These files will of +course be archived as a part of your backup dump file. The backup label +file includes the label string you gave to <function>pg_start_backup</>, +as well as the time at which <function>pg_start_backup</> was run, and +the name of the starting WAL file. In case of confusion it is therefore +possible to look inside a backup dump file and determine exactly which +backup session the dump file came from. The tablespace map file includes +the symbolic link names as they exist in the directory +<filename>pg_tblspc/</> and the full path of each symbolic link.
+These files are not merely for your information; their presence and +contents are critical to the proper operation of the system's recovery +process. </para> <para> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index fb39731..24d43d9 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -16591,11 +16591,12 @@ SELECT set_config('log_statement_stats', 'off', false); <function>pg_start_backup</> accepts an arbitrary user-defined label for the backup. (Typically this would be the name under which the backup dump file will be stored.) The function -writes a backup label file (<filename>backup_label</>) into the -database cluster's data directory, performs a checkpoint, -and then returns the backup's starting transaction log location as text. -The user
Re: [HACKERS] Fixing busted citext function declarations
On May 11, 2015, at 5:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: Me too. Something fell through the cracks rather badly there :-(. Would you check your commit history to see if anything else got missed? Let’s see… In https://github.com/theory/citext/commit/4030b4e1ad9fd9f994a6cdca1126a903682acae4 I copied your use of specifying the full path to the pg_catalog function, which is still in core. In https://github.com/theory/citext/commit/c24132c098a822f5a8669ed522e747e01e1c0835, I made some tweaks based on a change you made to some version of my patch. Most are minor, or just for functions needed for 8.4 and not later versions. In https://github.com/theory/citext/commit/2c7e997fd60e2b708d06c128e5fd2db51c7a9f33, I added a cast to bpchar, which is in core. In https://github.com/theory/citext/commit/cf988024d18a6ddd9a8146ab8cabfe6e0167ba26 and https://github.com/theory/citext/commit/22f91a0d50003a0c1c27d1fbf0bb5c0a1e3a3cad I switched from VARSIZE_ANY_EXHDR() to strlen() at your suggestion. Also still there. Anyway, those are all from 2008 and pretty much just copy changes you made to core. The return value of regexp_matches() is the only significant change since then. So I think we’re good. Best, David
Re: [HACKERS] Multi-xacts and our process problem
On Tue, May 12, 2015 at 12:29:56AM +0300, Heikki Linnakangas wrote: On 05/12/2015 12:00 AM, Bruce Momjian wrote: Multi-xacts were made durable in Postgres 9.3 (released 2013-09-09) to allow primary-key-column-only locks. 1.7 years later, we are still dealing with bugs related to this feature. Obviously, something is wrong. There were many 9.3 minor releases containing multi-xacts fixes, and these fixes have extended into 9.4. After the first few bug-fix releases, I questioned whether we needed to revert or rework the feature, but got no positive response. Only in the past few weeks have we got additional people involved. The revert or rework ship had already sailed at that point. I True. don't think we had much choice than just soldier through the bugs after the release. The problem is we soldiered on without adding any resources to the problem or doing a systematic review once it became clear one was necessary. I think we now know that our inaction didn't serve us well. The question is how can we identify chronic problems and get resources involved sooner. I feel we have been asleep at the wheel to some extent on this. Yeah. I think the problem was that no-one realized that this was a significant change to the on-disk format. It was deceptively backwards-compatible. When it comes to permanent on-disk structures, we should all be more vigilant in the review. Yes, and the size/age of the patch helped mask problems too. Are these the lessons we need to learn? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-04-30 05:35:26 +0100, Andrew Gierth wrote: Andres == Andres Freund and...@anarazel.de writes: Andres This is not a real review. I'm just scanning through the Andres patch, without reading the thread, to understand if I see Andres something worthy of controversy. While scanning I might have Andres a couple observations or questions. + * A list of grouping sets which is structurally equivalent to a ROLLUP + * clause (e.g. (a,b,c), (a,b), (a)) can be processed in a single pass over + * ordered data. We do this by keeping a separate set of transition values + * for each grouping set being concurrently processed; for each input tuple + * we update them all, and on group boundaries we reset some initial subset + * of the states (the list of grouping sets is ordered from most specific to + * least specific). One AGG_SORTED node thus handles any number of grouping + * sets as long as they share a sort order. Andres Found initial subset not very clear, even if I probably Andres guessed the right meaning. How about: * [...], and on group boundaries we reset those states * (starting at the front of the list) whose grouping values have * changed (the list of grouping sets is ordered from most specific to * least specific). One AGG_SORTED node thus handles any number [...] sounds good. + * To handle multiple grouping sets that _don't_ share a sort order, we use + * a different strategy. An AGG_CHAINED node receives rows in sorted order + * and returns them unchanged, but computes transition values for its own + * list of grouping sets. At group boundaries, rather than returning the + * aggregated row (which is incompatible with the input rows), it writes it + * to a side-channel in the form of a tuplestore. Thus, a number of + * AGG_CHAINED nodes are associated with a single AGG_SORTED node (the + * chain head), which creates the side channel and, when it has returned + * all of its own data, returns the tuples from the tuplestore to its own + * caller. 
 Andres> This paragraph deserves to be expanded imo.

OK, but what in particular needs clarifying?

I'm not sure ;). I obviously was a bit tired...

 Andres> Are you intending to resolve this before an eventual commit? ...
 Andres> Possibly after the 'structural' issues are resolved? Or do you
 Andres> think this can safely be put off for another release?

I think the feature is useful even without AGG_HASHED, even though that means it can sometimes be beaten on performance by using UNION ALL of many separate GROUP BYs; but I'd defer to the opinions of others on that point.

I agree.

 Andres> * Arguably this converts the execution *tree* into a DAG. Tom
 Andres> seems to be rather uncomfortable with that. I am wondering
 Andres> whether this really is a big deal - essentially this only
 Andres> happens in a relatively 'isolated' part of the tree right?
 Andres> I.e. if those chained together nodes were considered one node,
 Andres> there would not be any loops? Additionally, the way
 Andres> parametrized scans work already essentially violates the
 Andres> tree paradigm somewhat.

The major downsides as I see them with the current approach are:

1. It makes plans (and hence explain output) nest very deeply if you have complex grouping sets (especially cubes with high dimensionality).

That doesn't concern me overly much. If we feel the need to fudge the explain output we certainly can.

2. A union-based approach would have a chance of including AGG_HASHED support without any significant code changes, just by using one HashAgg node per qualifying grouping set. However, this would be potentially significantly slower than teaching HashAgg to do multiple grouping sets, and memory usage would be an issue.

Your "however" imo pretty much disqualifies that as an argument.

The obvious alternative is this:

-> CTE x
   -> entire input subplan here
-> Append
   -> GroupAggregate
      -> Sort
         -> CTE Scan x
   -> GroupAggregate
      -> Sort
         -> CTE Scan x
   -> HashAggregate
      -> CTE Scan x
[...]
which was basically what we expected to do originally. But all of the existing code to deal with CTE / CTEScan is based on the assumption that each CTE has a rangetable entry before planning starts, and it is completely non-obvious how to arrange to generate such CTEs on the fly while planning. Tom offered in December to look into that aspect for us, and of course we've heard nothing about it since.

I find Noah's argument about this kind of structure pretty convincing. We'd either increase the number of reads, or balloon the amount of memory if we'd manage to find a structure that'd allow several of the aggregates to be computed at the same
Re: [HACKERS] Final Patch for GROUPING SETS
I think the problem is just that for each variable, in each grouping set - a very large number in a large cube - we're recursing through the whole ChainAggregate tree, as each Var just points to a var one level lower.

For small values of "very large", that is. Had a little thinko there. It's still the fault of recursing down all these levels, doing nontrivial work each time.

--
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Multi-xacts and our process problem
On Tue, May 12, 2015 at 4:55 AM, Andres Freund and...@anarazel.de wrote: On 2015-05-11 19:04:32 -0400, Tom Lane wrote: I think there's nobody, or at least very few people, who are getting paid to find/fix bugs rather than write cool new features. This is problematic. It doesn't help when key committers are overwhelmed by trying to process other peoples' patches. (And no, I'm not sure that appoint more committers would improve matters. What we've got is too many barely-good-enough patches. Tweaking the process to let those into the tree faster will not result in better quality.) +many Except perhaps that I'd expand find/fix bugs to include review and integrate patches. Because I think few people are paid to do that either.

Well said; another thing to add to your point is helping to support other people's ideas by providing use cases and/or a more robust design that can be accepted by the community.

I think one of the reasons for this is that there is no reasonable guarantee that if a person spends a good amount of time on review, on helping other patches in the design phase, and on fixing bugs, his or her feature patches will be given more priority, which makes it difficult to bargain with one's manager or company for more time to get involved in such activities. I think if the current development process included some form of prioritization of feature patches from people who spend more time helping with other patches and with maintenance, it could improve the situation. Currently, we do have something in the CF process which suggests that a person has to review patches of equal number and complexity as he or she submits, but I am not sure that is followed strictly, or that it is sufficient.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Multi-xacts and our process problem
Robert Haas wrote: On Mon, May 11, 2015 at 7:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: I think there's nobody, or at least very few people, who are getting paid to find/fix bugs rather than write cool new features. This is problematic. It doesn't help when key committers are overwhelmed by trying to process other peoples' patches. (And no, I'm not sure that appoint more committers would improve matters. What we've got is too many barely-good-enough patches. Tweaking the process to let those into the tree faster will not result in better quality.) I agree, although generally I think committers are responsible for fixing what they commit, and I've certainly dropped everything a few times to do so. And people who will someday become committers are generally the sorts of people who do that, too. Perhaps we've relied overmuch on that in some cases - e.g. I really haven't paid much attention to the multixact stuff until lately, because I assumed that it was Alvaro's problem. And maybe that's not right. But I know that when a serious bug is found in something I committed, I expect that if anyone else fixes it, that's a bonus. For the record, I share the responsibility over committed items principle, and I adhere to it to as full an extent as possible. Whenever possible I try to enlist the submitter's help for a fix, but if they do not respond I consider whatever fix to be on me. (I have dropped everything to get fixes done, on several occasions.) As for multixacts, since it's what brings up this thread, many of you realize that the amount of time I have spent fixing issues post-facto is enormous. If I had a glimpse of the effort that the bugfixing would cost, I would have certainly dropped it -- spending more time on it before commit was out of the question. I appreciate the involvement of others in the fixes that became necessary. One lesson I have learned from all this is to try to limit the additional complexity from any individual patch. 
--
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
[HACKERS] Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?
PostgreSQL (>= 9.4) tries to spread the buffer writes of a checkpoint smoothly over checkpoint_completion_target (of checkpoint_timeout or checkpoint_segments), but when we use synchronous_commit=off there is a little problem with the checkpoint_segments target: xlog is written fast (because of the full-page write that goes with the first write of each page after a checkpoint), so the checkpointer can't sleep and the buffer writes are not smooth. Here is a test:

# stap -DMAXSKIPPED=10 -v 1 -e '
global s_var, e_var, stat_var;

/* probe smgr__md__read__start(ForkNumber, BlockNumber, Oid, Oid, Oid, int); */
probe process("/opt/pgsql/bin/postgres").mark("smgr__md__read__start") {
  s_var[pid(),1] = gettimeofday_us()
}

/* probe smgr__md__read__done(ForkNumber, BlockNumber, Oid, Oid, Oid, int, int, int); */
probe process("/opt/pgsql/bin/postgres").mark("smgr__md__read__done") {
  e_var[pid(),1] = gettimeofday_us()
  if ( s_var[pid(),1] > 0 )
    stat_var[pid(),1] <<< e_var[pid(),1] - s_var[pid(),1]
}

/* probe smgr__md__write__start(ForkNumber, BlockNumber, Oid, Oid, Oid, int); */
probe process("/opt/pgsql/bin/postgres").mark("smgr__md__write__start") {
  s_var[pid(),2] = gettimeofday_us()
}

/* probe smgr__md__write__done(ForkNumber, BlockNumber, Oid, Oid, Oid, int, int, int); */
probe process("/opt/pgsql/bin/postgres").mark("smgr__md__write__done") {
  e_var[pid(),2] = gettimeofday_us()
  if ( s_var[pid(),2] > 0 )
    stat_var[pid(),2] <<< e_var[pid(),2] - s_var[pid(),2]
}

probe process("/opt/pgsql/bin/postgres").mark("buffer__sync__start") {
  printf("buffer__sync__start num_buffers: %d, dirty_buffers: %d\n", $NBuffers, $num_to_write)
}

probe process("/opt/pgsql/bin/postgres").mark("checkpoint__start") {
  printf("checkpoint start\n")
}

probe process("/opt/pgsql/bin/postgres").mark("checkpoint__done") {
  printf("checkpoint done\n")
}

probe timer.s(1) {
  foreach ([v1,v2] in stat_var+) {
    if ( @count(stat_var[v1,v2]) > 0 ) {
      printf("r1_or_w2 %d, pid: %d, min: %d, max: %d, avg: %d, sum: %d, count: %d\n",
             v2, v1, @min(stat_var[v1,v2]), @max(stat_var[v1,v2]),
             @avg(stat_var[v1,v2]), @sum(stat_var[v1,v2]),
             @count(stat_var[v1,v2]))
    }
  }
  printf("--end-\n")
  delete s_var
  delete e_var
  delete stat_var
}'

Use the test table and data:

create table tbl(id int primary key, info text, crt_time timestamp);
insert into tbl select generate_series(1,5000), now(), now();

Use pgbench to test it.

$ vi test.sql
\setrandom id 1 5000
update tbl set crt_time=now() where id = :id ;

$ pgbench -M prepared -n -r -f ./test.sql -P 1 -c 28 -j 28 -T 1

When an on-schedule checkpoint occurs, the tps:

progress: 255.0 s, 58152.2 tps, lat 0.462 ms stddev 0.504
progress: 256.0 s, 31382.8 tps, lat 0.844 ms stddev 2.331
progress: 257.0 s, 14615.5 tps, lat 1.863 ms stddev 4.554
progress: 258.0 s, 16258.4 tps, lat 1.652 ms stddev 4.139
progress: 259.0 s, 17814.7 tps, lat 1.526 ms stddev 4.035
progress: 260.0 s, 14573.8 tps, lat 1.825 ms stddev 5.592
progress: 261.0 s, 16736.6 tps, lat 1.600 ms stddev 5.018
progress: 262.0 s, 19060.5 tps, lat 1.448 ms stddev 4.818
progress: 263.0 s, 20553.2 tps, lat 1.290 ms stddev 4.146
progress: 264.0 s, 26223.0 tps, lat 1.042 ms stddev 3.711
progress: 265.0 s, 31953.0 tps, lat 0.836 ms stddev 2.837
progress: 266.0 s, 43396.1 tps, lat 0.627 ms stddev 1.615
progress: 267.0 s, 50487.8 tps, lat 0.533 ms stddev 0.647
progress: 268.0 s, 53537.7 tps, lat 0.502 ms stddev 0.598
progress: 269.0 s, 54259.3 tps, lat 0.496 ms stddev 0.624
progress: 270.0 s, 56139.8 tps, lat 0.479 ms stddev 0.524

The parameters for the on-schedule checkpoint:

checkpoint_segments = 512
checkpoint_timeout = 5min
checkpoint_completion_target = 0.9

From stap's output: there are 156467 dirty blocks. We can see the buffer writes per second; the writes are not smooth across the time target, but follow the xlog target. A smooth rate would be 156467/(5*60*0.9) = 579.5 writes per second.
checkpoint start
buffer__sync__start num_buffers: 262144, dirty_buffers: 156467
r1_or_w2 2, pid: 19848, min: 41, max: 1471, avg: 49, sum: 425291, count: 8596
--end-
r1_or_w2 2, pid: 19848, min: 41, max: 153, avg: 49, sum: 450597, count: 9078
--end-
r1_or_w2 2, pid: 19848, min: 41, max: 643, avg: 51, sum: 429193, count: 8397
--end-
r1_or_w2 2, pid: 19848, min: 41, max: 1042, avg: 55, sum: 449091, count: 8097
--end-
r1_or_w2 2, pid: 19848, min: 41, max: 254, avg: 52, sum: 296668, count: 5617
--end-
r1_or_w2 2, pid: 19848, min: 39, max: 171, avg: 54, sum: 321027, count: 5851
--end-
r1_or_w2 2, pid: 19848, min: 41, max: 138, avg: 60, sum: 300056, count: 4953
--end-
r1_or_w2 2, pid: 19848, min: 42, max:
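[Aside for readers: the throttling decision the subject line refers to can be sketched as follows. This is a simplified illustration, not PostgreSQL source; the real IsCheckpointOnSchedule() in the checkpointer code also handles details such as segment-number wraparound that are omitted here. The checkpointer only sleeps while its write progress stays ahead of both the elapsed-time target and the consumed-xlog target, so the burst of full-page-write WAL right after a checkpoint starts inflates the xlog fraction and keeps it from ever sleeping - which is the behaviour the test above shows.]

```python
def is_on_schedule(progress, elapsed_frac, xlog_frac, completion_target=0.9):
    """progress:     fraction of dirty buffers already written (0..1)
    elapsed_frac: elapsed time since checkpoint start / checkpoint_timeout
    xlog_frac:    WAL consumed since checkpoint start / checkpoint_segments capacity

    Returns True (checkpointer may sleep) only if progress is ahead of
    BOTH scaled targets; a fast-growing xlog_frac forces it to hurry.
    """
    time_target = elapsed_frac / completion_target
    xlog_target = xlog_frac / completion_target
    return progress >= time_target and progress >= xlog_target
```

With synchronous_commit=off and heavy full-page writes, xlog_frac races ahead early, so the second comparison fails and writes are front-loaded instead of spread over the interval - the non-smooth behaviour reported above.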
[HACKERS] RFC: Non-user-resettable SET SESSION AUTHORISATION
Hi all

For some time I've wanted a way to SET SESSION AUTHORISATION or SET ROLE in a way that cannot simply be RESET, so that a connection may be handed to a less-trusted service or application to do some work with. This is most useful for connection pools, where it's currently necessary to have a per-user pool, to trust users not to do anything naughty, or to filter the functions and commands they can run through a whitelist to stop them trying to escalate rights to the pooler user.

In the short term I'd like to:

* Add a WITH COOKIE option to SET SESSION AUTHORIZATION, which takes an app-generated code (much like PREPARE TRANSACTION does).

* Make DISCARD ALL, RESET SESSION AUTHORIZATION, etc, also take WITH COOKIE. If session authorization was originally set with a cookie value, the same cookie value must be passed or an ERROR will be raised when RESET is attempted.

* A second SET SESSION AUTHORIZATION without a prior RESET would be rejected with an ERROR if the first SET used a cookie value.

This can be done without protocol-level changes and with no backward compatibility impact to existing applications. Any objections?

These commands will appear in the logs if log_statement = 'all', but the codes are transient cookie values, not passwords. They'll be visible in pg_stat_activity but only to the privileged user. It'll probably be necessary to clear the last command string when executing SET SESSION AUTHORIZATION so the new user can't snoop the cookie value from a concurrent connection, but that should be simple enough.

As is currently the case, poolers will still have to use a superuser connection if they want to pool across users.

In the longer term I want to add a protocol-level equivalent that lets a session switch session authorization or role, for efficiency and log-spam reasons. I'm also interested in a way to allow SET SESSION AUTHORIZATION to a list of permitted roles when run as a non-superuser, for connection pool use.
SET ROLE might do, but it's more visible to apps, whereas SET SESSION AUTHORIZATION really makes the connection appear to become the target user. That's later though - first,

--
Craig Ringer    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
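[Aside for readers: the cookie semantics proposed above can be modelled in a few lines. This is a toy illustration only - no such PostgreSQL feature exists as of this thread, and the class and method names below are invented stand-ins for the proposed SET/RESET SESSION AUTHORIZATION ... WITH COOKIE commands.]

```python
class Session:
    """Models a connection whose authorization can be pinned by a cookie."""

    def __init__(self, auth_user):
        self.user = auth_user
        self._cookie = None          # None => authorization is not pinned

    def set_session_authorization(self, user, cookie=None):
        # A second SET without a prior RESET is rejected once pinned.
        if self._cookie is not None:
            raise PermissionError("session authorization is pinned by a cookie")
        self.user = user
        self._cookie = cookie

    def reset_session_authorization(self, original_user, cookie=None):
        # RESET (and, in the proposal, DISCARD ALL) must present the
        # same cookie that the SET supplied, or it fails with an ERROR.
        if self._cookie is not None and cookie != self._cookie:
            raise PermissionError("wrong or missing cookie")
        self.user = original_user
        self._cookie = None
```

A pooler would SET with a random cookie before handing the connection to a client; the client can run whatever it likes, but without the cookie it cannot RESET its way back to the pooler's privileges.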
Re: [HACKERS] Patch to improve a few appendStringInfo* calls
On 4/11/15 6:25 AM, David Rowley wrote:
I've attached a small patch which just fixes a few appendStringInfo* calls that are not quite doing things the way that it was intended.

done
Re: [HACKERS] Multi-xacts and our process problem
On 05/11/2015 04:04 PM, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: On Mon, May 11, 2015 at 03:42:26PM -0700, Joshua Drake wrote: What I am arguing is that the release cycle is at least a big part of the problem. We are trying to get so many new features that bugs are increasing and quality is decreasing. Now that is an interesting observation --- are we too focused on patches and features to realize when we need to seriously revisit an issue? I think there's nobody, or at least very few people, who are getting paid to find/fix bugs rather than write cool new features. This is problematic. It doesn't help when key committers are overwhelmed by trying to process other peoples' patches. (And no, I'm not sure that appoint more committers would improve matters. What we've got is too many barely-good-enough patches. Tweaking the process to let those into the tree faster will not result in better quality.) Exactly correct. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing I'm offended is basically telling the world you can't control your own emotions, so everyone else should do it for you. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixing busted citext function declarations
David E. Wheeler da...@justatheory.com writes:
On May 5, 2015, at 9:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
In http://www.postgresql.org/message-id/bn1pr04mb37467aa1d412223b3d4a595df...@bn1pr04mb374.namprd04.prod.outlook.com it's revealed that the citext extension misdeclares its versions of regexp_matches(): they should return SETOF text[] but they're marked as returning just text[].

I wanted to make sure my backport was fixed for this, but it turns out it was already fixed as of this commit: https://github.com/theory/citext/commit/99c925f Note that I credited you for the spot --- way back in October 2009! Pretty confused how the same change wasn't made to the core contrib module back then.

Me too. Something fell through the cracks rather badly there :-(. Would you check your commit history to see if anything else got missed?

regards, tom lane
Re: [HACKERS] Multi-xacts and our process problem
On 05/11/2015 04:18 PM, Simon Riggs wrote: On 11 May 2015 at 23:47, Bruce Momjian br...@momjian.us mailto:br...@momjian.us wrote: On Mon, May 11, 2015 at 03:42:26PM -0700, Joshua Drake wrote: The releases themselves are not the problem, but rather the volume of bugs and our slowness in getting additional people involved to remove data corruption bugs more quickly and systematically. Our reputation for reliability has been harmed by this inactivity. What I am arguing is that the release cycle is at least a big part of the problem. We are trying to get so many new features that bugs are increasing and quality is decreasing. Now that is an interesting observation --- are we too focused on patches and features to realize when we need to seriously revisit an issue? I think we are unused to bugs. We have a much lower bug rate than any other system. True we are used to having extremely high quality releases but if you look at the release notes since say 9.2, we are seeing a much larger increase in bug rates. It is true that generally speaking our bug rate is low in comparison to other databases. That said, I think we are also resting on some laurels here per my previous paragraph. I think we seriously need to review our policy of adding major new features and have them enabled by default with no parameter to disable them. In the early years of PostgreSQL everything had an off switch, e.g. stats, bgwriter and even autovacuum defaulted to off for many years. That's interesting although I am unsure of the cost of such a thing. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing I'm offended is basically telling the world you can't control your own emotions, so everyone else should do it for you. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
Hi,

Let me get back to the original concern and show the possible options we can take here. At least, the latest master is not usable for implementing custom-join logic without one of these options.

option-1) Revert create_plan_recurse() to a non-static function for extensions. It is the simplest solution; however, it is still a gray zone which functions shall be public and whether we deal with non-static functions as a stable API or not. IMO, we shouldn't treat non-static functions as stable APIs, even if they can be called by extensions and not only by code in another source file. In fact, we usually change the definitions of non-static functions even though we know extensions use them. It is the role of the extension to follow up the feature across major versions.

option-2) Tom's suggestion. Add a new list field of Path nodes on CustomPath, then create_customscan_plan() will call the static create_plan_recurse() function to construct the child plan nodes. Probably, the attached patch will be an image of this enhancement, but it is not tested yet, of course. Once we adopt this approach, I'll adjust my PG-Strom code towards the new interface within 2 weeks and report problems if any.

option-3) Force authors of custom-scan providers to copy and paste createplan.c. I really don't want this option and nobody will be happy.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

-----Original Message-----
From: Kaigai Kouhei(海外 浩平)
Sent: Monday, May 11, 2015 12:48 PM
To: 'Andres Freund'; Robert Haas
Cc: Tom Lane; Kohei KaiGai; Thom Brown; Shigeru Hanada; pgsql-hackers@postgreSQL.org
Subject: RE: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

On 2015-05-10 21:26:26 -0400, Robert Haas wrote: On Sun, May 10, 2015 at 8:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: This commit reverts create_plan_recurse() as static function. Yes.
I am not convinced that external callers should be calling that, and would prefer not to enlarge createplan.c's API footprint without a demonstration that this is right and useful. (This is one of many ways in which this patch is suffering from having gotten committed without submitted use-cases.) Wasn't there a submitted use case? IIRC Kaigai had referenced some pg-strom (?) code using it? I'm failing to see how create_plan_recurse() being exposed externally is related to having gotten committed without submitted use-cases. Even if submitted, presumably as simple as possible code, doesn't use it, that's not a proof that less simple code does not need it. Yes, PG-Strom code uses create_plan_recurse() to construct child plan node of the GPU accelerated custom-join logic, once it got chosen. Here is nothing special. It calls create_plan_recurse() as built-in join path doing on the underlying inner/outer paths. It is not difficult to submit as a working example, however, its total code size (excludes GPU code) is 25KL at this moment. I'm not certain whether it is a simple example. Your unwillingness to make functions global or to stick PGDLLIMPORT markings on variables that people want access to is hugely handicapping extension authors. Many people have complained about that on multiple occasions. Frankly, I find it obstructionist and petty. While I don't find the tone of the characterization super helpful, I do tend to agree that we're *far* too conservative on that end. I've now seen a significant number of extension that copied large swathes of code just to cope with individual functions not being available. And even cases where that lead to minor forks with such details changed. I may have to join the members? I know that I'm fearful of asking for functions being made public. Because it'll invariably get into a discussion of merits that's completely out of proportion with the size of the change. 
And if I, who has been on the list for a while now, am afraid in that way, you can be sure that others won't even dare to ask, let alone argue their way through.

I think the problem is that during development the default often is to create functions as static if they're used only in one file. Which is fine. But it really doesn't work if it's a larger battle to change individual instances. Besides the pain of having to wait for the next major release...

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

custom-join-children.patch
Description: custom-join-children.patch
Re: [HACKERS] Multi-xacts and our process problem
On 05/11/2015 02:15 PM, Bruce Momjian wrote: On Mon, May 11, 2015 at 02:11:48PM -0700, Joshua Drake wrote: Here are some options Slow down the release cycle The shortness of the release cycle puts a preference on adding features versus providing a more mature outcome. or Increase the release cycle Moving to a Ubuntu style release cycle would allow us to have a window to scratch the itch but with the eventual (and known) release of something that is LTS. The releases themselves are not the problem, but rather the volume of bugs and our slowness in getting additional people involved to remove data corruption bugs more quickly and systematically. Our reputation for reliability has been harmed by this inactivity. What I am arguing is that the release cycle is at least a big part of the problem. We are trying to get so many new features that bugs are increasing and quality is decreasing. If we change the release cycle it will encourage an increase in eyeballs on code we are developing because people aren't going to be in such a rush to get this feature done for this release. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing I'm offended is basically telling the world you can't control your own emotions, so everyone else should do it for you. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On further consideration ... I don't much like the division of labor between LockForeignRow and FetchForeignRow. In principle that would lead to not one but two extra remote accesses per locked row in SELECT FOR UPDATE, at least in the case that an EvalPlanQual recheck is required. (I see that in your prototype patch for postgres_fdw you attempt to avoid that by saving a retrieved tuple in LockForeignRow and then returning it in FetchForeignRow, but that seems extremely ugly and bug-prone, since there is nothing in the API spec insisting that those calls match up one-to-one.) The fact that locking and fetching a tuple are separate operations in the heapam API is a historical artifact that probably doesn't make sense to duplicate in the FDW API.

Another problem is that we fail to detect whether an EvalPlanQual recheck is required during a SELECT FOR UPDATE on a remote table, which we certainly ought to do if the objective is to make postgres_fdw semantics match the local ones.

What I'm inclined to do is merge LockForeignRow and FetchForeignRow into one operation, which would have the semantics of returning a palloc'd tuple, or NULL on a SKIP LOCKED failure, and it would be expected to acquire a lock according to erm->markType (in particular, no lock just a re-fetch for ROW_MARK_REFERENCE). In addition it needs some way of reporting that the returned row is an updated version rather than the original. Probably just an extra output boolean would do for that.

This'd require some refactoring in nodeLockRows, because we'd want to be able to hang onto the returned tuple without necessarily provoking an EvalPlanQual cycle, but it doesn't look too bad.

regards, tom lane
Re: [HACKERS] Multixid hindsight design
Heikki Linnakangas hlinn...@iki.fi writes: So the lesson here is that having a permanent pg_multixact is not nice, and we should get rid of it. Here's how to do that: That would be cool, but ... Looking at the tuple header, the CID and CTID fields are only needed, when either xmin or xmax is running. Almost: in a HOT-updated tuple, CTID is required even after xmax has committed, but since it's a HOT update, the new tuple is always on the same page so you only need the offsetnumber part. I think this is totally wrong. HOT update or not, you need the forward link represented by ctid not just until xmin/xmax commit, but until they're in the past according to all active snapshots. That's so that other transactions can chain forward to the current version of a tuple which they found according to their snapshots. It might be you can salvage the idea anyway, since it's still true that the forward links wouldn't be needed after a crash and restart. But the argument as stated is wrong. (There's also the question of forensic requirements, although I'm aware that it's barely worth bringing that up since nobody else here seems to put much weight on that.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Multi-xacts and our process problem
Bruce Momjian br...@momjian.us writes: On Mon, May 11, 2015 at 03:42:26PM -0700, Joshua Drake wrote: What I am arguing is that the release cycle is at least a big part of the problem. We are trying to get so many new features that bugs are increasing and quality is decreasing. Now that is an interesting observation --- are we too focused on patches and features to realize when we need to seriously revisit an issue? I think there's nobody, or at least very few people, who are getting paid to find/fix bugs rather than write cool new features. This is problematic. It doesn't help when key committers are overwhelmed by trying to process other peoples' patches. (And no, I'm not sure that appoint more committers would improve matters. What we've got is too many barely-good-enough patches. Tweaking the process to let those into the tree faster will not result in better quality.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Multi-xacts and our process problem
On 11 May 2015 at 23:47, Bruce Momjian br...@momjian.us wrote: On Mon, May 11, 2015 at 03:42:26PM -0700, Joshua Drake wrote: The releases themselves are not the problem, but rather the volume of bugs and our slowness in getting additional people involved to remove data corruption bugs more quickly and systematically. Our reputation for reliability has been harmed by this inactivity. What I am arguing is that the release cycle is at least a big part of the problem. We are trying to get so many new features that bugs are increasing and quality is decreasing. Now that is an interesting observation --- are we too focused on patches and features to realize when we need to seriously revisit an issue? I think we are unused to bugs. We have a much lower bug rate than any other system. I think we seriously need to review our policy of adding major new features and have them enabled by default with no parameter to disable them. In the early years of PostgreSQL everything had an off switch, e.g. stats, bgwriter and even autovacuum defaulted to off for many years. -- Simon Riggshttp://www.2ndQuadrant.com/ http://www.2ndquadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] deparsing utility commands
David Steele wrote: I have reviewed and tested this patch and everything looks good to me. It also looks like you added better coverage for schema DDL, which is a welcome addition. Thanks -- I have pushed this now. My hope is to get this test module extended quite a bit, not only to cover existing commands, but also so that it causes future changes to cause failure unless command collection is considered. (In a previous version of this patch, there was a test mode that ran everything in the serial_schedule of regular regression tests. That was IMV a good way to ensure that new commands were also tested to run under command collection. That hasn't been enabled in the new test module, and I think it's necessary.) If anyone wants to contribute to the test module so that more is covered, that would be much appreciated. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Multi-xacts and our process problem
On Mon, May 11, 2015 at 03:42:26PM -0700, Joshua Drake wrote: The releases themselves are not the problem, but rather the volume of bugs and our slowness in getting additional people involved to remove data corruption bugs more quickly and systematically. Our reputation for reliability has been harmed by this inactivity. What I am arguing is that the release cycle is at least a big part of the problem. We are trying to get so many new features that bugs are increasing and quality is decreasing. Now that is an interesting observation --- are we too focused on patches and features to realize when we need to seriously revisit an issue? -- Bruce Momjian br...@momjian.us    http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Fixing busted citext function declarations
Tom, On May 5, 2015, at 9:40 AM, Tom Lane t...@sss.pgh.pa.us wrote: In http://www.postgresql.org/message-id/bn1pr04mb37467aa1d412223b3d4a595df...@bn1pr04mb374.namprd04.prod.outlook.com it's revealed that the citext extension misdeclares its versions of regexp_matches(): they should return SETOF text[] but they're marked as returning just text[]. I wanted to make sure my backport was fixed for this, but it turns out it was already fixed as of this commit: https://github.com/theory/citext/commit/99c925f Note that I credited you for the spot --- way back in October 2009! Pretty confused how the same change wasn’t made to the core contrib module back then. Best, David
Re: [HACKERS] Multi-xacts and our process problem
On 2015-05-11 19:04:32 -0400, Tom Lane wrote: I think there's nobody, or at least very few people, who are getting paid to find/fix bugs rather than write cool new features. This is problematic. It doesn't help when key committers are overwhelmed by trying to process other peoples' patches. (And no, I'm not sure that appointing more committers would improve matters. What we've got is too many barely-good-enough patches. Tweaking the process to let those into the tree faster will not result in better quality.) +many Except perhaps that I'd expand find/fix bugs to include review and integrate patches. Because I think few people are paid to do that either. I now partially am (which obviously isn't sufficient). There's no way it's possible to e.g. work on integrating something like upsert in a reasonable timeframe otherwise. The lack of paid time to integrate stuff properly also leads to part of the quality problem, besides delaying stuff. Andres
[HACKERS] Leftovers after dcae5fac (improve speed of make check-world) in git tree with check-world
Hi all, Since dcae5fac, we use a central path ($GIT_ROOT/tmp_install) to store the installation deliverables during tests. Each module uses a .gitignore with more or less those entries: /log/ /results/ /tmp_check/ First note that /log/ is not used anymore, replaced by /tmp_check/log/ so we could just remove it. Also, as we do not ignore regression.diffs and regression.out on purpose to check easily what has failed, what about removing /tmp_check/ as well in those .gitignore files? It now contains pgdata/ and log/, so it looks useful to me to show it with git status to track it in case of failure as well. Thoughts? Attached is a patch with all those files updated. -- Michael diff --git a/contrib/btree_gin/.gitignore b/contrib/btree_gin/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/btree_gin/.gitignore +++ b/contrib/btree_gin/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/btree_gist/.gitignore b/contrib/btree_gist/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/btree_gist/.gitignore +++ b/contrib/btree_gist/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/citext/.gitignore b/contrib/citext/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/citext/.gitignore +++ b/contrib/citext/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/cube/.gitignore b/contrib/cube/.gitignore index cb4c989..a6484a0 100644 --- a/contrib/cube/.gitignore +++ b/contrib/cube/.gitignore @@ -1,6 +1,4 @@ /cubeparse.c /cubescan.c # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/dblink/.gitignore b/contrib/dblink/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/dblink/.gitignore +++ b/contrib/dblink/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/dict_int/.gitignore b/contrib/dict_int/.gitignore index 5dcb3ff..19b6c5b 100644 
--- a/contrib/dict_int/.gitignore +++ b/contrib/dict_int/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/dict_xsyn/.gitignore b/contrib/dict_xsyn/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/dict_xsyn/.gitignore +++ b/contrib/dict_xsyn/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/earthdistance/.gitignore b/contrib/earthdistance/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/earthdistance/.gitignore +++ b/contrib/earthdistance/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/file_fdw/.gitignore b/contrib/file_fdw/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/file_fdw/.gitignore +++ b/contrib/file_fdw/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/hstore/.gitignore b/contrib/hstore/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/hstore/.gitignore +++ b/contrib/hstore/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/hstore_plperl/.gitignore b/contrib/hstore_plperl/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/hstore_plperl/.gitignore +++ b/contrib/hstore_plperl/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/hstore_plpython/.gitignore b/contrib/hstore_plpython/.gitignore index ce6fab9..351dd41 100644 --- a/contrib/hstore_plpython/.gitignore +++ b/contrib/hstore_plpython/.gitignore @@ -1,6 +1,4 @@ # Generated subdirectories /expected/python3/ -/log/ /results/ /sql/python3/ -/tmp_check/ diff --git a/contrib/intarray/.gitignore b/contrib/intarray/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/intarray/.gitignore +++ b/contrib/intarray/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/ltree/.gitignore b/contrib/ltree/.gitignore 
index 5dcb3ff..19b6c5b 100644 --- a/contrib/ltree/.gitignore +++ b/contrib/ltree/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/ltree_plpython/.gitignore b/contrib/ltree_plpython/.gitignore index ce6fab9..351dd41 100644 --- a/contrib/ltree_plpython/.gitignore +++ b/contrib/ltree_plpython/.gitignore @@ -1,6 +1,4 @@ # Generated subdirectories /expected/python3/ -/log/ /results/ /sql/python3/ -/tmp_check/ diff --git a/contrib/pg_trgm/.gitignore b/contrib/pg_trgm/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/pg_trgm/.gitignore +++ b/contrib/pg_trgm/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git a/contrib/pgcrypto/.gitignore b/contrib/pgcrypto/.gitignore index 5dcb3ff..19b6c5b 100644 --- a/contrib/pgcrypto/.gitignore +++ b/contrib/pgcrypto/.gitignore @@ -1,4 +1,2 @@ # Generated subdirectories -/log/ /results/ -/tmp_check/ diff --git
Re: [HACKERS] Multi-xacts and our process problem
On Mon, May 11, 2015 at 7:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: I think there's nobody, or at least very few people, who are getting paid to find/fix bugs rather than write cool new features. This is problematic. It doesn't help when key committers are overwhelmed by trying to process other peoples' patches. (And no, I'm not sure that appointing more committers would improve matters. What we've got is too many barely-good-enough patches. Tweaking the process to let those into the tree faster will not result in better quality.) I agree, although generally I think committers are responsible for fixing what they commit, and I've certainly dropped everything a few times to do so. And people who will someday become committers are generally the sorts of people who do that, too. Perhaps we've relied overmuch on that in some cases - e.g. I really haven't paid much attention to the multixact stuff until lately, because I assumed that it was Alvaro's problem. And maybe that's not right. But I know that when a serious bug is found in something I committed, I expect that if anyone else fixes it, that's a bonus. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Minor ON CONFLICT related fixes
Hi, Don't have time to go through this in depth. On 2015-05-11 18:53:22 -0700, Peter Geoghegan wrote: Note that the patch proposes to de-support CREATE RULE with an alternative action involving ON CONFLICT DO UPDATE. Such a rule seems particularly questionable to me, and I'm pretty sure it was understood that only ON CONFLICT DO NOTHING should be supported as an action for a rule (recall that INSERT statements with ON CONFLICT can, in general, never target relations with rules). At least, I believe Heikki said that. Deparsing with an inference clause is now correctly supported. However, user-defined rules with ON CONFLICT DO UPDATE are now formally disallowed/unsupported. It seemed there would be significant complexity involved in making this work correctly with the EXCLUDED.* pseudo-relation, which was not deemed worthwhile. Such a user-defined rule seems very questionable anyway. Do I understand correctly that you cut this out because there essentially was a ruleutils bug with EXCLUDED? If so, I don't find that acceptable. I'm pretty strictly convinced that independent of rule support, we shouldn't add things to insert queries that get_query_def can't deparse. Greetings, Andres Freund
Re: [HACKERS] Minor ON CONFLICT related fixes
On 2015-05-11 19:33:02 -0700, Peter Geoghegan wrote: On Mon, May 11, 2015 at 7:22 PM, Peter Geoghegan p...@heroku.com wrote: You just get an error within the optimizer following rewriting of a query involving the application of a rule that specifies an ON CONFLICT DO UPDATE alternative/DO INSTEAD action. I found it would say: variable not found in subplan target lists. BTW, that error was only raised when the EXCLUDED.* pseudo-alias was actually used, which is why our previous testing missed it. You should try to understand why it's failing. Just prohibiting the rules, without understanding what's actually going on, could very well hide a real bug. Greetings, Andres Freund
Re: [HACKERS] Minor ON CONFLICT related fixes
On Mon, May 11, 2015 at 7:34 PM, Andres Freund and...@anarazel.de wrote: You should try to understand why it's failing. Just prohibiting the rules, without understanding what's actually going on, could very well hide a real bug. It's not as if I have no idea. ReplaceVarsFromTargetList() is probably quite confused by all this, because the passed nomatch_varno argument is often rt_index -- but what about EXCLUDED.*? adjustJoinTreeList() does not know anything about EXCLUDED.* either. I see little practical reason to make the rewriter do any better. When I debugged the problem of the optimizer raising that target lists error with a rule with an action with EXCLUDED.* (within setrefs.c's fix_join_expr_mutator()), it looked like an off-by-one issue here:

    /* If it's for acceptable_rel, adjust and return it */
    if (var->varno == context->acceptable_rel)
    {
        var = copyVar(var);
        var->varno += context->rtoffset;
        if (var->varnoold > 0)
            var->varnoold += context->rtoffset;
        return (Node *) var;
    }

-- Peter Geoghegan
[HACKERS] Minor ON CONFLICT related fixes
Attached patch fixes some issues with ON CONFLICT DO UPDATE/DO NOTHING. There is a commit message which explains the changes at a high level. The only real bug fix is around deparsing by ruleutils.c. Note that the patch proposes to de-support CREATE RULE with an alternative action involving ON CONFLICT DO UPDATE. Such a rule seems particularly questionable to me, and I'm pretty sure it was understood that only ON CONFLICT DO NOTHING should be supported as an action for a rule (recall that INSERT statements with ON CONFLICT can, in general, never target relations with rules). At least, I believe Heikki said that. Thoughts? -- Peter Geoghegan From 349a0a1fee6d86de8c5cc4120474ddc48aeb43e0 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan peter.geoghega...@gmail.com Date: Mon, 11 May 2015 15:37:54 -0700 Subject: [PATCH] Fixes to a variety of minor ON CONFLICT issues Deparsing with an inference clause is now correctly supported. However, user-defined rules with ON CONFLICT DO UPDATE are now formally disallowed/unsupported. It seemed there would be significant complexity involved in making this work correctly with the EXCLUDED.* pseudo-relation, which was not deemed worthwhile. Such a user-defined rule seems very questionable anyway. In passing, re-factor the InferenceElem representation of opclass, to defer opfamily lookup for Relation index matching until index inference proper (i.e., within the optimizer). This is done for the convenience of the new ruleutils.c code, but independently makes sense. Finally, fix a few typos, and rename a variable -- candidates seemed like a misnomer for the return value of infer_arbiter_indexes().
--- contrib/pg_stat_statements/pg_stat_statements.c | 3 +- doc/src/sgml/ref/create_rule.sgml | 5 ++ src/backend/executor/nodeModifyTable.c | 2 +- src/backend/nodes/copyfuncs.c | 3 +- src/backend/nodes/equalfuncs.c | 3 +- src/backend/nodes/outfuncs.c| 3 +- src/backend/nodes/readfuncs.c | 3 +- src/backend/optimizer/plan/setrefs.c| 3 +- src/backend/optimizer/util/plancat.c| 34 +++ src/backend/parser/parse_clause.c | 16 ++ src/backend/parser/parse_utilcmd.c | 5 ++ src/backend/utils/adt/ruleutils.c | 75 - src/include/nodes/primnodes.h | 3 +- src/test/regress/expected/insert_conflict.out | 2 +- src/test/regress/expected/rules.out | 67 -- src/test/regress/sql/rules.sql | 32 +++ 16 files changed, 177 insertions(+), 82 deletions(-) diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index 6abe3f0..f97cc2c 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -2637,8 +2637,7 @@ JumbleExpr(pgssJumbleState *jstate, Node *node) InferenceElem *ie = (InferenceElem *) node; APP_JUMB(ie->infercollid); -APP_JUMB(ie->inferopfamily); -APP_JUMB(ie->inferopcinputtype); +APP_JUMB(ie->inferopclass); JumbleExpr(jstate, ie->expr); } break; diff --git a/doc/src/sgml/ref/create_rule.sgml b/doc/src/sgml/ref/create_rule.sgml index 53fdf56..46a8a7c 100644 --- a/doc/src/sgml/ref/create_rule.sgml +++ b/doc/src/sgml/ref/create_rule.sgml @@ -281,6 +281,11 @@ UPDATE mytable SET name = 'foo' WHERE id = 42; match the condition literalid = 42/literal. This is an implementation restriction that might be fixed in future releases. /para + para + Presently, a rule alternative action cannot contain literalON + CONFLICT DO UPDATE/literal. This is an implementation + restriction that might be fixed in future releases. 
+ /para /refsect1 refsect1 diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 34435c7..55a1cc7 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -414,7 +414,7 @@ ExecInsert(ModifyTableState *mtstate, estate, true, &specConflict, arbiterIndexes); - /* adjust the tuple's state accordingly */ + /* adjust the tuple's state */ if (!specConflict) heap_finish_speculative(resultRelationDesc, tuple); else diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 76b63af..957c2bc 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -1803,8 +1803,7 @@ _copyInferenceElem(const InferenceElem *from) COPY_NODE_FIELD(expr); COPY_SCALAR_FIELD(infercollid); - COPY_SCALAR_FIELD(inferopfamily); - COPY_SCALAR_FIELD(inferopcinputtype); + COPY_SCALAR_FIELD(inferopclass); return newnode; } diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index e032142..f26626e 100644 --- a/src/backend/nodes/equalfuncs.c +++
Re: [HACKERS] Minor ON CONFLICT related fixes
On Mon, May 11, 2015 at 6:53 PM, Peter Geoghegan p...@heroku.com wrote: Attached patch fixes some issues with ON CONFLICT DO UPDATE/DO NOTHING. There is a commit message which explains the changes at a high level. I just realized that there is a silly bug here. Attached is a fix that applies on top of the original. -- Peter Geoghegan From 1b32558d188762eb5c7214ea5ae042897e7d004f Mon Sep 17 00:00:00 2001 From: Peter Geoghegan peter.geoghega...@gmail.com Date: Mon, 11 May 2015 19:02:12 -0700 Subject: [PATCH 2/2] Add closing bracket --- src/backend/utils/adt/ruleutils.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 355acc9..aa21693 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -7768,6 +7768,9 @@ get_rule_expr(Node *node, deparse_context *context, get_rule_expr((Node *) iexpr->expr, context, showimplicit); +if (need_parens) + appendStringInfoChar(buf, ')'); + context->varprefix = varprefix; if (iexpr->infercollid) @@ -7782,8 +7785,6 @@ get_rule_expr(Node *node, deparse_context *context, get_opclass_name(inferopclass, inferopcinputtype, buf); } -if (need_parens) - appendStringInfoChar(buf, ')'); } break; -- 1.9.1
Re: [HACKERS] Minor ON CONFLICT related fixes
On Mon, May 11, 2015 at 7:11 PM, Andres Freund and...@anarazel.de wrote: Do I understand correctly that you cut this out because there essentially was a ruleutils bug with EXCLUDED? If so, I don't find that acceptable. I'm pretty strictly convinced that independent of rule support, we shouldn't add things to insert queries that get_query_def can't deparse. No, that wasn't the reason -- deparsing itself works fine. That code remains within ruleutils.c because I agree with this principle of yours. I tested the deparsing logic before making the added case fail as unsupported. If you remove the new ereport() call that relates to non-support of ON CONFLICT DO UPDATE as a rule action, then the CREATE RULE still succeeds, and you can deparse the query just fine (by querying pg_rules, typically). You just get an error within the optimizer following rewriting of a query involving the application of a rule that specifies an ON CONFLICT DO UPDATE alternative/DO INSTEAD action. I found it would say: variable not found in subplan target lists. I can't say that I explored the question very thoroughly, but it seems bothersome to have more than one relation involved in a Query during rewriting. -- Peter Geoghegan
Re: [HACKERS] Minor ON CONFLICT related fixes
On Mon, May 11, 2015 at 7:22 PM, Peter Geoghegan p...@heroku.com wrote: You just get an error within the optimizer following rewriting of a query involving the application of a rule that specifies an ON CONFLICT DO UPDATE alternative/DO INSTEAD action. I found it would say: variable not found in subplan target lists. BTW, that error was only raised when the EXCLUDED.* pseudo-alias was actually used, which is why our previous testing missed it. -- Peter Geoghegan
Re: [HACKERS] LOCK TABLE Permissions
On Tue, May 12, 2015 at 4:53 AM, Stephen Frost wrote: Yeah, it might not be bad to have tests for all the different lock types and make sure that the permissions match up. I'd probably put those tests into 'permissions.sql' instead though. You mean privileges.sql, right? I just wrote a patch with that. I'll create a new thread with the patch. -- Michael
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/05/12 7:42, Tom Lane wrote: On further consideration ... Thanks for the consideration! I don't much like the division of labor between LockForeignRow and FetchForeignRow. In principle that would lead to not one but two extra remote accesses per locked row in SELECT FOR UPDATE, at least in the case that an EvalPlanQual recheck is required. (I see that in your prototype patch for postgres_fdw you attempt to avoid that by saving a retrieved tuple in LockForeignRow and then returning it in FetchForeignRow, but that seems extremely ugly and bug-prone, since there is nothing in the API spec insisting that those calls match up one-to-one.) The fact that locking and fetching a tuple are separate operations in the heapam API is a historical artifact that probably doesn't make sense to duplicate in the FDW API. I understand your concern about the postgres_fdw patch. However, I think we should divide this into the two routines as the core patch does, because I think that would allow an FDW author to implement these routines so as to improve the efficiency for scenarios that seldom need fetching, by not retrieving and saving locked tuples in LockForeignRow. Another problem is that we fail to detect whether an EvalPlanQual recheck is required during a SELECT FOR UPDATE on a remote table, which we certainly ought to do if the objective is to make postgres_fdw semantics match the local ones. I think that is interesting in theory, but I'm not sure that is worth much in practice. Best regards, Etsuro Fujita
[HACKERS] Improving regression tests to check LOCK TABLE and table permissions
Hi all, As mentioned in this thread, it would be good to have regression tests to test the interactions with permissions and LOCK TABLE: http://www.postgresql.org/message-id/20150511195335.ge30...@tamriel.snowman.net Attached is a patch achieving that. Note that it does some checks on the modes ACCESS SHARE, ROW EXCLUSIVE and ACCESS EXCLUSIVE to check all the code paths of LockTableAclCheck@lockcmds.c. I'll add an entry in the next CF to keep track of it. Regards, -- Michael diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out index 64a9330..c0cd9fa 100644 --- a/src/test/regress/expected/privileges.out +++ b/src/test/regress/expected/privileges.out @@ -1569,3 +1569,86 @@ DROP USER regressuser4; DROP USER regressuser5; DROP USER regressuser6; ERROR: role regressuser6 does not exist +-- permissions with LOCK TABLE +CREATE USER locktable_user; +CREATE TABLE lock_table (a int); +-- LOCK TABLE and SELECT permission +GRANT SELECT ON lock_table TO locktable_user; +SET SESSION AUTHORIZATION locktable_user; +BEGIN; +LOCK TABLE lock_table IN ROW EXCLUSIVE MODE; -- should fail +ERROR: permission denied for relation lock_table +ROLLBACK; +BEGIN; +LOCK TABLE lock_table IN ACCESS SHARE MODE; -- should pass +COMMIT; +BEGIN; +LOCK TABLE lock_table IN ACCESS EXCLUSIVE MODE; -- should fail +ERROR: permission denied for relation lock_table +ROLLBACK; +\c +REVOKE SELECT ON lock_table FROM locktable_user; +-- LOCK TABLE and INSERT permission +GRANT INSERT ON lock_table TO locktable_user; +SET SESSION AUTHORIZATION locktable_user; +BEGIN; +LOCK TABLE lock_table IN ROW EXCLUSIVE MODE; -- should pass +COMMIT; +BEGIN; +LOCK TABLE lock_table IN ACCESS SHARE MODE; -- should fail +ERROR: permission denied for relation lock_table +ROLLBACK; +BEGIN; +LOCK TABLE lock_table IN ACCESS EXCLUSIVE MODE; -- should fail +ERROR: permission denied for relation lock_table +ROLLBACK; +\c +REVOKE INSERT ON lock_table FROM locktable_user; +-- LOCK TABLE and UPDATE 
permission +GRANT UPDATE ON lock_table TO locktable_user; +SET SESSION AUTHORIZATION locktable_user; +BEGIN; +LOCK TABLE lock_table IN ROW EXCLUSIVE MODE; -- should pass +COMMIT; +BEGIN; +LOCK TABLE lock_table IN ACCESS SHARE MODE; -- should fail +ERROR: permission denied for relation lock_table +ROLLBACK; +BEGIN; +LOCK TABLE lock_table IN ACCESS EXCLUSIVE MODE; -- should pass +COMMIT; +\c +REVOKE UPDATE ON lock_table FROM locktable_user; +-- LOCK TABLE and DELETE permission +GRANT DELETE ON lock_table TO locktable_user; +SET SESSION AUTHORIZATION locktable_user; +BEGIN; +LOCK TABLE lock_table IN ROW EXCLUSIVE MODE; -- should pass +COMMIT; +BEGIN; +LOCK TABLE lock_table IN ACCESS SHARE MODE; -- should fail +ERROR: permission denied for relation lock_table +ROLLBACK; +BEGIN; +LOCK TABLE lock_table IN ACCESS EXCLUSIVE MODE; -- should pass +COMMIT; +\c +REVOKE DELETE ON lock_table FROM locktable_user; +-- LOCK TABLE and TRUNCATE permission +GRANT TRUNCATE ON lock_table TO locktable_user; +SET SESSION AUTHORIZATION locktable_user; +BEGIN; +LOCK TABLE lock_table IN ROW EXCLUSIVE MODE; -- should pass +COMMIT; +BEGIN; +LOCK TABLE lock_table IN ACCESS SHARE MODE; -- should fail +ERROR: permission denied for relation lock_table +ROLLBACK; +BEGIN; +LOCK TABLE lock_table IN ACCESS EXCLUSIVE MODE; -- should pass +COMMIT; +\c +REVOKE TRUNCATE ON lock_table FROM locktable_user; +-- clean up +DROP TABLE lock_table; +DROP USER locktable_user; diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql index 22b54a2..c1837c4 100644 --- a/src/test/regress/sql/privileges.sql +++ b/src/test/regress/sql/privileges.sql @@ -975,3 +975,87 @@ DROP USER regressuser3; DROP USER regressuser4; DROP USER regressuser5; DROP USER regressuser6; + + +-- permissions with LOCK TABLE +CREATE USER locktable_user; +CREATE TABLE lock_table (a int); + +-- LOCK TABLE and SELECT permission +GRANT SELECT ON lock_table TO locktable_user; +SET SESSION AUTHORIZATION locktable_user; 
+BEGIN; +LOCK TABLE lock_table IN ROW EXCLUSIVE MODE; -- should fail +ROLLBACK; +BEGIN; +LOCK TABLE lock_table IN ACCESS SHARE MODE; -- should pass +COMMIT; +BEGIN; +LOCK TABLE lock_table IN ACCESS EXCLUSIVE MODE; -- should fail +ROLLBACK; +\c +REVOKE SELECT ON lock_table FROM locktable_user; + +-- LOCK TABLE and INSERT permission +GRANT INSERT ON lock_table TO locktable_user; +SET SESSION AUTHORIZATION locktable_user; +BEGIN; +LOCK TABLE lock_table IN ROW EXCLUSIVE MODE; -- should pass +COMMIT; +BEGIN; +LOCK TABLE lock_table IN ACCESS SHARE MODE; -- should fail +ROLLBACK; +BEGIN; +LOCK TABLE lock_table IN ACCESS EXCLUSIVE MODE; -- should fail +ROLLBACK; +\c +REVOKE INSERT ON lock_table FROM locktable_user; + +-- LOCK TABLE and UPDATE permission +GRANT UPDATE ON lock_table TO locktable_user; +SET SESSION AUTHORIZATION locktable_user; +BEGIN; +LOCK TABLE lock_table IN ROW EXCLUSIVE MODE; -- should pass +COMMIT; +BEGIN; +LOCK TABLE lock_table IN ACCESS SHARE MODE; --
Re: [HACKERS] pg_basebackup vs. Windows and tablespaces
On Tue, May 12, 2015 at 5:50 AM, Andrew Dunstan and...@dunslane.net wrote: On 05/11/2015 02:02 AM, Amit Kapila wrote: On Sun, May 10, 2015 at 6:01 AM, Andrew Dunstan and...@dunslane.net mailto:and...@dunslane.net wrote: This generally looks good, but I have a couple of questions before I commit it. First, why is the new option for the BASE_BACKUP command of the Streaming Replication protocol TAR? It seems rather misleading. Shouldn't it be something like TABLESPACEMAP? The reason to keep the new option's name as TAR was that tablespace_map was generated for that format type, but I agree with you that something like TABLESPACEMAP suits better, so I have changed it to TABLESPACE_MAP. Putting '_' in the name makes it somewhat consistent with other names and with the filename it generates with this new option. Second, these lines in xlog.c seem wrong: else if ((ch == '\n' || ch == '\r') && prev_ch == '\\') str[i-1] = '\n'; It looks to me like we should be putting ch in the string, not arbitrarily transforming \r into \n. You are right, I have changed it as per your suggestion. OK, I have cleaned this up a bit - I had already started so I didn't take your latest patch but instead applied relevant changes to my changeset. Here is my latest version. In testing I notice that now pg_basebackup -F t leaves it completely up to the user on all platforms to create the relevant links in pg_tblspc/. It includes the tablespace_map file in base.tar, but that's really just informational. Sorry, but I am not able to follow your point. Users don't need to create the relevant links; they will get created during the first startup (during recovery) from the backup. I have tested this and it works both on Windows and Linux. Refer to the below code of the patch in xlog.c + + /* read the tablespace_map file if present and create symlinks. */ + if (read_tablespace_map(&tablespaces)) + { .. I think we need to add something to the pg_basebackup docs about that, at the very least (and it will also need to be a release note item.) 
Currently, the below lines in the patch suggest that this file is required for recovery. Do you expect more information to be added? +These files are not merely for your information; their presence and +contents are critical to the proper operation of the system's recovery +process. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_upgrade: quote directory names in delete_old_cluster script
On Wed, Apr 29, 2015 at 10:52:45PM -0400, Bruce Momjian wrote: On Mon, Feb 16, 2015 at 05:03:45PM -0500, Bruce Momjian wrote: All of our makefiles use single quotes, so effectively the only character we don't support in directory paths is the single quote itself. This seems to say that Windows batch files don't have any special meaning for single quotes: http://stackoverflow.com/questions/24173825/what-does-single-quote-do-in-windows-batch-files http://stackoverflow.com/questions/10737283/single-quotes-and-double-quotes-how-to-have-the-same-behaviour-in-unix-and-wind Again, is it worth doing something platform-specific? I can certainly define a platform-specific macro for " or ' and use that. Good idea? I have developed the attached patch to use platform-specific quoting of path names. Applied. -- Bruce Momjian br...@momjian.us    http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] multixacts woes
On 05/11/2015 09:54 AM, Robert Haas wrote: OK, I have made this change. Barring further trouble reports, this completes the multixact work I plan to do for the next release. Here is what is outstanding: 1. We might want to introduce a GUC to control the point at which member offset utilization begins clamping autovacuum_multixact_freeze_max_age. It doesn't seem wise to do anything about this before pushing a minor release out. It's not entirely trivial, and it may be helpful to learn more about how the changes already made work out in practice before proceeding. Also, we might not back-patch this anyway. -1 on back-patching a new GUC. People don't know what to do with the existing multixact GUCs, and without an age(multixact) function built-in, any adjustments a user tries to make are likely to do more harm than good. In terms of adding a new GUC in 9.5: can't we take a stab at auto-tuning this instead of adding a new GUC? We already have a bunch of freezing GUCs which fewer than 1% of our user base has any idea how to set. 2. The recent changes adjust things - for good reason - so that the safe threshold for multixact member creation is advanced only at checkpoint time. This means it's theoretically possible to have a situation where autovacuum has done all it can, but because no checkpoint has happened yet, the user can't create any more multixacts. Thanks to some good work by Thomas, autovacuum will realize this and avoid spinning uselessly over every table in the system, which is good, but you're still stuck with errors until the next checkpoint. Essentially, we're hoping that autovacuum will clean things up far enough in advance of hitting the threshold where we have to throw an error that a checkpoint will intervene before the error starts happening. It's possible we could improve this further, but I think it would be unwise to mess with it right now. It may be that there is no real-world problem here. 
Given that our longest possible checkpoint timeout is an hour, is it even hypothetically possible that we would hit a limit in that time? How many mxact members are we talking about? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] multixacts woes
Josh Berkus wrote: In terms of adding a new GUC in 9.5: can't we take a stab at auto-tuning this instead of adding a new GUC? We already have a bunch of freezing GUCs which fewer than 1% of our user base has any idea how to set. If you have development resources to pour onto 9.5, I think it would be better spent changing multixact usage tracking so that oldestOffset is included in pg_control; also make pg_multixact truncation be WAL-logged. With those changes, the need for a lot of pretty complicated code would go away. The fact that truncation is done by both vacuum and checkpoint causes a lot of the mess we were in (and from which Robert and Thomas took us --- thanks guys!). Such a change is the first step towards auto-tuning, I think. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] multixacts woes
Robert Haas wrote: OK, I have made this change. Barring further trouble reports, this completes the multixact work I plan to do for the next release. Many thanks for all the effort here -- much appreciated. 2. The recent changes adjust things - for good reason - so that the safe threshold for multixact member creation is advanced only at checkpoint time. This means it's theoretically possible to have a situation where autovacuum has done all it can, but because no checkpoint has happened yet, the user can't create any more multixacts. Thanks to some good work by Thomas, autovacuum will realize this and avoid spinning uselessly over every table in the system, which is good, but you're still stuck with errors until the next checkpoint. Essentially, we're hoping that autovacuum will clean things up far enough in advance of hitting the threshold where we have to throw an error that a checkpoint will intervene before the error starts happening. It's possible we could improve this further, but I think it would be unwise to mess with it right now. It may be that there is no real-world problem here. See my response to Josh. I think much of the current rube-goldbergian design is due to the fact that pg_control cannot be changed in back branches. Going forward, I think a better plan is to include more info in pg_control, WAL-log more operations, remove checkpoint from the loop and have everything happen at vacuum time. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Use outerPlanState() consistently in executor code
On Mon, May 4, 2015 at 1:23 PM, Robert Haas robertmh...@gmail.com wrote: I fixed several whitespace errors, reverted the permissions changes you included Sorry about the permission changes - didn't notice that bite. Thanks, Qingqing
[HACKERS] Re: [BUGS] BUG #13148: Unexpected deferred EXCLUDE constraint violation on derived table
Hi, On 2015-05-10 16:01:53 -0400, Tom Lane wrote: The cause of the problem seems to be that the UPDATE performs a HOT update of the new tuple, leaving in this case a dead tuple at (0,2) that is HOT updated by (0,3). When unique_key_recheck() is invoked for (0,2), it believes, correctly, that it has to perform the recheck anyway ... but it tells check_exclusion_constraint that the check is being performed for (0,2). So the index search inside check_exclusion_constraint finds the live tuple at (0,3) and thinks that is a conflict. Heh, it's curious that this wasn't found up until now. I also wonder if this might be related to the spurious violations Peter G. has been observing... The attached patch seems to fix the problem without breaking any existing regression tests, but I wonder if anyone can see a hole in it. Looks good to me. Greetings, Andres Freund
Re: [HACKERS] multixacts woes
On Mon, May 11, 2015 at 10:11 AM, Noah Misch n...@leadboat.com wrote: On Mon, May 11, 2015 at 08:29:05AM -0400, Robert Haas wrote: Given your concerns, and the need to get a fix for this out the door quickly, what I'm inclined to do for the present is go bump the threshold from 25% of MaxMultiXact to 50% of MaxMultiXact without changing anything else. +1 Your analysis shows that this is more in line with the existing policy for multixact IDs than what I did, and it will reduce the threat of frequent wraparound scans. Now, it will also increase the chances of somebody hitting the wall before autovacuum can bail them out. But maybe not that much. If we need 75% of the multixact member space to complete one cycle of anti-wraparound vacuums, we're actually very close to the point where the system just cannot work. If that's one big table, we're done. Agreed. OK, I have made this change. Barring further trouble reports, this completes the multixact work I plan to do for the next release. Here is what is outstanding: 1. We might want to introduce a GUC to control the point at which member offset utilization begins clamping autovacuum_multixact_freeze_max_age. It doesn't seem wise to do anything about this before pushing a minor release out. It's not entirely trivial, and it may be helpful to learn more about how the changes already made work out in practice before proceeding. Also, we might not back-patch this anyway. 2. The recent changes adjust things - for good reason - so that the safe threshold for multixact member creation is advanced only at checkpoint time. This means it's theoretically possible to have a situation where autovacuum has done all it can, but because no checkpoint has happened yet, the user can't create any more multixacts. Thanks to some good work by Thomas, autovacuum will realize this and avoid spinning uselessly over every table in the system, which is good, but you're still stuck with errors until the next checkpoint. 
Essentially, we're hoping that autovacuum will clean things up far enough in advance of hitting the threshold where we have to throw an error that a checkpoint will intervene before the error starts happening. It's possible we could improve this further, but I think it would be unwise to mess with it right now. It may be that there is no real-world problem here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] a fast bloat measurement tool (was Re: Measuring relation free space)
Hi, On 2015-05-11 16:57:08 +0530, Abhijit Menon-Sen wrote: I've attached an updated patch for pgstatbloat, as well as a patch to replace two uses of BuildTupleFromCStrings() elsewhere in pgstattuple. TBH, I'd rather not touch unrelated things right now. We're pretty badly behind... I've made the changes you mentioned in your earlier mail, except that I have not changed the name pending further suggestions about what would be the best name. I don't really care how it's named, as long as it makes clear that it's not an exact measurement. I haven't checked, but I'm not sure that it's safe/meaningful to call PageGetHeapFreeSpace() on a new page. OK, I'll check and fix if necessary. You're right, PageGetHeapFreeSpace() isn't safe on a new page. I've added a guard to that call in the attached patch, but I'm not sure that's the right thing to do. Should I copy the orphaned new-page handling from lazy_scan_heap? What about for empty pages? Neither feels like a very good thing to do, but then neither does skipping the new page silently. Should I count the space it would have free if it were initialised, but leave the page alone for VACUUM to deal with? Or just leave it as it is? I think there's absolutely no need for pgstattuple to do anything here. It's not even desirable. + LockBuffer(buf, BUFFER_LOCK_SHARE); + + page = BufferGetPage(buf); + + if (!PageIsNew(page)) + stat->free_space += PageGetHeapFreeSpace(page); + + if (PageIsNew(page) || PageIsEmpty(page)) + { + UnlockReleaseBuffer(buf); + continue; + } Shouldn't new pages be counted as being fully free or at least bloat? Just disregarding them seems wrong. Greetings, Andres Freund
Re: [HACKERS] Streaming replication and WAL archive interactions
On 05/08/2015 04:21 PM, Heikki Linnakangas wrote: On 04/22/2015 10:07 AM, Michael Paquier wrote: On Wed, Apr 22, 2015 at 3:38 PM, Heikki Linnakangas hlinn...@iki.fi wrote: I feel that the best approach is to archive the last, partial segment, but with the .partial suffix. I don't see any plausible real-world setup where the current behavior would be better. I don't really see much need to archive the partial segment at all, but there's also no harm in doing it, as long as it's clearly marked with the .partial suffix. Well, as long as it is clearly archived at promotion, even with a suffix, I guess that I am fine... This will need some tweaking on restore_command for existing applications, but as long as it is clearly documented I am fine. Shouldn't this be a different patch though? Ok, I came up with the attached, which adds the .partial suffix to the partial WAL segment that's archived after promotion. I couldn't find any natural place to talk about it in the docs, though. I think after the docs changes from the main patch are applied, it would be natural to mention this in the Continuous archiving in standby section, so I think I'll add that later. Barring objections, I'll push this later tonight. Applied that part. Now that we got this last-partial-segment problem out of the way, I'm going to try fixing the problem you (Michael) pointed out about relying on the pgstat file. Meanwhile, I'd love to get more feedback on the rest of the patch, and the documentation. And here is a new version of the patch. I kept the approach of using pgstat, but it now only polls pgstat every 10 seconds, and doesn't block to wait for updated stats. - Heikki From 08ca3cc7b9824503b793e149247ea9c6d3a7f323 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas heikki.linnakangas@iki.fi Date: Thu, 16 Apr 2015 14:40:24 +0300 Subject: [PATCH v3 1/1] Make WAL archival behave more sensibly in standby mode. 
This adds two new archive_modes, 'shared' and 'always', to indicate whether the WAL archive is shared between the primary and standby, or not. In shared mode, the standby tracks which files have been archived by the primary. The standby refrains from recycling files that the primary has not yet archived, and at failover, the standby archives all those files too from the old timeline. In 'always' mode, the standby's WAL archive is taken to be separate from the primary's, and the standby independently archives all files it receives from the primary. This adds a new archival status message to the protocol. WAL sender sends one automatically, when the last archived WAL file, as reported in pgstat, changes. (Or rather, some time after it changes. We're not in a hurry, the standby doesn't need an up-to-the-second status) Fujii Masao and me. --- doc/src/sgml/config.sgml | 12 +- doc/src/sgml/high-availability.sgml | 48 +++ doc/src/sgml/protocol.sgml | 31 + src/backend/access/transam/xlog.c | 29 +++- src/backend/postmaster/pgstat.c | 44 ++ src/backend/postmaster/postmaster.c | 37 +++-- src/backend/replication/walreceiver.c | 172 +++- src/backend/replication/walsender.c | 186 ++ src/backend/utils/misc/guc.c | 21 +-- src/backend/utils/misc/postgresql.conf.sample | 2 +- src/include/access/xlog.h | 14 +- src/include/pgstat.h | 2 + 12 files changed, 513 insertions(+), 85 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 0d8624a..ac845e0 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2521,7 +2521,7 @@ include_dir 'conf.d' <variablelist> <varlistentry id="guc-archive-mode" xreflabel="archive_mode"> - <term><varname>archive_mode</varname> (<type>boolean</type>) + <term><varname>archive_mode</varname> (<type>enum</type>) <indexterm> <primary><varname>archive_mode</> configuration parameter</primary> </indexterm> @@ -2530,7 +2530,15 @@ include_dir 'conf.d' <para> When <varname>archive_mode</> is enabled, completed WAL segments are sent to archive storage by setting -<xref linkend="guc-archive-command">. +<xref linkend="guc-archive-command">. In addition to <literal>off</>, +to disable, there are three modes: <literal>on</>, <literal>shared</>, +and <literal>always</>. During normal operation, there is no +difference between the three modes, but in archive recovery or +standby mode, it indicates whether the WAL archive is shared between +the primary and the standby server or not. See +<xref linkend="continuous-archiving-in-standby"> for details. + </para> + <para> <varname>archive_mode</> and <varname>archive_command</> are separate variables so that <varname>archive_command</> can be changed without leaving archiving mode. diff --git