Re: [HACKERS] Function array_agg(array)
2014-10-25 8:33 GMT+02:00 Pavel Stehule :

> 2014-10-25 8:19 GMT+02:00 Ali Akbar :
>
>> 2014-10-25 10:29 GMT+07:00 Ali Akbar :
>>
>>>> I fixed a small issue in the regression tests and enhanced the tests
>>>> for varlena types and null values.
>>>
>>> Thanks.
>>>
>>>> It is about 15% faster than the original implementation.
>>>
>>> 15% faster than array_agg(scalar)? I haven't verified the performance,
>>> but because the internal array data and null bitmap are copied as-is,
>>> that should be faster.
>>>
>>> 2014-10-25 1:51 GMT+07:00 Pavel Stehule :
>>>
>>>> Hi Ali
>>>>
>>>> I checked the code. I think the code organization is not good.
>>>> accumArrayResult is too long now. makeMdArrayResult will not work when
>>>> arrays were joined (it is not consistent now). I don't like the usage
>>>> of state->is_array_accum in array_userfunc.c -- it is a signal of
>>>> wrong wrapping.
>>>
>>> Yes, I was thinking the same. Attached is a WIP patch to reorganize the
>>> code. makeMdArrayResult works now, with the supplied arguments acting
>>> as overrides of the default values calculated from
>>> ArrayBuildStateArray.
>>>
>>> In array_userfunc.c, because we don't want to override ndims, dims and
>>> lbs, I copied array_agg_finalfn and only changed the call to
>>> makeMdArrayResult (we don't use makeArrayResult because we want to set
>>> release to false). Another alternative is to create a new
>>> makeArrayResult-like function that has a bool release parameter.
>>
>> Adding makeArrayResult1 (do you have a better name for this?), which
>> accepts a bool release parameter. array_agg_finalfn becomes cleaner,
>> and there is no duplicate code in array_agg_anyarray_finalfn.
>
> makeArrayResult1 - I have no better name now.
>
> I found one more minor detail.
>
> You reuse the array_agg_transfn function. Inside is the message
> "array_agg_transfn called in non-aggregate context". It is not correct
> for array_agg_anyarray_transfn.
>
> Probably the specification of dims and lbs in array_agg_finalfn is
> useless when you expect a natural result. Something automatic, similar
> to array_agg_anyarray_finalfn, should work there too.
>
> makeMdArrayResultArray and appendArrayDatum should be marked as "static".

I think the change to makeArrayResult is wrong. It was developed for 1D
arrays. When we expect an ND result somewhere, we should use
makeMdArrayResult there. So makeArrayResult should return a 1D array in
all cases. Otherwise the difference between makeArrayResult and
makeMdArrayResult is not significant, and we have a naming problem.

Regards

Pavel

>>> next question: there is a function array_append(anyarray, anyelement).
>>> Isn't it time to define array_append(anyarray, anyarray) now?
>>
>> There is array_cat(anyarray, anyarray):
>>
>> /*-
>>  * array_cat :
>>  * concatenate two nD arrays to form an nD array, or
>>  * push an (n-1)D array onto the end of an nD array
>>  *
>>  */
>
> Regards,
> --
> Ali Akbar
Re: [HACKERS] Function array_agg(array)
2014-10-25 8:19 GMT+02:00 Ali Akbar :

> 2014-10-25 10:29 GMT+07:00 Ali Akbar :
>
>>> I fixed a small issue in the regression tests and enhanced the tests
>>> for varlena types and null values.
>>
>> Thanks.
>>
>>> It is about 15% faster than the original implementation.
>>
>> 15% faster than array_agg(scalar)? I haven't verified the performance,
>> but because the internal array data and null bitmap are copied as-is,
>> that should be faster.
>>
>> 2014-10-25 1:51 GMT+07:00 Pavel Stehule :
>>
>>> Hi Ali
>>>
>>> I checked the code. I think the code organization is not good.
>>> accumArrayResult is too long now. makeMdArrayResult will not work when
>>> arrays were joined (it is not consistent now). I don't like the usage
>>> of state->is_array_accum in array_userfunc.c -- it is a signal of
>>> wrong wrapping.
>>
>> Yes, I was thinking the same. Attached is a WIP patch to reorganize the
>> code. makeMdArrayResult works now, with the supplied arguments acting
>> as overrides of the default values calculated from ArrayBuildStateArray.
>>
>> In array_userfunc.c, because we don't want to override ndims, dims and
>> lbs, I copied array_agg_finalfn and only changed the call to
>> makeMdArrayResult (we don't use makeArrayResult because we want to set
>> release to false). Another alternative is to create a new
>> makeArrayResult-like function that has a bool release parameter.
>
> Adding makeArrayResult1 (do you have a better name for this?), which
> accepts a bool release parameter. array_agg_finalfn becomes cleaner,
> and there is no duplicate code in array_agg_anyarray_finalfn.

makeArrayResult1 - I have no better name now.

I found one more minor detail.

You reuse the array_agg_transfn function. Inside is the message
"array_agg_transfn called in non-aggregate context". It is not correct
for array_agg_anyarray_transfn.

Probably the specification of dims and lbs in array_agg_finalfn is
useless when you expect a natural result. Something automatic, similar
to array_agg_anyarray_finalfn, should work there too.

makeMdArrayResultArray and appendArrayDatum should be marked as "static".

>>> next question: there is a function array_append(anyarray, anyelement).
>>> Isn't it time to define array_append(anyarray, anyarray) now?
>>
>> There is array_cat(anyarray, anyarray):
>>
>> /*-
>>  * array_cat :
>>  * concatenate two nD arrays to form an nD array, or
>>  * push an (n-1)D array onto the end of an nD array
>>  *
>>  */
>
> Regards,
> --
> Ali Akbar
Re: [HACKERS] Function array_agg(array)
2014-10-25 10:29 GMT+07:00 Ali Akbar :

>> I fixed a small issue in the regression tests and enhanced the tests
>> for varlena types and null values.
>
> Thanks.
>
>> It is about 15% faster than the original implementation.
>
> 15% faster than array_agg(scalar)? I haven't verified the performance,
> but because the internal array data and null bitmap are copied as-is,
> that should be faster.
>
> 2014-10-25 1:51 GMT+07:00 Pavel Stehule :
>
>> Hi Ali
>>
>> I checked the code. I think the code organization is not good.
>> accumArrayResult is too long now. makeMdArrayResult will not work when
>> arrays were joined (it is not consistent now). I don't like the usage
>> of state->is_array_accum in array_userfunc.c -- it is a signal of
>> wrong wrapping.
>
> Yes, I was thinking the same. Attached is a WIP patch to reorganize the
> code. makeMdArrayResult works now, with the supplied arguments acting
> as overrides of the default values calculated from ArrayBuildStateArray.
>
> In array_userfunc.c, because we don't want to override ndims, dims and
> lbs, I copied array_agg_finalfn and only changed the call to
> makeMdArrayResult (we don't use makeArrayResult because we want to set
> release to false). Another alternative is to create a new
> makeArrayResult-like function that has a bool release parameter.

Adding makeArrayResult1 (do you have a better name for this?), which
accepts a bool release parameter. array_agg_finalfn becomes cleaner, and
there is no duplicate code in array_agg_anyarray_finalfn.

>> next question: there is a function array_append(anyarray, anyelement).
>> Isn't it time to define array_append(anyarray, anyarray) now?
>
> There is array_cat(anyarray, anyarray):
>
> /*-
>  * array_cat :
>  * concatenate two nD arrays to form an nD array, or
>  * push an (n-1)D array onto the end of an nD array
>  *
>  */

Regards,
--
Ali Akbar

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..f59738a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12046,6 +12046,22 @@ NULL baz(3 rows)
+array_agg
+
+ array_agg(anyarray)
+
+
+ any
+
+
+ the same array type as input type
+
+ input arrays, aggregated into higher-order multidimensional array.
+ Rejects NULL and empty array as input.
+
+
+
+
 average
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 2f0680f..8c182a4 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
  array
 ---
  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+
+SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+ array
+---
+ {{1},{2},{3},{4},{5}}
 (1 row)

 The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 41e973b..0261fcb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -108,12 +108,16 @@ exprType(const Node *expr)
 			type = exprType((Node *) tent->expr);
 			if (sublink->subLinkType == ARRAY_SUBLINK)
 			{
-				type = get_array_type(type);
-				if (!OidIsValid(type))
-					ereport(ERROR,
-							(errcode(ERRCODE_UNDEFINED_OBJECT),
-							 errmsg("could not find array type for data type %s",
-									format_type_be(exprType((Node *) tent->expr)))));
+				if (!OidIsValid(get_element_type(type)))
+				{
+					/* not array, so check for its array type */
+					type = get_array_type(type);
+					if (!OidIsValid(type))
+						ereport(ERROR,
+								(errcode(ERRCODE_UNDEFINED_OBJECT),
+								 errmsg("could not find array type for data type %s",
+										format_type_be(exprType((Node *) tent->expr)))));
+				}
 			}
 		}
 		else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
@@ -139,12 +143,16 @@ exprType(const Node *expr)
 			type = subplan->firstColType;
 			if (subplan->subLinkType == ARRAY_SUBLINK)
 			{
-				type = get_array_type(type);
-				if (!OidIsValid(type))
-					ereport(ERROR,
-							(errcode(ERRCODE_UNDEFINED_OBJECT),
-							 errmsg("could not find array type for data type %s",
-									format_type_be(subplan->firstColType))));
+				if (!OidIsValid(get_element_type(type)))
+				{
+					/* not array, so check for its array type */
+					type = get_array_type(type);
+					if (!OidIsValid(type))
+						ereport(ERROR,
+								(errcode(ERRCODE_UNDEFINED_OBJECT),
+								 errmsg("could not find array type for data type %s",
+										format_type_be(subplan->firstColType))));
+				}
 			}
 		}
 		else i
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2
On Fri, Oct 24, 2014 at 4:05 PM, Andres Freund wrote:
>
> On 2014-10-24 15:59:30 +0530, Amit Kapila wrote:
>>>> and w.r.t. performance it can lead to an extra function call, a few
>>>> checks, and I think in some cases it can even acquire/release a
>>>> spinlock.
>>>
>>> I fail to see how that could be the case.
>>
>> Won't it happen in case the first backend sets releaseOK to true and
>> another backend which tries to wake up waiters on the lock acquires
>> the spinlock and tries to release the waiters.
>
> Sure, that can happen.
>
>>> And again, this is code that's only executed around a couple of
>>> syscalls. And the cacheline will be touched around there *anyway*.
>>
>> Sure, but I think syscalls are required only in case we need to wake
>> any waiter.
>
> It won't wake up a waiter if there's none on the list.

Yeah, but it will still acquire/release the spinlock.

>>>>> And it'd be a pretty pointless behaviour, leading to useless
>>>>> increased contention. The only time it'd make sense for X to be
>>>>> woken up is when it gets run faster than the S processes.
>>>>
>>>> Do we get any major benefit by changing the logic of waking up
>>>> waiters?
>>>
>>> Yes.
>>
>> I think one downside I can see of the new strategy is that the chance
>> of an Exclusive waiter taking more time before being woken up is
>> increased, as shared lockers will now bypass Exclusive waiters in the
>> queue.
>
> Note that that *already* happens for any *new* shared locker that comes
> in. It doesn't really make sense to have shared lockers queued behind
> the exclusive locker if others just go in front of it anyway.

Yeah, but I think that behaviour is difficult to avoid, because even
when it wakes an Exclusive locker, some new shared locker can come in
and acquire the lock before the Exclusive locker. I think it is
difficult to say what the best waking strategy is, as the priority of
Exclusive lockers is not clearly defined in the case of LWLocks.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Function array_agg(array)
> I fixed a small issue in the regression tests and enhanced the tests
> for varlena types and null values.

Thanks.

> It is about 15% faster than the original implementation.

15% faster than array_agg(scalar)? I haven't verified the performance,
but because the internal array data and null bitmap are copied as-is,
that should be faster.

2014-10-25 1:51 GMT+07:00 Pavel Stehule :

> Hi Ali
>
> I checked the code. I think the code organization is not good.
> accumArrayResult is too long now. makeMdArrayResult will not work when
> arrays were joined (it is not consistent now). I don't like the usage
> of state->is_array_accum in array_userfunc.c -- it is a signal of wrong
> wrapping.

Yes, I was thinking the same. Attached is a WIP patch to reorganize the
code. makeMdArrayResult works now, with the supplied arguments acting as
overrides of the default values calculated from ArrayBuildStateArray.

In array_userfunc.c, because we don't want to override ndims, dims and
lbs, I copied array_agg_finalfn and only changed the call to
makeMdArrayResult (we don't use makeArrayResult because we want to set
release to false). Another alternative is to create a new
makeArrayResult-like function that has a bool release parameter.

> next question: there is a function array_append(anyarray, anyelement).
> Isn't it time to define array_append(anyarray, anyarray) now?

There is array_cat(anyarray, anyarray):

/*-
 * array_cat :
 * concatenate two nD arrays to form an nD array, or
 * push an (n-1)D array onto the end of an nD array
 *
 */

Regards,
--
Ali Akbar

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..f59738a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12046,6 +12046,22 @@ NULL baz(3 rows)
+array_agg
+
+ array_agg(anyarray)
+
+
+ any
+
+
+ the same array type as input type
+
+ input arrays, aggregated into higher-order multidimensional array.
+ Rejects NULL and empty array as input.
+
+
+
+
 average
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 2f0680f..8c182a4 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
  array
 ---
  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+
+SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+ array
+---
+ {{1},{2},{3},{4},{5}}
 (1 row)

 The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 41e973b..0261fcb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -108,12 +108,16 @@ exprType(const Node *expr)
 			type = exprType((Node *) tent->expr);
 			if (sublink->subLinkType == ARRAY_SUBLINK)
 			{
-				type = get_array_type(type);
-				if (!OidIsValid(type))
-					ereport(ERROR,
-							(errcode(ERRCODE_UNDEFINED_OBJECT),
-							 errmsg("could not find array type for data type %s",
-									format_type_be(exprType((Node *) tent->expr)))));
+				if (!OidIsValid(get_element_type(type)))
+				{
+					/* not array, so check for its array type */
+					type = get_array_type(type);
+					if (!OidIsValid(type))
+						ereport(ERROR,
+								(errcode(ERRCODE_UNDEFINED_OBJECT),
+								 errmsg("could not find array type for data type %s",
+										format_type_be(exprType((Node *) tent->expr)))));
+				}
 			}
 		}
 		else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
@@ -139,12 +143,16 @@ exprType(const Node *expr)
 			type = subplan->firstColType;
 			if (subplan->subLinkType == ARRAY_SUBLINK)
 			{
-				type = get_array_type(type);
-				if (!OidIsValid(type))
-					ereport(ERROR,
-							(errcode(ERRCODE_UNDEFINED_OBJECT),
-							 errmsg("could not find array type for data type %s",
-									format_type_be(subplan->firstColType))));
+				if (!OidIsValid(get_element_type(type)))
+				{
+					/* not array, so check for its array type */
+					type = get_array_type(type);
+					if (!OidIsValid(type))
+						ereport(ERROR,
+								(errcode(ERRCODE_UNDEFINED_OBJECT),
+								 errmsg("could not find array type for data type %s",
+										format_type_be(subplan->firstColType))));
+				}
 			}
 		}
 		else if (subplan->subLinkType == MULTIEXPR_SUBLINK)
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 3e7dc85..8fc8b49 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -668,10 +668,16 @@ build_subplan(Pla
Re: [HACKERS] How ugly would this be? (ALTER DATABASE)
On Fri, Oct 24, 2014 at 7:37 PM, Jim Nasby wrote:
> ISTM that the multiple-databases-per-backend issue is the huge hang-up
> here. Maybe there's some way that could be hacked around if you're just
> re-jiggering a bunch of catalog stuff (assuming you lock users out of
> both databases while you're doing that), but if you were going to go to
> that extent perhaps it'd be better to just support cross-database
> access in a single backend...

Good luck with that. It's probably as hard or harder than making the
backend multi-threaded, which is itself harder than any project a
reasonable person will undertake any time in the foreseeable future.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security barrier INSERT
Drew,

> IMHO this is nonintuitive; the intuitive behavior of a security_barrier
> view should be to forbid inserting rows that can't appear in the view.

Isn't that what WITH CHECK OPTION is meant to accomplish?

-Adam

--
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
[HACKERS] security barrier INSERT
Hi guys,

I'm seeing some non-intuitive behavior with the new updatable security
barrier views in 9.4. Below is the behavior of 9.4b3:

> =# create table widget ( id integer);
> CREATE TABLE
> =# create view widget_sb WITH (security_barrier=true) AS
>      SELECT * from widget where id = 22;
> CREATE VIEW
> =# insert into widget_sb (id) values (23);
> INSERT 0 1
> =# select * from widget;
>  id
> 
>  23
> (1 row)

I think the insert should fail, since the view can only contain widgets
with id = 22, and the widget being inserted has id 23. In reality, the
insert into the underlying table succeeds as shown above, although the
row still remains absent from the view:

> =# select * from widget_sb;
>  id
> 
> (0 rows)

IMHO this is nonintuitive; the intuitive behavior of a security_barrier
view should be to forbid inserting rows that can't appear in the view.
Have I missed something fundamental here?

Drew
Re: [HACKERS] Comment in outfunc.c
Tatsuo Ishii writes:
> I saw this comment in _outQuery() in outfuncs.c:
> 	WRITE_ENUM_FIELD(commandType, CmdType);
> 	WRITE_ENUM_FIELD(querySource, QuerySource);
> 	/* we intentionally do not print the queryId field */
> 	WRITE_BOOL_FIELD(canSetTag);
> What is the reasoning behind the comment?

We don't want the queryId copied when a view/rule is reloaded, since
there's no guarantee that the algorithm for computing it is stable for
the life of a PG major version. Instead it's reset to zero on reload,
and any plugins that care can recompute it. I suppose we could print it
anyway and then ignore it on reload, but what's the point of that?

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Comment in outfunc.c
Hi,

I saw this comment in _outQuery() in outfuncs.c:

	WRITE_ENUM_FIELD(commandType, CmdType);
	WRITE_ENUM_FIELD(querySource, QuerySource);
	/* we intentionally do not print the queryId field */
	WRITE_BOOL_FIELD(canSetTag);

What is the reasoning behind the comment?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Comment patch for bgworker.c
The comment for the BackgroundWorkerSlot structure tripped me up while
reviewing Robert's background worker patch; it makes it clear that you
need to use a memory barrier before setting in_use, but normally you'd
never need to worry about that, because RegisterDynamicBackgroundWorker()
handles it for you. This patch adds a comment to that effect.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 85a3b3a..e81dbc1 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -55,7 +55,8 @@ slist_head	BackgroundWorkerList = SLIST_STATIC_INIT(BackgroundWorkerList);
  * responsibility of the postmaster.  Regular backends may no longer modify it,
  * but the postmaster may examine it.  Thus, a backend initializing a slot
  * must fully initialize the slot - and insert a write memory barrier - before
- * marking it as in use.
+ * marking it as in use.  Note that RegisterDynamicBackgroundWorker() handles
+ * in_use correctly for you.
  *
  * As an exception, however, even when the slot is in use, regular backends
  * may set the 'terminate' flag for a slot, telling the postmaster not

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] superuser() shortcuts
On 10/23/14, 6:23 PM, Alvaro Herrera wrote:
> Brightwell, Adam wrote:
>> If we were to make it consistent and use the old wording, what do you
>> think about providing an "errhint" as well? Perhaps for example in
>> slotfuncs.c#pg_create_physical_replication_slot:
>>
>> errmsg - "permission denied to create physical replication slot"
>> errhint - "You must be superuser or replication role to use
>> replication slots."
>
> Sure.
>
>> As I started looking at this, there are multiple other places where
>> these types of error messages occur (opclasscmds.c, user.c,
>> postinit.c, miscinit.c are just a few), not just around the changes in
>> this patch. If we change them in one place, wouldn't it be best to
>> change them in the rest? If that is the case, I'm afraid that might
>> distract from the purpose of this patch. Perhaps, if we want to change
>> them, then that should be submitted as a separate patch?
>
> Yeah. I'm just saying that maybe this patch should adopt whatever
> wording we agree to, not that we need to change other places. On the
> other hand, since so many other places have adopted the different
> wording, maybe there's a reason for it, and if so, does anybody know
> what it is. But I have to say that it does look inconsistent to me.

Keep in mind that originally the ONLY special permissions "thing" we had
was rolsuper. Now we also have replication, maybe some others, and
there's discussion of adding a special "backup role". In other words,
the situation is a lot more complex than it used to be. So if cleanup is
done here, it would be best to try and account for this new stuff.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ExclusiveLock on extension of relation with huge shared_buffers
Please don't top-post.

On 10/24/14, 3:40 AM, Borodin Vladimir wrote:
> I have taken some backtraces (they are attached to the letter) of two
> processes with such a command:
>
> pid=17981; while true; do date; gdb -batch -e back
> /usr/pgsql-9.4/bin/postgres $pid; echo; echo; echo; echo; sleep 0.1; done
>
> Process 17981 was holding the lock for a long time -
> http://pastie.org/9671931. And process 13886 was waiting for the lock
> (at a different time and from a different blocker actually, but I don't
> think it is really important) - http://pastie.org/9671939.
>
> As I can see, 17981 is actually waiting for an LWLock on
> BufFreelistLock in the StrategyGetBuffer function, freelist.c:134,
> while holding an exclusive lock on the relation.
>
> I will try to increase NUM_BUFFER_PARTITIONS (on a read-only load it
> also gave us some performance boost) and write the result in this
> thread.

BufFreelistLock becomes very contended when shared buffers are under a
lot of pressure. Here's what I believe is happening:

If RelationGetBufferForTuple() decides it needs to extend, this happens:

	LockRelationForExtension(relation, ExclusiveLock);
	buffer = ReadBufferBI(relation, P_NEW, bistate);

Assuming bistate is false (I didn't check the bulk case), ReadBufferBI()
ends up at ReadBuffer_common(), which calls BufferAlloc(). In the normal
case, BufferAlloc() won't find the necessary buffer, so it will call
StrategyGetBuffer(), which will end up taking the freelist lock.

Currently the free list is normally empty, which means we now need to
run the clock sweep to find a victim buffer. The clock sweep will keep
running until it finds a buffer that is not pinned and has
usage_count = 0. If shared buffers are under heavy pressure, you can
have a huge number of them with usage_count = 5; with 100GB of shared
buffers and an 8K BLCKSZ, you could have to check buffers *52 million*
times (assuming you finally find a victim at the start of the 5th loop)
before you find one.

Keep in mind that's all happening while you're holding both the
extension lock *and* the freelist lock, which basically means no one
else in the entire system can allocate a new buffer.

This is one reason why a large shared_buffers setting is usually
counter-productive. Experience with older versions is that setting it
higher than about 8GB is more likely to hurt than to help. Newer
versions are probably better, but I think you'll be hard-pressed to find
a workload where 100GB makes sense. It might if your entire database
fits in shared_buffers (though even then there are probably a number of
O(n) or worse operations that will hurt you), but if your database is
larger than shared_buffers you're probably in trouble.

I suggest cutting shared_buffers *way* down. Old-school advice for this
machine would be 8GB (since 25% of 128GB would be too big). You might be
able to do better than 8GB, but I recommend not even trying unless
you've got a good way to test your performance.

If you can test performance and find an optimal setting for
shared_buffers, please do share your test data and findings. :)

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Fri, Oct 24, 2014 at 4:39 PM, Petr Jelinek wrote:
> If you feel so strongly that it's wrong even though everybody else
> seems to prefer it, and if you at the same time feel so strongly about
> people changing minds once you implement this, maybe the best way to
> convince us is to show us the implementation (at this point it would
> probably have taken less of your time than the argument did).

No, it wouldn't have - I don't think anyone believes that. Magic
addRangeTableEntryForRelation() calls are only used in the context of
one or two utility statements that have pretty limited scope. Support
for an OLD.* style syntax would have to exist at *all* stages of query
execution, from parse analysis through to rewriting, planning, and
execution. That's the difference here - this isn't a utility command.

>> So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe
>> that's an interesting precedent. During rewriting, this gets rewritten
>> such that you end up with something that looks to the planner as if
>> the original query included a constant (this actually comes from a
>> catalog look-up for the column during rewriting). What if we spelled
>> EXCLUDING/CONFLICTING as follows:
>>
>> INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
>> EXCLUDED || 'this works' WHERE another_col != EXCLUDED;
>>
>> Then rewriting would figure these details out. From a design
>> perspective, there'd need to be a few details worked out about how
>> inference actually works - inferring *which* column the EXCLUDED
>> expression actually referred to - but it seems doable, especially
>> given the existing restrictions on the structure of the UPDATE. We're
>> not rewriting from a SetToDefault to a constant, but a
>> SetToDefault-like thing to a special Var (actually, the finished
>> representation probably makes it to the execution stage with that Var
>> representation filled in, unlike SetToDefault, but it's basically the
>> same pattern). It solves my problem with dummy range table entries.
>> Actually, *any* new kind of expression accomplishes this just as well.
>> My concern here is more around not needing cute tricks with dummy RTEs
>> than it is around being in favor of any particular expression-based
>> syntax.
>>
>> What do you think of that?
>
> Ugh, you want to auto-magically detect what value is behind the
> EXCLUDED based on how/where it's used in the UPDATE? That seems like
> quite a bad idea.

That's *exactly* how DEFAULT works within UPDATE targetlists. There
might be a few more details to work out here, but not terribly many, and
that's going to be true no matter what. 95%+ of the time, it'll just be
"val = EXCLUDED" anyway.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On 25/10/14 00:48, Peter Geoghegan wrote:
>> You're conflating the user-visible syntax with the parse tree
>> representation in a way that is utterly without foundation. I don't
>> have a position at this point on which parse-analysis representation
>> is preferable, but it's completely wrong-headed to say that you
>> "can't" make NEW.x produce the same parse-analysis output that your
>> CONFLICTING(x) syntax would have created. Sure, it might be harder,
>> but it's not that much harder, and it's definitely not an unsolvable
>> problem.
>
> I don't believe I did. The broader point is that the difficulty in
> making that work reflects the conceptual messiness, from user-visible
> aspects down. I can work with the difficulty, and I may even be able to
> live with the messiness, but I'm trying to bring the problems with it
> to a head sooner rather than later for good practical reasons. In all
> sincerity, my real concern is that you or the others will change your
> minds when I actually go and implement an OLD.* style syntax and see
> the gory details. I regret it if asking this is asking too much of you,
> but FYI that's the thought process behind it.

If you feel so strongly that it's wrong even though everybody else seems
to prefer it, and if you at the same time feel so strongly about people
changing minds once you implement this, maybe the best way to convince
us is to show us the implementation (at this point it would probably
have taken less of your time than the argument did).

> So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe
> that's an interesting precedent. During rewriting, this gets rewritten
> such that you end up with something that looks to the planner as if the
> original query included a constant (this actually comes from a catalog
> look-up for the column during rewriting). What if we spelled
> EXCLUDING/CONFLICTING as follows:
>
> INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
> EXCLUDED || 'this works' WHERE another_col != EXCLUDED;
>
> Then rewriting would figure these details out. From a design
> perspective, there'd need to be a few details worked out about how
> inference actually works - inferring *which* column the EXCLUDED
> expression actually referred to - but it seems doable, especially given
> the existing restrictions on the structure of the UPDATE. We're not
> rewriting from a SetToDefault to a constant, but a SetToDefault-like
> thing to a special Var (actually, the finished representation probably
> makes it to the execution stage with that Var representation filled in,
> unlike SetToDefault, but it's basically the same pattern). It solves my
> problem with dummy range table entries. Actually, *any* new kind of
> expression accomplishes this just as well. My concern here is more
> around not needing cute tricks with dummy RTEs than it is around being
> in favor of any particular expression-based syntax.
>
> What do you think of that?

Ugh, you want to auto-magically detect what value is behind the EXCLUDED
based on how/where it's used in the UPDATE? That seems like quite a bad
idea.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How ugly would this be? (ALTER DATABASE)
On 10/24/14, 1:28 PM, Robert Haas wrote: On Fri, Oct 24, 2014 at 2:06 PM, Joshua D. Drake wrote: One of the things we run into quite a bit is customers who are using multiple databases when they should be using multiple schemas. I am sure other consultants run into this as well. This gets even more difficult as uptime requirements have become all but 100%. So my question is, what would this take? ALTER DATABASE foo LOCATION DATABASE bar SCHEMA baz? Where if we execute that command, database foo would move to schema baz within database bar? I am fully aware of what it takes on the client side but structurally within postgres what would it take? Is it even reasonable? What if the database contains more than one schema? You could perhaps try to create a command that would move a schema between two databases in the same cluster. It's fraught with practical difficulties because a single backend can't be connected to both databases at the same time, so how exactly do you make the required catalog changes all in a single transaction? But if you imagine that you have an infinite pool of top-notch PostgreSQL talent with unbounded time to work on this problem and no other, I bet somebody could engineer a solution. ISTM that the multiple-databases-per-backend issue is the huge hang-up here. Maybe there's some way that could be hacked around if you're just re-jiggering a bunch of catalog stuff (assuming you lock users out of both databases while you're doing that), but if you were going to go to that extent perhaps it'd be better to just support cross-database access in a single backend... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On 24/10/14 23:07, Robert Haas wrote: On Fri, Oct 24, 2014 at 4:53 PM, Jim Nasby wrote: The only case I can think of would be actually connecting to a remote database; in that case would we even want something as raw as this? I suspect not, in which case I don't see an issue. On the other hand, if we ever think we might want to do that, we should probably at least stick a version number field in there... But my suspicion is if we ever wanted to do something more with this then we'd want some kind of text-based format that could be passed into a SQL command (ie: SET ENVIRONMENT TO blah;) I mean, I don't think this is much different than what we're already doing to transfer variables from the postmaster to other backends in EXEC_BACKEND builds; see write_nondefault_variables(). It doesn't have a version number or anything either. I don't see a reason why this code needs to be held to a different standard; the purpose is fundamentally quite similar. I really do think that the GUC patch is ok as it is; we might want to do 1/0 for bools, but I don't really see that as an important thing. And it works for the use-cases it's intended for. I don't see why this should be required to work cross-version, really. Loading of the modules by the bgworker is probably a bigger issue than just GUCs, and I think it's out of scope for the current patch(set). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On 10/24/14, 4:03 PM, Robert Haas wrote: On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby wrote: On 10/24/14, 12:21 PM, Robert Haas wrote: - What should we call dsm_unkeep_mapping, if not that? Only option I can think of beyond unkeep would be dsm_(un)register_keep_mapping. Dunno that it's worth it. Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(), since it's arranging to keep it by unregistering it from the resource owner. And then we could call the new function dsm_register_mapping(). That has the appeal that "unregister" is a word, whereas "unkeep" isn't, but it's a little confusing otherwise, because the sense is reversed vs. the current naming. Or we could just leave dsm_keep_mapping() alone and call the new function dsm_register_mapping(). A little non-orthogonal, but I think it'd be OK. I think it's actually important to keep the names matching, even if it means "unkeep". dsm_unregister_mapping seems like the opposite of what you'd want to do to have the mapping be remembered. I get that it works that way because of how it's implemented, but that seems like a lot of leakage of implementation into API... FWIW, I don't see "unkeep" as being that horrible. - Does anyone have a tangible suggestion for how to reduce the code duplication in patch #6? Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in exec_simple that's not safe for a bgworker? I'm not seeing why we can't use exec_simple. :/ There are some differences if you compare them closely. Without doing a deep dive, I'm guessing that most of the extra stuff would be safe to re-use; it just wouldn't affect execute_sql_string. Obviously we could add a boolean to exec_simple_query for the case when it's being used by a bgworker. Though, it seems like the biggest differences have to do with logging. Here are the differences I see:
- Disallowing transaction commands
- Logging
- What memory context is used (could we just use that differently in a pg_background backend?)
- Portal output format
- What to do with the output of intermediate commands (surely there are other places we need to handle that? plpgsql maybe?)
I think all of those except logging could be handled by a function serving both exec_simple_query and execute_sql_string that accepts a few booleans (or maybe just one to indicate the caller) and some if's. At least I don't think it'd be too bad, without actually writing it. I'm not sure what to do about logging. If the original backend has logging turned on, is it that horrible to do the same logging as exec_simple_query would do? Another option would be factoring out parts of exec_simple_query; the for loop over the parse tree might be a good candidate. But I suspect that would be uglier than a separate support function. I do feel uncomfortable with the amount of duplication there is right now though... BTW, it does occur to me that we could do something to combine AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo(). Meh. Well, it does seem to be a fairly common pattern throughout the codebase. And you were pretty vague when you asked for ideas to reduce duplication. ;P + default: + elog(WARNING, "unknown message type: %c (%zu bytes)", + msg.data[0], nbytes); It'd be useful to also provide DEBUG output with the message itself... I think that's more work than is justified. We'd have to hexdump it or something because it might not be valid in the client encoding, and it's a case that shouldn't happen anyway. I'm worried that a user wouldn't have any good way to see what the unexpected data is... but I guess if a user ever saw this we've got bigger problems anyway... (Hmm, that reminds me that I haven't thought of a solution to the problem that the background worker might try to SET client_encoding, which would be bad. Not sure what the best fix for that is, off-hand. I'd rather not prohibit ALL GUC changes in the background worker, as there's no good reason for such a draconian restriction.) 
Perhaps a new GucContext for background workers? Or maybe a new GucSource, though I'm not sure if that'd work and it seems pretty wrong anyway. BTW, perhaps the amount of discussion on internal parts is an indication this shouldn't be in contrib. I think it's a damn strong indication it shouldn't be in PGXN. :) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On 24/10/14 23:03, Robert Haas wrote: On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby wrote: On 10/24/14, 12:21 PM, Robert Haas wrote: - What should we call dsm_unkeep_mapping, if not that? Only option I can think of beyond unkeep would be dsm_(un)register_keep_mapping. Dunno that it's worth it. Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(), since it's arranging to keep it by unregistering it from the resource owner. And then we could call the new function dsm_register_mapping(). That has the appeal that "unregister" is a word, whereas "unkeep" isn't, but it's a little confusing otherwise, because the sense is reversed vs. the current naming. Or we could just leave dsm_keep_mapping() alone and call the new function dsm_register_mapping(). A little non-orthogonal, but I think it'd be OK. I don't like that too much, but I don't have a better suggestion; if we went with one of these, I would prefer taking the route of renaming dsm_keep_mapping to dsm_unregister_mapping. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Fri, Oct 24, 2014 at 1:07 PM, Robert Haas wrote: > The problem here isn't that I haven't read your emails. I have read > them all, including that one. Moreover, this isn't the first time > you've asserted that someone hasn't read one of your emails. So if > we're going to say what we each think would be helpful, then I think > it would be helpful if you stopped accusing the people who are taking > time to provide feedback on your work of having failed to read your > emails. It's possible that there may be instances where that problem > exists, but it beggars credulity to suppose that the repeated > unanimous consensus against some of your design decisions is entirely > an artifact of failure to pay attention. Okay. > The fact is, I don't feel obliged to respond to every one of your > emails, just as you don't respond to every one of mine. If you want > this patch to ever get committed, it's your job to push it forward - > not mine, not Simon's, and not Heikki's. Sometimes that means you > have to solve hard problems instead of just articulating what they > are. Okay. > You're conflating the user-visible syntax with the parse tree > representation in a way that is utterly without foundation. I don't > have a position at this point on which parse-analysis representation > is preferable, but it's completely wrong-headed to say that you > "can't" make NEW.x produce the same parse-analysis output that your > CONFLICTING(x) syntax would have created. Sure, it might be harder, > but it's not that much harder, and it's definitely not an unsolvable > problem. I don't believe I did. The broader point is that the difficulty in making that work reflects the conceptual messiness, from user-visible aspects down. I can work with the difficulty, and I may even be able to live with the messiness, but I'm trying to bring the problems with it to a head sooner rather than later for good practical reasons. 
In all sincerity, my real concern is that you or the others will change your mind when I actually go and implement an OLD.* style syntax, and see the gory details. I regret it if asking this is asking too much of you, but FYI that's the thought process behind it. > I do acknowledge that there might be a better syntax out there than > NEW.x and OLD.x. I have not seen one proposed that I like better. > Feel free to propose something. But don't keep re-proposing something > that LOOKSLIKE(this) because nobody - other than perhaps you - likes > that. Okay. So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe that's an interesting precedent. During rewriting, this gets rewritten such that you end up with something that looks to the planner as if the original query included a constant (this actually comes from a catalog look-up for the column during rewriting). What if we spelled EXCLUDING/CONFLICTING as follows: INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val = EXCLUDED || 'this works' WHERE another_col != EXCLUDED; Then rewriting would figure these details out. From a design perspective, there'd need to be a few details worked out about how inference actually works - inferring *which* column the EXCLUDED expression actually referred to, but it seems doable, especially given the existing restrictions on the structure of the UPDATE. We're not rewriting from a SetToDefault to a constant, but a SetToDefault-like thing to a special Var (actually, the finished representation probably makes it to the execution stage with that Var representation filled in, unlike SetToDefault, but it's basically the same pattern). It solves my problem with dummy range table entries. Actually, *any* new kind of expression accomplishes this just as well. My concern here is more around not needing cute tricks with dummy RTEs than it is around being in favor of any particular expression-based syntax. What do you think of that? 
> And don't use the difficulty of parse analysis as a > justification for your proposed syntax, because, except in extreme > cases, there are going to be very few if any regular contributors to > this mailing list who will accept that as a justification for one > syntax over another. Syntax needs to be justified by being beautiful, > elegant, precedent-driven, orthogonal, and minimalist - not by whether > you might need an extra 25-75 lines of parse analysis code to make it > work. Well, that isn't the case here. I'd much rather code my way out of a disagreement with you. Unique index inference (i.e. the way we figure out *which* unique index to use) occurs during parse analysis. I think it would be inappropriate, and certainly inconvenient to do it during planning. >>> >>> You're wrong. The choice of which index to use is clearly wildly >>> inappropriately placed in the parse analysis phase, and if you think >>> that has any chance of ever being committed by anyone, then you are >>> presuming the existence of a committer who won't mind ignoring the >>
Re: [HACKERS] Question about RI checks
On Oct24, 2014, at 22:50 , Kevin Grittner wrote: > I need to spend some more time looking at it, and I have another > couple things in front of this on my personal TODO list, but I > think that if we had a row lock which was stronger than current > SELECT FOR UPDATE behavior, and the delete of a parent row (or > update of its referenced key) read the children using it, this > would be solved. Essentially I'm thinking of something just like > FOR UPDATE except that a transaction which attempted a concurrent > UPDATE or DELETE of the row (before the locking transaction ended) > would error out with some form of serialization failure. I believe > this would be consistent with the behavior of SELECT FOR UPDATE in > Oracle, so it would allow a path for those migrating from Oracle to > maintain their current logic (with a slight change to the FOR > strength clause). Hm, I don't believe any form of traditional row-level lock on the child row can fix this. If you look at the second failure case (an updated isolation tester spec file which includes comments is attached), you see that it fails even if you define the FK constraint as CASCADE instead of RESTRICT. So even a full UPDATE or DELETE of the child rows doesn't help. But maybe I misunderstood what you proposed. best regards, Florian Pflug (Attachment: fk-consistency2.spec)
Re: [HACKERS] Typo fixes for pg_recvlogical documentation
On Fri, Oct 24, 2014 at 7:42 PM, Robert Haas wrote:
> [rhaas pgsql]$ patch -p1 < ~/Downloads/20141023_pg_recvlogical_fixes.patch
> patching file doc/src/sgml/ref/pg_recvlogical.sgml
> Hunk #1 succeeded at 270 with fuzz 1 (offset 165 lines).
> Hunk #2 FAILED at 282.
> Hunk #3 FAILED at 295.
> Hunk #4 FAILED at 308.
> 3 out of 4 hunks FAILED -- saving rejects to file
> doc/src/sgml/ref/pg_recvlogical.sgml.rej
Sorry, I did not rebase much these days, there were conflicts with 52c1ae2... -- Michael
*** a/doc/src/sgml/ref/pg_recvlogical.sgml
--- b/doc/src/sgml/ref/pg_recvlogical.sgml
***************
*** 230,236 ****
  PostgreSQL documentation
  -d database
! --dbname database
  The database to connect to. See the description of the actions for
--- 230,236 ----
  -d database
! --dbname=database
  The database to connect to. See the description of the actions for
***************
*** 243,249 ****
  PostgreSQL documentation
  -h hostname-or-ip
! --host hostname-or-ip
  Specifies the host name of the machine on which the server is
--- 243,249 ----
  -h hostname-or-ip
! --host=hostname-or-ip
  Specifies the host name of the machine on which the server is
***************
*** 257,263 ****
  PostgreSQL documentation
  -p port
! --port port
  Specifies the TCP port or local Unix domain socket file
--- 257,263 ----
  -p port
! --port=port
  Specifies the TCP port or local Unix domain socket file
***************
*** 270,276 ****
  PostgreSQL documentation
  -U user
! --username user
  Username to connect as. Defaults to current operating system user
--- 270,276 ----
  -U user
! --username=user
  Username to connect as. Defaults to current operating system user
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On Fri, Oct 24, 2014 at 4:53 PM, Jim Nasby wrote: > The only case I can think of would be actually connecting to a remote > database; in that case would we even want something as raw as this? I > suspect not, in which case I don't see an issue. On the other hand, if we > ever think we might want to do that, we should probably at least stick a > version number field in there... > > But my suspicion is if we ever wanted to do something more with this then > we'd want some kind of text-based format that could be passed into a SQL > command (ie: SET ENVIRONMENT TO blah;) I mean, I don't think this is much different than what we're already doing to transfer variables from the postmaster to other backends in EXEC_BACKEND builds; see write_nondefault_variables(). It doesn't have a version number or anything either. I don't see a reason why this code needs to be held to a different standard; the purpose is fundamentally quite similar. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby wrote: > On 10/24/14, 12:21 PM, Robert Haas wrote: >> - What should we call dsm_unkeep_mapping, if not that? > > Only option I can think of beyond unkeep would be > dsm_(un)register_keep_mapping. Dunno that it's worth it. Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(), since it's arranging to keep it by unregistering it from the resource owner. And then we could call the new function dsm_register_mapping(). That has the appeal that "unregister" is a word, whereas "unkeep" isn't, but it's a little confusing otherwise, because the sense is reversed vs. the current naming. Or we could just leave dsm_keep_mapping() alone and call the new function dsm_register_mapping(). A little non-orthogonal, but I think it'd be OK. >> - Does anyone have a tangible suggestion for how to reduce the code >> duplication in patch #6? > > Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in > exec_simple that's not safe for a bgworker? I'm not seeing why we can't use > exec_simple. :/ There are some differences if you compare them closely. > BTW, it does occur to me that we could do something to combine > AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo(). Meh. > pg_background_result() > + dsm_unkeep_mapping(info->seg); > + > + /* Set up tuple-descriptor based on column definition list. > */ > + if (get_call_result_type(fcinfo, NULL, &tupdesc) != > TYPEFUNC_COMPOSITE) > + ereport(ERROR, > Is there a reason we can't check the result type before unkeeping? Seems > silly to throw the results away just because someone flubbed a function > call... Hmm, yeah, I see no reason why we couldn't move that up higher in the function. It's probably a pretty common failure mode, too, so I can see the convenience factor there. > + default: > + elog(WARNING, "unknown message type: %c (%zu > bytes)", > + msg.data[0], nbytes); > It'd be useful to also provide DEBUG output with the message itself... 
I think that's more work than is justified. We'd have to hexdump it or something because it might not be valid in the client encoding, and it's a case that shouldn't happen anyway. (Hmm, that reminds me that I haven't thought of a solution to the problem that the background worker might try to SET client_encoding, which would be bad. Not sure what the best fix for that is, off-hand. I'd rather not prohibit ALL GUC changes in the background worker, as there's no good reason for such a draconian restriction.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On 10/24/14, 3:26 PM, Robert Haas wrote: On Fri, Oct 24, 2014 at 3:30 PM, Jim Nasby wrote: On 10/24/14, 2:23 PM, Jim Nasby wrote: On the serialization structure itself, should we be worried about a mismatch between available GUCs on the sender vs the receiver? Presumably if the sender outputs a GUC that the receiver doesn't know about we'll get an error, but what if the sender didn't include something? Maybe not an issue today, but could this cause problems down the road if we wanted to use the serialized data some other way? But maybe I'm just being paranoid. :) I just realized there's a bigger problem there; this isn't portable against any changes to any of the binary elements. So I guess it's really a question of would we ever want this to function (as-is) cross-version. I think that would be pretty hard to make work, but I don't mind if someone else wants to try for some use case that they want to meet. My goal is to make parallel query work, so the data will just be getting transferred between two simultaneously-running children of the same postmaster. The only case I can think of would be actually connecting to a remote database; in that case would we even want something as raw as this? I suspect not, in which case I don't see an issue. On the other hand, if we ever think we might want to do that, we should probably at least stick a version number field in there... But my suspicion is if we ever wanted to do something more with this then we'd want some kind of text-based format that could be passed into a SQL command (ie: SET ENVIRONMENT TO blah;) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Question about RI checks
Florian Pflug wrote: > On Oct24, 2014, at 20:24 , Robert Haas wrote: >> On Fri, Oct 24, 2014 at 2:12 PM, Florian Pflug wrote: What about doing one scan using SnapshotAny and then testing each returned row for visibility under both relevant snapshots? See whether there is any tuple for which they disagree. >>> >>> See my other mail - testing whether the snapshots agree isn't enough, >>> you'd have to check whether there could have been *any* snapshot taken >>> between the two which would see a different result. >> >> Oh, hmm. I had thought what I was proposing was strong enough to >> handle that case, but now I see that it isn't. However, I'm not >> entirely sure that it's the RI code's job to prevent such cases, or at >> least not when the transaction isolation level is less than >> serializable. > > Well, the responsibility was laid onto the RI code by the decision to > not do SIREAD locking for RI enforcement queries. Simply enabling SIREAD > locking without doing anything else is not a good solution, I think - > it will drive up the false-positive rate. And it would make workloads > consisting purely of serializable transactions pay the price for the > row-level locking done by the RI triggers, without getting anything in > return. Yeah, it would be far better to fix RI behavior if we can; otherwise those using serializable transactions to protect data integrity essentially pay the price for concurrency control around foreign keys twice -- once using blocking locks and again using non-blocking SIRead locks just to cover some odd corner cases. I need to spend some more time looking at it, and I have another couple things in front of this on my personal TODO list, but I think that if we had a row lock which was stronger than current SELECT FOR UPDATE behavior, and the delete of a parent row (or update of its referenced key) read the children using it, this would be solved. 
Essentially I'm thinking of something just like FOR UPDATE except that a transaction which attempted a concurrent UPDATE or DELETE of the row (before the locking transaction ended) would error out with some form of serialization failure. I believe this would be consistent with the behavior of SELECT FOR UPDATE in Oracle, so it would allow a path for those migrating from Oracle to maintain their current logic (with a slight change to the FOR strength clause). >> Is there an argument that the anomaly that results is >> unacceptable at REPEATABLE READ? > > I'm not aware of any case that is clearly unacceptable for REPEATABLE > READ, but neither am I convinced that no such case exists. REPEATABLE READ > mode is hard because there doesn't seem to be any formal definition of > what we do and do not guarantee. The guarantees provided by the SQL > standard seem to be so much weaker than what we offer that they're not > really helpful :-( > > I believe the best way forward is to first find a solution for SERIALIZABLE > transactions, and then check if it can be applied to REPEATABLE READ > mode too. For SERIALIZABLE mode, it's at least clear what we're aiming > for -- offering true serializability. What I outline above would clearly work for both. Arguably we could just use FOR UPDATE in REPEATABLE READ and only use the new, stronger lock for SERIALIZABLE. In fact, that seems to me to be the appropriate course. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On 10/24/14, 12:21 PM, Robert Haas wrote: - What should we call dsm_unkeep_mapping, if not that? Only option I can think of beyond unkeep would be dsm_(un)register_keep_mapping. Dunno that it's worth it. - Does anyone have a tangible suggestion for how to reduce the code duplication in patch #6? Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in exec_simple that's not safe for a bgworker? I'm not seeing why we can't use exec_simple. :/ BTW, it does occur to me that we could do something to combine AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo(). pg_background_result() + dsm_unkeep_mapping(info->seg); + + /* Set up tuple-descriptor based on column definition list. */ + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + ereport(ERROR, Is there a reason we can't check the result type before unkeeping? Seems silly to throw the results away just because someone flubbed a function call... + default: + elog(WARNING, "unknown message type: %c (%zu bytes)", + msg.data[0], nbytes); It'd be useful to also provide DEBUG output with the message itself... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Question about RI checks
On Fri, Oct 24, 2014 at 2:58 PM, Florian Pflug wrote: > I believe the best way forward is to first find a solution for SERIALIZABLE > transactions, and then check if it can be applied to REPEATABLE READ > mode too. For SERIALIZABLE mode, it's at least clear what we're aiming > for -- offering true serializability. That sounds like a reasonable plan. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On Fri, Oct 24, 2014 at 3:30 PM, Jim Nasby wrote: > On 10/24/14, 2:23 PM, Jim Nasby wrote: >> On the serialization structure itself, should we be worried about a >> mismatch between available GUCs on the sender vs the receiver? Presumably if >> the sender outputs a GUC that the receiver doesn't know about we'll get an >> error, but what if the sender didn't include something? Maybe not an issue >> today, but could this cause problems down the road if we wanted to use the >> serialized data some other way? But maybe I'm just being paranoid. :) > > I just realized there's a bigger problem there; this isn't portable against > any changes to any of the binary elements. > > So I guess it's really a question of would we ever want this to function > (as-is) cross-version. I think that would be pretty hard to make work, but I don't mind if someone else wants to try for some use case that they want to meet. My goal is to make parallel query work, so the data will just be getting transferred between two simultaneously-running children of the same postmaster. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On Fri, Oct 24, 2014 at 3:23 PM, Jim Nasby wrote: > Here's a review of patch 4. Thanks! > Perhaps it would be good to document the serialization format. I at least > would have found it helpful, especially when reading > estimate_variable_size(). I can take a stab at that if you'd rather not be > bothered. I find the existing code pretty clear: to my eyes, serialize_variable() really couldn't be much more transparent. But if you want to suggest something I'm all ears. > estimate_variable_size(): > + valsize = 5;/* max(strlen('true'), > strlen('false')) */ > ... > + if (abs(*conf->variable) < 1000) > + valsize = 3 + 1; > > If we're worried about size, perhaps we should use 1/0 for bool instead of > true/false? I guess we could. I'm not particularly worried about size, myself. We need all the rigamarole about estimating the amount of space needed because we can't allocate more space once we begin serializing. If the DSM where this data is getting stored turns out not to be big enough, we can't resize it. So we need to *know* how big the data is going to be, but I don't think we need to *care* very much. > On the serialization structure itself, should we be worried about a mismatch > between available GUCs on the sender vs the receiver? Presumably if the > sender outputs a GUC that the receiver doesn't know about we'll get an > error, but what if the sender didn't include something? Maybe not an issue > today, but could this cause problems down the road if we wanted to use the > serialized data some other way? But maybe I'm just being paranoid. :) I think the principal danger here is that there could be some loadable module which is present in one backend but not the other, or which (in its inimitable wisdom) chose to register a different set of GUCs or valid values for those GUCs in one backend than the other. That would indeed be sad. 
Given that our rules for registering GUCs resemble nothing so much as the wild west, I don't believe there's any ironclad way to defend against it, either. One thing we could do - not in this patch - is provide a mechanism that would cause a background worker to load every loadable module that's present in the launching worker. I'm a little reluctant to spend time on that now. If pg_background doesn't work with certain combinations of loadable modules and GUC settings... it's not a catastrophe. It might be more of an issue for parallel query proper, because expectations may be higher there, but if so it won't hurt to have some experience seeing whether things go wrong with this simple system, and in what manner. At least IMHO, it doesn't seem like a good idea to put fiddling with the edges of this ahead of other parallelism-related projects. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Fri, Oct 24, 2014 at 3:01 PM, Peter Geoghegan wrote: > This is the situation with unique index inference all over again. You > don't like a function-like expression. Okay. It would be a lot more > helpful if instead of just criticizing, you *also* looked at my > reasons for not wanting to go with something that would necessitate > adding a dummy range table entry [1]. The problem here isn't that I haven't read your emails. I have read them all, including that one. Moreover, this isn't the first time you've asserted that someone hasn't read one of your emails. So if we're going to say what we each think would be helpful, then I think it would be helpful if you stopped accusing the people who are taking time to provide feedback on your work of having failed to read your emails. It's possible that there may be instances where that problem exists, but it beggars credulity to suppose that the repeated unanimous consensus against some of your design decisions is entirely an artifact of failure to pay attention. The fact is, I don't feel obliged to respond to every one of your emails, just as you don't respond to every one of mine. If you want this patch to ever get committed, it's your job to push it forward - not mine, not Simon's, and not Heikki's. Sometimes that means you have to solve hard problems instead of just articulating what they are. > There are some very good > practical reasons for avoiding that. We only do that (the OLD.* and > NEW.* stuff) in the context of CREATE RULE and one or two other > things. We don't play any games like that during parse analysis of > ordinary optimizable statements, which is what this is. And those > dummy RTEs are always placeholders, AFAICT. Apart from those technical > reasons, the two situations just aren't comparable from a user-visible > perspective, but that isn't the main problem. > > I don't particularly expect you to come around to my view here. 
But it > would be nice to have the problems with dummy RTEs and so on > acknowledged, so it's understood that that is in itself a strange new > direction for parse analysis of ordinary optimizable statements. You're conflating the user-visible syntax with the parse tree representation in way that is utterly without foundation. I don't have a position at this point on which parse-analysis representation is preferable, but it's completely wrong-headed to say that you "can't" make NEW.x produce the same parse-analysis output that your CONFLICTING(x) syntax would have created. Sure, it might be harder, but it's not that much harder, and it's definitely not an unsolvable problem. I do acknowledge that there might be a better syntax out there than NEW.x and OLD.x. I have not seen one proposed that I like better. Feel free to propose something. But don't keep re-proposing something that LOOKSLIKE(this) because nobody - other than perhaps you - likes that. And don't use the difficulty of parse analysis as a justification for your proposed syntax, because, except in extreme cases, there are going to be very few if any regular contributors to this mailing list who will accept that as a justification for one syntax over another. Syntax needs to be justified by being beautiful, elegant, precedent-driven, orthogonal, and minimalist - not by whether you might need an extra 25-75 lines of parse analysis code to make it work. >>> Unique index inference (i.e. the way we figure out *which* unique >>> index to use) occurs during parse analysis. I think it would be >>> inappropriate, and certainly inconvenient to do it during planning. >> >> You're wrong. 
The choice of which index to use is clearly wildly >> inappropriately placed in the parse analysis phase, and if you think >> that has any chance of ever being committed by anyone, then you are >> presuming the existence of a committer who won't mind ignoring the >> almost-immediate revert request that would draw from, at the very >> least, Tom. > > Why? This has nothing to do with optimization. That is false. If there is more than one index that could be used, the system should select the best one. That is an optimization decision per se. Also, if a plan is saved - in the plancache, say, or in a view - the query can be re-planned if the index it depends on is dropped, but there's no way to do parse analysis. >> As much as I'd like to have this feature, your refusal to change >> anything except when asked at least three times each by about five >> different people makes this effort barely worth pursuing. You can say >> all you like that you're receptive to feedback, but multiple people >> here are telling you that they feel otherwise. > > I'm tired of your assertions about why I haven't done certain things. > It's not fair. I have incorporated a lot of feedback into V1.3 (and > previous versions), which isn't acknowledged. Also, I moved to the USA > this week, and things have been hectic. As I said in relation to > unique index inference, please consider that
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On 10/24/14, 2:23 PM, Jim Nasby wrote:
> On the serialization structure itself, should we be worried about a
> mismatch between available GUCs on the sender vs the receiver? Presumably if
> the sender outputs a GUC that the receiver doesn't know about we'll get an
> error, but what if the sender didn't include something? Maybe not an issue
> today, but could this cause problems down the road if we wanted to use the
> serialized data some other way? But maybe I'm just being paranoid. :)

I just realized there's a bigger problem there; this isn't portable against any changes to any of the binary elements. So I guess it's really a question of would we ever want this to function (as-is) cross-version.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On 10/24/14, 12:21 PM, Robert Haas wrote:
> On Fri, Oct 24, 2014 at 1:13 PM, Jim Nasby wrote:
>> It's a valid concern, but I think the way to handle it if needed is to limit
>> the number of connections a user can open. Or perhaps another option would
>> be to change the permissions on the related functions (do we check ACLs for
>> internal functions?)
>
> I'm not sure dump-and-restore would preserve any properties of anything
> in pg_catalog. Anyway, I think we're getting a bit ahead of ourselves
> here. The questions I need answers to right now are:
>
> - What should we call dsm_unkeep_mapping, if not that?
> - Are there remaining complaints about patch #3?
> - How can I get somebody to review patch #4?

Here's a review of patch 4.

Perhaps it would be good to document the serialization format. I at least would have found it helpful, especially when reading estimate_variable_size(). I can take a stab at that if you'd rather not be bothered.

estimate_variable_size():
+ valsize = 5;    /* max(strlen('true'), strlen('false')) */
...
+ if (abs(*conf->variable) < 1000)
+     valsize = 3 + 1;

If we're worried about size, perhaps we should use 1/0 for bool instead of true/false?

On the serialization structure itself, should we be worried about a mismatch between available GUCs on the sender vs the receiver? Presumably if the sender outputs a GUC that the receiver doesn't know about we'll get an error, but what if the sender didn't include something? Maybe not an issue today, but could this cause problems down the road if we wanted to use the serialized data some other way? But maybe I'm just being paranoid. :)

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Fri, Oct 24, 2014 at 10:04 AM, Robert Haas wrote: > On Thu, Oct 23, 2014 at 9:43 PM, Peter Geoghegan wrote: >> * CONFLICTING() is renamed to EXCLUDED(). I know that some people >> wanted me to go a different way with this. I think that there are very >> good practical reasons not to [1], as well as good reasons related to >> design, but I still accept that CONFLICTING() isn't very descriptive. >> This spelling seems a lot better. > > Specifically, "some people" included at least three committers and at > least one other community member no less prominent than yourself, or > in other words, every single person who bothered to comment. You can > think whatever you like; the chances of it being committed that way > are zero. This is the situation with unique index inference all over again. You don't like a function-like expression. Okay. It would be a lot more helpful if instead of just criticizing, you *also* looked at my reasons for not wanting to go with something that would necessitate adding a dummy range table entry [1]. There are some very good practical reasons for avoiding that. We only do that (the OLD.* and NEW.* stuff) in the context of CREATE RULE and one or two other things. We don't play any games like that during parse analysis of ordinary optimizable statements, which is what this is. And those dummy RTEs are always placeholders, AFAICT. Apart from those technical reasons, the two situations just aren't comparable from a user-visible perspective, but that isn't the main problem. I don't particularly expect you to come around to my view here. But it would be nice to have the problems with dummy RTEs and so on acknowledged, so it's understood that that is in itself a strange new direction for parse analysis of ordinary optimizable statements. >> Unique index inference (i.e. the way we figure out *which* unique >> index to use) occurs during parse analysis. I think it would be >> inappropriate, and certainly inconvenient to do it during planning. 
> > You're wrong. The choice of which index to use is clearly wildly > inappropriately placed in the parse analysis phase, and if you think > that has any chance of ever being committed by anyone, then you are > presuming the existence of a committer who won't mind ignoring the > almost-immediate revert request that would draw from, at the very > least, Tom. Why? This has nothing to do with optimization. You either have an appropriate unique index available, in which case you can proceed (and the cost associated with value locking the index and so on is an inherent cost, kind of like inserting into an index in general). Or you don't, and you can't (you get an error, not a sequential scan). That's what I'd call a semantic dependence on the index. Yes, that's kind of unusual, but it's the reality. I can have the code do unique index inference during planning instead if you insist, but I think that that isn't useful. If you want me to do things that way, please explain why. Otherwise, even if I simply acquiesce to your wishes, I may end up missing the point. > As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE > syntax proposed upthread was the best of any mentioned thus far. The > MERGE-based syntaxes proposed upthread were crazily verbose for no > discernable benefit. I don't think anyone is too insistent on pursuing that. I think that the INSERT .. ON CONFLICT UPDATE syntax has more or less emerged as the favorite. > As much as I'd like to have this feature, your refusal to change > anything except when asked at least three times each by about five > different people makes this effort barely worth pursuing. You can say > all you like that you're receptive to feedback, but multiple people > here are telling you that they feel otherwise. I'm tired of your assertions about why I haven't done certain things. It's not fair. I have incorporated a lot of feedback into V1.3 (and previous versions), which isn't acknowledged. 
Also, I moved to the USA this week, and things have been hectic. As I said in relation to unique index inference, please consider that I haven't immediately complied with that feedback because there are genuine technical obstacles that at the very least make it more difficult than it appears. And, I'm not necessarily able to comply quickly because of time constraints. [1] http://www.postgresql.org/message-id/cam3swzqhixqi1ost14v7spjqrupmcnrtbxje846-eb1bc+9...@mail.gmail.com -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Question about RI checks
On Oct24, 2014, at 20:24 , Robert Haas wrote:
> On Fri, Oct 24, 2014 at 2:12 PM, Florian Pflug wrote:
>>> What about doing one scan using SnapshotAny and then testing each
>>> returned row for visibility under both relevant snapshots? See
>>> whether there is any tuple for which they disagree.
>>
>> See my other mail - testing whether the snapshots agree isn't enough,
>> you'd have to check whether there could have been *any* snapshot taken
>> between the two which would see a different result.
>
> Oh, hmm. I had thought what I was proposing was strong enough to
> handle that case, but now I see that it isn't. However, I'm not
> entirely sure that it's the RI code's job to prevent such cases, or at
> least not when the transaction isolation level is less than
> serializable.

Well, the responsibility was laid onto the RI code by the decision to not do SIREAD locking for RI enforcement queries. Simply enabling SIREAD locking without doing anything else is not a good solution, I think - it will drive up the false-positive rate. And it would make workloads consisting purely of serializable transactions pay the price for the row-level locking done by the RI triggers, without getting anything in return.

> Is there an argument that the anomaly that results is
> unacceptable at REPEATABLE READ?

I'm not aware of any case that is clearly unacceptable for REPEATABLE READ, but neither am I convinced that no such case exists. REPEATABLE READ mode is hard because there doesn't seem to be any formal definition of what we do and do not guarantee. The guarantees provided by the SQL standard seem to be so much weaker than what we offer that they're not really helpful :-(

I believe the best way forward is to first find a solution for SERIALIZABLE transactions, and then check if it can be applied to REPEATABLE READ mode too. For SERIALIZABLE mode, it's at least clear what we're aiming for -- offering true serializability.
best regards,
Florian Pflug
Re: [HACKERS] Function array_agg(array)
Hi Ali

I checked the code. I think the code organization is not good: accumArrayResult is too long now, and makeMdArrayResult will not work when arrays were joined (it is not consistent now). I don't like the usage of state->is_array_accum in array_userfunc.c -- it is a signal of wrong wrapping.

Next question: there is a function array_append(anyarray, anyelement). Isn't it time to define array_append(anyarray, anyarray) now?

Regards

Pavel

2014-10-24 15:05 GMT+02:00 Pavel Stehule :
>
> 2014-10-24 13:58 GMT+02:00 Pavel Stehule :
>>
>> 2014-10-24 11:43 GMT+02:00 Ali Akbar :
>>>
>>> 2014-10-24 16:26 GMT+07:00 Pavel Stehule :
>>> Hi
>>> some in last patch is wrong, I cannot to compile it:
>>>
>>> arrayfuncs.c: In function ‘accumArrayResult’:
>>> arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>   astate->alen = 64; /* arbitrary starting array size */
>>> arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named ‘dvalues’
>>>   astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>>> arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>   astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>>> arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named ‘dnulls’
>>>   astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>> arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>   astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>> arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named ‘nelems’
>>>   astate->nelems = 0;
>>> arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named ‘nelems’
>>>   if (astate->nelems >= astate->alen)
>>> arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>   if (astate->nelems >= astate->alen)
>>> arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>   astate->alen *= 2;
>>>
>>> Sorry, correct patch attached.
>>>
>>> This patch is in patience format (git --patience ..). In previous
>>> patches, i use context format (git --patience ... | filterdiff
>>> --format=context), but it turns out that some modification is lost.
>>
>> last version is compileable, but some is still broken
>>
>> postgres=# select array_agg(array[i, i+1, i-1])
>> from generate_series(1,2) a(i);
>> ERROR: could not find array type for data type integer[]
>
> I am sorry, it works - I had a problem with broken database
>
> I fixed small issue in regress tests and I enhanced tests for varlena
> types and null values.
>
> Regards
>
> Pavel
>
>> but array(subselect) works
>>
>> postgres=# select array(select a from xx);
>>    array
>> ---
>>  {{1,2,3},{1,2,3}}
>> (1 row)
>>
>> Regards
>>
>> Pavel
>>
>>> --
>>> Ali Akbar
Re: [HACKERS] Question about RI checks
On Fri, Oct 24, 2014 at 2:12 PM, Florian Pflug wrote:
>> What about doing one scan using SnapshotAny and then testing each
>> returned row for visibility under both relevant snapshots? See
>> whether there is any tuple for which they disagree.
>
> See my other mail - testing whether the snapshots agree isn't enough,
> you'd have to check whether there could have been *any* snapshot taken
> between the two which would see a different result.

Oh, hmm. I had thought what I was proposing was strong enough to handle that case, but now I see that it isn't. However, I'm not entirely sure that it's the RI code's job to prevent such cases, or at least not when the transaction isolation level is less than serializable. Is there an argument that the anomaly that results is unacceptable at REPEATABLE READ?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] How ugly would this be? (ALTER DATABASE)
On Fri, Oct 24, 2014 at 2:06 PM, Joshua D. Drake wrote:
> One of the things we run into quite a bit is customers who are using
> multiple databases when they should be using multiple schemas. I am sure
> other consultants run into this as well. This gets even more difficult as
> uptime requirements have become all but 100%. So my question is, what would
> this take?
>
> ALTER DATABASE foo LOCATION DATABASE bar SCHEMA baz?
>
> Where if we execute that command, database foo would move to schema baz
> within database bar?
>
> I am fully aware of what it takes on the client side but structurally within
> postgres what would it take? Is it even reasonable?

What if the database contains more than one schema? You could perhaps try to create a command that would move a schema between two databases in the same cluster. It's fraught with practical difficulties because a single backend can't be connected to both databases at the same time, so how exactly do you make the required catalog changes all in a single transaction? But if you imagine that you have an infinite pool of top-notch PostgreSQL talent with unbounded time to work on this problem and no other, I bet somebody could engineer a solution.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Question about RI checks
On Oct24, 2014, at 19:32 , Robert Haas wrote:
> On Fri, Oct 24, 2014 at 1:28 PM, Florian Pflug wrote:
>> The only other option I see would be to teach the executor to check
>> whether *any* snapshot between the transaction's snapshot and a current
>> snapshot would see a different set of rows. Simply checking whether both
>> the current snapshot and the transaction's snapshot see the same set isn't
>> sufficient, per the counter-example in my other mail :-(
>
> What about doing one scan using SnapshotAny and then testing each
> returned row for visibility under both relevant snapshots? See
> whether there is any tuple for which they disagree.

See my other mail - testing whether the snapshots agree isn't enough; you'd have to check whether there could have been *any* snapshot taken between the two which would see a different result. Or at least that's what I currently think - I don't yet have an actual proof at hand that doing this always preserves serializability. But it seems plausible that it does, because it ensures that there is a rather large time frame during which the enforcement query can be placed in a serial schedule without changing its result.

This applies only to SERIALIZABLE transactions, though. For REPEATABLE READ, it might be that checking whether the two snapshots agree is sufficient. So a third way forward would be to not skip the SSI logic for RI enforcement queries. Though I fear that, due to the imprecise nature of SIREAD locks, doing so would drive the false-positive rate way up for workloads involving lots of parent and child row modifications.

best regards,
Florian Pflug
[HACKERS] How ugly would this be? (ALTER DATABASE)
Hello,

One of the things we run into quite a bit is customers who are using multiple databases when they should be using multiple schemas. I am sure other consultants run into this as well. This gets even more difficult as uptime requirements have become all but 100%. So my question is, what would this take?

ALTER DATABASE foo LOCATION DATABASE bar SCHEMA baz?

Where if we execute that command, database foo would move to schema baz within database bar?

I am fully aware of what it takes on the client side but structurally within postgres what would it take? Is it even reasonable?

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should not be surprised when they come back as Romans."
Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
On Fri, Oct 24, 2014 at 9:05 AM, Heikki Linnakangas wrote:
> So, this is what I came up with for master. Does anyone see a problem with
> it?

In the proposed commit message, you mis-spelled "significant" as "signicant".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Materialized views don't show up in information_schema
On Fri, Oct 17, 2014 at 8:10 PM, Stephen Frost wrote: > * Peter Eisentraut (pete...@gmx.net) wrote: >> On 10/16/14 9:45 AM, Stephen Frost wrote: >> > Alright, coming back to this, I have to ask- how are matviews different >> > from views from the SQL standard's perspective? I tried looking through >> > the standard to figure it out (and I admit that I probably missed >> > something), but the only thing appears to be a statement in the standard >> > that (paraphrased) "functions are run with the view is queried" and that >> > strikes me as a relatively minor point.. >> >> To me, the main criterion is that you cannot DROP VIEW a materialized view. > > That is an entirely correctable situation. We don't require 'DROP > UNLOGGED TABLE'. I think that's an inapposite comparison. The fact that a table is unlogged is merely a property of the table; it does not change the fact that it is a table. A materialized view, on the other hand, is different kind of object from a view. This manifests itself the fact that it's represented by a different relkind; and that different syntax is used not only for DROP but also for COMMENT, ALTER VIEW, SECURITY LABEL, and ALTER EXTENSION .. ADD/DROP; and that the set of supported operations on a materialized view is different from a regular view (and will probably be more different in the future). If we really want to change this, we can't just change DROP VIEW; we need to change all of the places in a consistent fashion, and we probably have to continue to support the old syntax so that we don't break existing dumps. But I think it's the wrong thing anyway, because it presumes that, when Kevin chose to make materialized views a different relkind and a different object type, rather than just a property of an object, he made the wrong call, and I don't agree with that. I think he got it exactly right. A materialized view is really much more like a table than a view: it has storage and can be vacuumed, clustered, analyzed, and so on. 
That's far more significant IMV than the difference between a table and an unlogged table.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Typo fixes for pg_recvlogical documentation
On Thu, Oct 23, 2014 at 10:00 AM, Michael Paquier wrote:
> pg_recvlogical is missing some "=" signs for a couple of option names where
> double-dash is used, like this one:
> --username user
> should be that:
> --username=user
>
> Attached is a patch correcting that.

For reasons that are not clear to me, I cannot apply this patch.

[rhaas pgsql]$ patch -p1 < ~/Downloads/20141023_pg_recvlogical_fixes.patch
patching file doc/src/sgml/ref/pg_recvlogical.sgml
Hunk #1 succeeded at 270 with fuzz 1 (offset 165 lines).
Hunk #2 FAILED at 282.
Hunk #3 FAILED at 295.
Hunk #4 FAILED at 308.
3 out of 4 hunks FAILED -- saving rejects to file doc/src/sgml/ref/pg_recvlogical.sgml.rej

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Question about RI checks
On Fri, Oct 24, 2014 at 1:28 PM, Florian Pflug wrote:
> The only other option I see would be to teach the executor to check
> whether *any* snapshot between the transaction's snapshot and a current
> snapshot would see a different set of rows. Simply checking whether both
> the current snapshot and the transaction's snapshot see the same set isn't
> sufficient, per the counter-example in my other mail :-(

What about doing one scan using SnapshotAny and then testing each returned row for visibility under both relevant snapshots? See whether there is any tuple for which they disagree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Fri, Oct 24, 2014 at 1:18 PM, Josh Berkus wrote:
> On 10/24/2014 10:04 AM, Robert Haas wrote:
>> As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
>> syntax proposed upthread was the best of any mentioned thus far. The
>> MERGE-based syntaxes proposed upthread were crazily verbose for no
>> discernable benefit.
>
> For those of us who haven't followed every post in this thread, is there
> somewhere I can see the proposed syntax?

There are a couple of different proposals but this should give you a feeling for where we're at:

http://www.postgresql.org/message-id/CA+TgmoZN=2ajki1n4jz5bkmyi8r_cpudw+dtoppmtelvmso...@mail.gmail.com

The part I like is the idea of making UPSERT look like an INSERT statement with an additional clause that says how a unique violation should be handled. The part nobody except Peter likes is using functional notation like CONFLICTING() or EXCLUDED() to pull in values from the tuple causing the unique violation. And there are some other areas of disagreement about particular keywords and so on. But I personally like that general style much more than the alternatives derived from MERGE.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
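For readers in Josh's position, the general shape being discussed can be assembled from the descriptions in this thread. This is only a reading of the proposals - not a final or committed syntax - and the table and columns are invented for illustration:

```sql
-- Hypothetical illustration of the style Robert describes: an INSERT with
-- a clause saying how a unique violation should be handled.  EXCLUDED(col)
-- is Peter's function-like spelling for pulling values from the tuple that
-- caused the conflict; the competing proposal spells it NEW.col / OLD.col.
INSERT INTO tab (key, val)
VALUES (1, 'foo')
ON CONFLICT UPDATE SET val = EXCLUDED(val);
```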
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On Fri, Oct 24, 2014 at 1:13 PM, Jim Nasby wrote:
> It's a valid concern, but I think the way to handle it if needed is to limit
> the number of connections a user can open. Or perhaps another option would
> be to change the permissions on the related functions (do we check ACLs for
> internal functions?)

I'm not sure dump-and-restore would preserve any properties of anything in pg_catalog. Anyway, I think we're getting a bit ahead of ourselves here. The questions I need answers to right now are:

- What should we call dsm_unkeep_mapping, if not that?
- Are there remaining complaints about patch #3?
- How can I get somebody to review patch #4?
- Does anyone have a tangible suggestion for how to reduce the code duplication in patch #6?

The question of where pg_background should ultimately live does matter, but the answer will be "the -hackers mailing list archives" unless we can get agreement on the prerequisite patches.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Question about RI checks
On Oct24, 2014, at 18:42 , Robert Haas wrote:
> I don't think you can count on being able to figure out all of the
> recent lockers by looking at xmax; it can get overwritten. For
> example, suppose transaction A locks the row and then commits. Then
> transaction B comes along and again locks the row and commits. My
> understanding is that B won't create a multi-xact since there's just
> one locker; it'll just stomp on the previous xmax.
>
> Now, you could change that, but I suspect it would have pretty drastic
> implications on systems that have both foreign keys and long-running
> transactions.

Yes, we'd have to create multi-xids in some cases where we don't do that today. How big the effect would be, I honestly don't know. I guess it depends on how costly multi-xid creation is, compared to updating the parent row (to change its xmax) and updating or deleting child rows. My hope would be that it's cheap compared to that, but maybe that's not true, especially since multi-xids are xlog'ed nowadays.

The only other option I see would be to teach the executor to check whether *any* snapshot between the transaction's snapshot and a current snapshot would see a different set of rows. Simply checking whether both the current snapshot and the transaction's snapshot see the same set isn't sufficient, per the counter-example in my other mail :-(

An advantage of the latter is that it'd put the burden on the session that attempts to remove a parent row, instead of on the sessions which add or remove children.

best regards,
Florian Pflug
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On 10/24/2014 10:04 AM, Robert Haas wrote: > As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE > syntax proposed upthread was the best of any mentioned thus far. The > MERGE-based syntaxes proposed upthread were crazily verbose for no > discernable benefit. For those of us who haven't followed every post in this thread, is there somewhere I can see the proposed syntax? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On 10/22/14, 7:49 PM, Joshua D. Drake wrote: I lean toward a push into core because: +1 3. I am not sure I buy into the super-user argument. Just because the functionality is there, doesn't mean it has to be used. It's a valid concern, but I think the way to handle it if needed is to limit the number of connections a user can open. Or perhaps another option would be to change the permissions on the related functions (do we check ACLs for internal functions?) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Thu, Oct 23, 2014 at 9:43 PM, Peter Geoghegan wrote: > * CONFLICTING() is renamed to EXCLUDED(). I know that some people > wanted me to go a different way with this. I think that there are very > good practical reasons not to [1], as well as good reasons related to > design, but I still accept that CONFLICTING() isn't very descriptive. > This spelling seems a lot better. Specifically, "some people" included at least three committers and at least one other community member no less prominent than yourself, or in other words, every single person who bothered to comment. You can think whatever you like; the chances of it being committed that way are zero. > Unique index inference (i.e. the way we figure out *which* unique > index to use) occurs during parse analysis. I think it would be > inappropriate, and certainly inconvenient to do it during planning. You're wrong. The choice of which index to use is clearly wildly inappropriately placed in the parse analysis phase, and if you think that has any chance of ever being committed by anyone, then you are presuming the existence of a committer who won't mind ignoring the almost-immediate revert request that would draw from, at the very least, Tom. As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE syntax proposed upthread was the best of any mentioned thus far. The MERGE-based syntaxes proposed upthread were crazily verbose for no discernable benefit. As much as I'd like to have this feature, your refusal to change anything except when asked at least three times each by about five different people makes this effort barely worth pursuing. You can say all you like that you're receptive to feedback, but multiple people here are telling you that they feel otherwise. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
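For readers who, like Josh downthread, have not followed the whole thread: a rough sketch of the INSERT .. ON CONFLICT forms being debated. Table and column names are invented, and the exact spelling was still in flux at this point in the discussion, so treat this as an illustration rather than final syntax:

```sql
-- Insert, or update the existing row on a unique-index conflict
INSERT INTO upsert_demo (key, val)
VALUES (1, 'foo')
ON CONFLICT UPDATE SET val = 'foo';

-- Insert, silently skipping rows that would conflict
INSERT INTO upsert_demo (key, val)
VALUES (1, 'foo')
ON CONFLICT IGNORE;
```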
Re: [HACKERS] Question about RI checks
On Thu, Oct 23, 2014 at 2:41 PM, Florian Pflug wrote: > The only reason we need the crosscheck snapshot to do that is because > children may have been added (and the change committed) *after* the > transaction which removed the parent has taken its snapshot, but *before* > that transaction locks the parent row. > > My proposal is to instead extend the locking protocol to prevent that. > Essentially, we have to raise a serialization error whenever > > 1) We attempt to exclusively lock a row (this includes updating or deleting > it), and > > 2) somebody else did hold a share lock on that row recently, and > > 3) That somebody is invisible to us according to our snapshot. > > My initial attempt to do that failed, because we used to have very little > means of storing the locking history - the only source of information was > the xmax field, so any update of a tuple removed information about previous > lock holders - even if that update was later aborted. I pondered using > multi-xids for this, but at the time I deemed that too risky - back at the > time, they had a few wraparound issues and the like which were OK for share > locks, but not for what I intended. > > But now that we have KEY SHARE locks, the situation changes. We now rely on > multi-xids to a much greater extent, and AFAIK multi-xid wraparound is now > correctly dealt with. We also already ensure that the information contained > in multi-xids are preserved across tuple upgrades (otherwise, updating a row > on which someone holds a KEY SHARE lock would be broken). > > So all that is missing, I think, is > > 1) To make sure we only remove a multi-xid if none of the xids are invisible > to any snapshot (i.e. lie before GlobalXmin or something like that). > > 2) When we acquire a lock (either explicitly or implicitly by doing an > UPDATE or DELETE) check if all previous committed lock holders are > visible according to our snapshot, and raise a serialization error > if not. 
I don't think you can count on being able to figure out all of the recent lockers by looking at xmax; it can get overwritten. For example, suppose transaction A locks the row and then commits. Then transaction B comes along and again locks the row and commits. My understanding is that B won't create a multi-xact since there's just one locker; it'll just stomp on the previous xmax. Now, you could change that, but I suspect it would have pretty drastic implications on systems that have both foreign keys and long-running transactions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
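The race Florian is proposing to close can be sketched as two concurrent sessions (schema invented for illustration; the exact error raised depends on isolation level and on which of the proposed mechanisms is in place):

```sql
-- Setup
CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (parent_id int REFERENCES parent(id));
INSERT INTO parent VALUES (1);

-- Session A                            -- Session B
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM child;  -- snapshot taken: sees no children
                                        BEGIN;
                                        INSERT INTO child VALUES (1);
                                        -- takes KEY SHARE on the parent row
                                        COMMIT;
DELETE FROM parent WHERE id = 1;
-- A's own snapshot sees no children, yet a committed child now exists;
-- only the crosscheck snapshot (today) or the proposed check against
-- recent lock holders makes the RI trigger raise an error here
-- instead of orphaning the child row.
```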
Re: [HACKERS] make pg_controldata accept "-D dirname"
On 10/24/2014 11:36 AM, Michael Paquier wrote: On Thu, Sep 25, 2014 at 12:38 PM, Heikki Linnakangas < hlinnakan...@vmware.com> wrote: On 09/24/2014 05:48 PM, Abhijit Menon-Sen wrote: Updated patches attached. Thanks, applied some version of these. This set of patches has been applied as b0d81ad but patch 0001 did not make it in, so pg_controldata is not yet able to accept -D: $ pg_controldata -D /foo/path pg_controldata: could not open file "-D/global/pg_control" for reading: No such file or directory $ pg_controldata /foo/path pg_controldata: could not open file "/foo/path/global/pg_control" for reading: No such file or directory Argh, looks like I forgot the actual code changes required. I just noticed that pg_controldata and pg_resetxlog don't check for extra arguments: $ pg_resetxlog data fds sdf sdf Transaction log reset Fixed that too. - Heikki
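The error message above ("-D/global/pg_control") shows the symptom: "-D" is being treated as part of the directory path. A minimal sketch of the kind of argument handling the dropped 0001 patch would need to add is below. All names are illustrative, not the actual patch:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Accept either "pg_controldata DATADIR", "pg_controldata -D DATADIR",
 * or "pg_controldata -DDATADIR", falling back to $PGDATA.
 */
static const char *
parse_datadir(int argc, char **argv)
{
	if (argc >= 3 && strcmp(argv[1], "-D") == 0)
		return argv[2];				/* pg_controldata -D /path */
	if (argc >= 2 && strncmp(argv[1], "-D", 2) == 0)
		return argv[1] + 2;			/* pg_controldata -D/path */
	if (argc >= 2)
		return argv[1];				/* pg_controldata /path */
	return getenv("PGDATA");		/* fall back to the environment */
}
```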
Re: [HACKERS] Reducing lock strength of adding foreign keys
On Wed, Oct 22, 2014 at 3:06 AM, Andreas Karlsson wrote: > I have been thinking about why we need to grab an AccessExclusiveLock on the > table with the PK when we add a foreign key. Adding new tables with foreign > keys to old ones is common so it would be really nice if the lock strength > could be reduced. > > A comment in the code claims that we need to lock so no rows are deleted > under us and that adding a trigger will lock in AccessExclusive anyway. But > with MVCC catalog access and the removal of SnapshotNow I do not see why > adding a trigger would require an exclusive lock. Just locking for data > changes should be enough. The use of MVCC catalog access doesn't necessarily mean that adding a trigger doesn't require an AccessExclusive lock. Those changes - if I dare to say so myself - solved a complex and important problem, but only one of many problems in this area, and people seem prone to thinking that they solved more problems than they in fact did. I think instead of focusing on foreign keys, we should rewind a bit and think about the locking level required to add a trigger. If we figure something out there, then we can consider how it affects foreign keys. I went looking for previous discussions of remaining hazards and found these postings: http://www.postgresql.org/message-id/ca+tgmoy4glsxzk0tao29-ljtcuj0sl1xwcwq51xb-hfysgi...@mail.gmail.com http://www.postgresql.org/message-id/20893.1393892...@sss.pgh.pa.us http://www.postgresql.org/message-id/20140306224340.ga3551...@tornado.leadboat.com As far as triggers are concerned, the issue of skew between the transaction snapshot and what the ruleutils.c snapshots do seems to be the principal issue. Commit e5550d5fec66aa74caad1f79b79826ec64898688 changed pg_get_constraintdef() to use an MVCC snapshot rather than a current MVCC snapshot; if that change is safe, I am not aware of any reason why we couldn't change pg_get_triggerdef() similarly. 
Barring further hazards I haven't thought of, I would expect that we could add a trigger to a relation with only ShareRowExclusiveLock. Anything less than ShareRowExclusiveLock would open up strange timing races around the firing of triggers by transactions writing the table: they might or might not notice that a trigger had been added before end-of-transaction, depending on the timing of cache flushes, which certainly seems no good. But even RowExclusiveLock seems like a large improvement over AccessExclusiveLock. When a constraint trigger - which is used to implement a foreign key - is added, there are actually TWO tables involved: the table upon which the trigger will actually fire, and some other table which is mentioned in passing in the trigger definition. It's possible that the locking requirements for the secondary table are weaker since I don't think the presence of the trigger actually affects runtime behavior there. However, there's probably little point in try to weaken the lock to less than the level required for the main table because a foreign key involves adding referential integrity triggers to both tables. So I tentatively propose (and with due regard for the possibility others may see dangers that I've missed) that a reasonable goal would be to lower the lock strength required for both CREATE TRIGGER and ADD FOREIGN KEY from AccessExclusiveLock to ShareRowExclusiveLock, allowing concurrent SELECT and SELECT FOR SHARE against the tables, but not any actual write operations. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
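What ShareRowExclusiveLock would mean for concurrent sessions can be sketched from the lock conflict table (table name invented for illustration):

```sql
-- Session 1: the lock level proposed above for
-- CREATE TRIGGER / ADD FOREIGN KEY
BEGIN;
LOCK TABLE pk_demo IN SHARE ROW EXCLUSIVE MODE;

-- Session 2, concurrently:
SELECT * FROM pk_demo;                 -- allowed
SELECT * FROM pk_demo FOR SHARE;       -- allowed (ROW SHARE)
UPDATE pk_demo SET v = v + 1;          -- blocks until session 1 commits
```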
Re: [HACKERS] btree_gin and ranges
which would also be nice to fix.. Of course, I agree. With rbtree usage instead of tidbitmap hash and semantic knowledge about operators in GIN... -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] Deferring some AtStart* allocations?
On Fri, Oct 24, 2014 at 10:10 AM, Andres Freund wrote: > What I was thinking was that you'd append the messages to the layer one > level deeper than the parent. Then we'd missed the invalidations when > rolling back the intermediate xact. But since I was quite mistaken > above, this isn't a problem :) So, you happy with the patch now? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] uninitialized values in revised prepared xact code
On 10/24/2014 05:40 PM, Andres Freund wrote: On 2014-10-24 17:13:49 +0300, Heikki Linnakangas wrote: Yeah. The padding bytes in TwoPhaseFileHeader were not initialized. That's simple enough to fix, but when I run valgrind, I get a whole bunch of similar messages. A few are from pgstat: the padding bytes in the pgstat messages are not initialized. One comes from write_relcache_init_file(); again I believe it's padding bytes being uninitialized (in FormData_pg_attribute). And one from the XLogInsert from heap_insert; there's an uninitialized padding byte in xl_heap_insert. And so forth.. Is it worthwhile to hunt down all of these? If there aren't many more than these, it probably is worth it, but I fear this might be an endless effort. Have we been clean of these warnings at any point in the past? Did you use the valgrind suppression file in src/tools? It suppresses some "known harmless" cases. Ah, I did not. With the file, the original warning that started this thread is gone; you added a suppression for it in commit 9a0a12f683235d3e10b873baba974f6414297a7e. - Heikki
Re: [HACKERS] uninitialized values in revised prepared xact code
On 2014-10-24 17:13:49 +0300, Heikki Linnakangas wrote: > Yeah. The padding bytes in TwoPhaseFileHeader were not initialized. > > That's simple enough to fix, but when I run valgrind, I get a whole > bunch of similar messages. A few are from pgstat: the padding bytes in the > pgstat messages are not initialized. One comes from > write_relcache_init_file(); again I believe it's padding bytes being > uninitialized (in FormData_pg_attribute). And one from the XLogInsert from > heap_insert; there's an uninitialized padding byte in xl_heap_insert. And so > forth.. Is it worthwhile to hunt down all of these? If there aren't many > more than these, it probably is worth it, but I fear this might be an > endless effort. Have we been clean of these warnings at any point in the > past? Did you use the valgrind suppression file in src/tools? It suppresses some "known harmless" cases. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X
Michael Paquier writes: > On Fri, Oct 24, 2014 at 3:39 PM, Tom Lane wrote: >> Peter, Dave: maybe you have tweaked things to keep listen_addresses >> empty and rely only on Unix-socket connections? > Should be so. The target of this feature is development on OSX, right? > And most of the time development would be done only on the local > machine, machine being most of the time a laptop. So instead of adding > an optional step in configure to enforce the creation of a > certificate, why not simply encourage people to use listen_addresses = > '' on OSX by documenting it? Even when working on replication or > related things on a local machine, it is possible to simply pass the > socket directory... Some clients (eg JDBC) don't support Unix-socket connections AFAIK, so this seems like a rather restricted solution. regards, tom lane
Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback
On 10/24/2014 01:24 PM, furu...@pm.nttdata.co.jp wrote: Sorry, I'm going around in circles. But I'd like to say again, I don't think this is a good idea. It prevents asynchronous pg_receivexlog from fsyncing WAL data and sending feedback more frequently at all. They are useful, for example, when we want to monitor the write location of asynchronous pg_receivexlog in almost realtime. But if we adopt the idea, since feedback cannot be sent soon in async mode, pg_stat_replication always returns a not-up-to-date location. Why not send a message every 10 seconds when it's not sync rep? Or even after every write(). It's a tiny amount of network traffic anyway. I understand that sending feedback messages frequently will keep pg_stat_replication up to date. Are there really no users who want to fsync even in async mode? I think people who dislike data loss would like that idea. The OS will not sit on the written but not fsync'd data forever; it will get flushed to disk in a few seconds even without the fsync. It's just that there is no guarantee on when it will hit the disk, but there are no strong guarantees in asynchronous replication anyway. Regardless of sync or async, is returning feedback and executing fsync(), the same as walreceiver does, such a problem? Probably would be OK. It would increase the I/O a lot, thanks to a lot of small writes and fsyncs, which might be surprising for a tool like pg_receivexlog. One idea is to change the default behavior to be like walreceiver, and fsync() after every write. But provide an option to get the old behavior, "--no-fsync". - Heikki
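Heikki's suggestion boils down to an fsync-after-every-write loop with an opt-out flag. A minimal sketch under those assumptions; names and structure are illustrative, not pg_receivexlog's actual code:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Cleared when a hypothetical --no-fsync switch is given. */
static int do_fsync = 1;

/*
 * Write one chunk of WAL and, by default, flush it to disk before the
 * caller reports it as safely received (walreceiver-like behavior).
 */
static int
write_wal_chunk(int fd, const void *buf, size_t len)
{
	if (write(fd, buf, len) != (ssize_t) len)
		return -1;				/* short write: caller should bail out */

	/* Make the chunk durable before it is acknowledged upstream. */
	if (do_fsync && fsync(fd) != 0)
		return -1;

	return 0;
}
```

With `do_fsync` cleared, the data still reaches disk eventually, just at the OS's discretion — exactly the weaker guarantee discussed above.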
Re: [HACKERS] uninitialized values in revised prepared xact code
On 10/11/2014 10:20 PM, Bruce Momjian wrote: Uh, was this fixed. I see a cleanup commit for this C file, but this report is from June: commit 07a4a93a0e35a778c77ffbbbc18de29e859e18f0 Author: Heikki Linnakangas Date: Fri May 16 09:47:50 2014 +0300 Initialize tsId and dbId fields in WAL record of COMMIT PREPARED. Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed that prepared transaction commits reuse that struct. Fix that. Because those fields were left unitialized, replaying a commit prepared WAL record in a hot standby node would fail to remove the relcache init file. That can lead to "could not open file" errors on the standby. Relcache init file only needs to be removed when a system table/index is rewritten in the transaction using two phase commit, so that should be rare in practice. In HEAD, the incorrect dbId/tsId values are also used for filtering in logical replication code, causing the transaction to always be filtered out. Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was introduced. No, that was a different issue. (more below) On Mon, Jun 30, 2014 at 11:58:59AM +0200, Andres Freund wrote: Hi, I've just rerun valgrind for the first time in a while and saw the following splat. 
My guess is it exists since bb38fb0d43c, but that's blindly guessing: ==2049== Use of uninitialised value of size 8 ==2049==at 0x4FE66D: EndPrepare (twophase.c:1063) ==2049==by 0x4F231B: PrepareTransaction (xact.c:2217) ==2049==by 0x4F2A38: CommitTransactionCommand (xact.c:2676) ==2049==by 0x79013E: finish_xact_command (postgres.c:2408) ==2049==by 0x78DE97: exec_simple_query (postgres.c:1062) ==2049==by 0x791FDD: PostgresMain (postgres.c:4010) ==2049==by 0x71B13B: BackendRun (postmaster.c:4113) ==2049==by 0x71A86D: BackendStartup (postmaster.c:3787) ==2049==by 0x71714C: ServerLoop (postmaster.c:1566) ==2049==by 0x716804: PostmasterMain (postmaster.c:1219) ==2049==by 0x679405: main (main.c:219) ==2049== Uninitialised value was created by a stack allocation ==2049==at 0x4FE16C: StartPrepare (twophase.c:942) ==2049== ==2049== Syscall param write(buf) points to uninitialised byte(s) ==2049==at 0x5C69640: __write_nocancel (syscall-template.S:81) ==2049==by 0x4FE6AE: EndPrepare (twophase.c:1064) ==2049==by 0x4F231B: PrepareTransaction (xact.c:2217) ==2049==by 0x4F2A38: CommitTransactionCommand (xact.c:2676) ==2049==by 0x79013E: finish_xact_command (postgres.c:2408) ==2049==by 0x78DE97: exec_simple_query (postgres.c:1062) ==2049==by 0x791FDD: PostgresMain (postgres.c:4010) ==2049==by 0x71B13B: BackendRun (postmaster.c:4113) ==2049==by 0x71A86D: BackendStartup (postmaster.c:3787) ==2049==by 0x71714C: ServerLoop (postmaster.c:1566) ==2049==by 0x716804: PostmasterMain (postmaster.c:1219) ==2049==by 0x679405: main (main.c:219) ==2049== Address 0x64694ed is 1,389 bytes inside a block of size 8,192 alloc'd ==2049==at 0x4C27B8F: malloc (vg_replace_malloc.c:298) ==2049==by 0x8E766E: AllocSetAlloc (aset.c:853) ==2049==by 0x8E8E04: MemoryContextAllocZero (mcxt.c:627) ==2049==by 0x8A54D3: AtStart_Inval (inval.c:704) ==2049==by 0x4F1DFC: StartTransaction (xact.c:1841) ==2049==by 0x4F28D1: StartTransactionCommand (xact.c:2529) ==2049==by 0x7900A7: start_xact_command (postgres.c:2383) 
==2049==by 0x78DAF4: exec_simple_query (postgres.c:860) ==2049==by 0x791FDD: PostgresMain (postgres.c:4010) ==2049==by 0x71B13B: BackendRun (postmaster.c:4113) ==2049==by 0x71A86D: BackendStartup (postmaster.c:3787) ==2049==by 0x71714C: ServerLoop (postmaster.c:1566) ==2049== Uninitialised value was created by a stack allocation ==2049==at 0x4FE16C: StartPrepare (twophase.c:942) It's probably just padding - twophase.c:1063 is the CRC32 computation of the record data. Yeah. The padding bytes in TwoPhaseFileHeader were not initialized. That's simple enough to fix, but when I run valgrind, I get a lot whole bunch of similar messages. A few are from pgstat: the padding bytes in the pgstat messages are not initialized. One comes from write_relcache_init_file(); again I believe it's padding bytes being uninitialized (in FormData_pg_attribute). And one from the XLogInsert from heap_insert; there's an uninitialized padding byte in xl_heap_insert. And so forth.. Is it worthwhile to hunt down all of these? If there aren't many more than these, it probably is worth it, but I fear this might be an endless effort. Have we been clean of these warnings at any point in the past? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
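The class of valgrind complaint Heikki describes can be reproduced in miniature. The struct below is made up (it is not the real TwoPhaseFileHeader); the point is only that the compiler inserts padding between members, and that padding stays uninitialized unless the whole struct is zeroed before the fields are filled in:

```c
#include <string.h>

typedef struct
{
	char	magic;			/* 1 byte ... */
	/* ... compiler-inserted padding on most 64-bit ABIs ... */
	long	total_len;		/* ... before this 8-byte member */
} DemoFileHeader;

/*
 * Zero the whole struct before filling the fields, so that writing
 * sizeof(DemoFileHeader) bytes (or CRC'ing them) never touches
 * uninitialized padding.
 */
static void
init_header(DemoFileHeader *hdr)
{
	memset(hdr, 0, sizeof(*hdr));
	hdr->magic = 'T';
	hdr->total_len = 42;
}
```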
Re: [HACKERS] Deferring some AtStart* allocations?
On 2014-10-24 09:45:59 -0400, Robert Haas wrote: > On Fri, Oct 24, 2014 at 9:17 AM, Andres Freund wrote: > >> If > >> that subtransaction abouts, AtEOSubXact_Inval() gets called again, > >> sees that it has messages (that it inherited from the innermost > >> subtransaction), and takes the exact same code-path that it would have > >> pre-patch. > > > > Consider what happens if the innermost transaction commits and the > > second innermost one rolls back. AtEOSubXact_Inval() won't do anything > > because it doesn't have any entries itself. > > This is the part I don't understand. After the innermost transaction > commits, it *does* have entries itself. Sure, otherwise there'd be no problem. > This whole block is basically just an optimization: > + if (myInfo->parent == NULL || myInfo->parent->my_level > < my_level - 1) > + { > + myInfo->my_level--; > + return; > + } > > If we removed that code, then we'd just do this instead: > > /* Pass up my inval messages to parent */ > AppendInvalidationMessages(&myInfo->parent->PriorCmdInvalidMsgs, >&myInfo->PriorCmdInvalidMsgs); > > /* Pending relcache inval becomes parent's problem too */ > if (myInfo->RelcacheInitFileInval) > myInfo->parent->RelcacheInitFileInval = true; Ick. I somehow misimagined that you'd just append them one layer further up. I obviously shouldn't review code during a conference. > > Even though there's still > > valid cache inval entries containing the innermost xact's version of the > > catalog, now not valid anymore. > > This part doesn't make sense to me either. The invalidation entries > don't put data into the caches; they take data out. When we change > stuff, we generate invalidation messages. What I was thinking was that you'd append the messages to the layer one level deeper than the parent. Then we'd missed the invalidations when rolling back the intermediate xact. 
But since I was quite mistaken above, this isn't a problem :) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X
On Fri, Oct 24, 2014 at 3:39 PM, Tom Lane wrote: > Peter, Dave: maybe you have tweaked things to keep listen_addresses > empty and rely only on Unix-socket connections? Should be so. The target of this feature is development on OSX, right? And most of the time development would be done only on the local machine, machine being most of the time a laptop. So instead of adding an optional step in configure to enforce the creation of a certificate, why not simply encourage people to use listen_addresses = '' on OSX by documenting it? Even when working on replication or related things on a local machine, it is possible to simply pass the socket directory... -- Michael
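The workaround Michael suggests is a one-line postgresql.conf change:

```
# postgresql.conf -- accept only Unix-domain socket connections,
# so no TCP port is opened and the OS X firewall prompt never fires
listen_addresses = ''
```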
Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API
On Fri, Oct 24, 2014 at 1:42 AM, Craig Ringer wrote: > On 10/23/2014 09:21 PM, Robert Haas wrote: >> Agreed - I think if you want an error check here it should use elog() >> or ereport(), not Assert(). > > That's what I originally did, but it's too early for elog. > > I'm reluctant to just fprintf(...) to stderr, as there's no way for the > user to suppress that, and it'll be emitted for each backend start. > Though on the other hand it really is a "shouldn't happen" case. > > So the options seem to be ignoring the error silently or printing to stderr. Either of those is OK with me. I think it's a bad idea to use Assert() to check the results of system calls, because an assertion failure is supposed to indicate a bug in our code, not Microsoft's code. But printing to stderr is an acceptable way of indicating an error that happens very early, and ignoring doesn't look unreasonable in this context either. Yet another option is to do nothing about the error at the time that it's reported but store the error code somewhere and use it to generate an error message once the system is initialized. I'm tentatively inclined to believe that's more machinery than this particular case justifies, but will happily defer to any emerging consensus. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
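Robert's last option — remember the early failure and report it once the system is up — can be sketched as follows. All names are hypothetical; this is only the shape of the mechanism, not actual backend code:

```c
#include <stdio.h>

/* Failure code captured before elog()/ereport() are usable. */
static unsigned long saved_init_error = 0;

static void
very_early_init(unsigned long syscall_result)
{
	/* Too early for elog(); just remember the failure code. */
	if (syscall_result != 0)
		saved_init_error = syscall_result;
}

static void
report_deferred_errors(void)
{
	/* Called once normal error reporting is available. */
	if (saved_init_error != 0)
		fprintf(stderr, "timer initialization failed: code %lu\n",
				saved_init_error);
}
```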
Re: [HACKERS] Re: Getting rid of "accept incoming network connections" prompts on OS X
On Wed, Oct 22, 2014 at 2:14 AM, edward745 wrote: > One of the queries in ri_triggers.c has me a little baffled. > > For (relatively) obvious reasons, a FK insert triggers a SELECT 1 FROM > pk_rel ... FOR KEY SHARE. > For not-so-obvious reasons, a PK delete triggers a SELECT 1 FROM fk_rel ... > FOR KEY SHARE. > > I can't see what the lock on fk_rel achieves. Both operations are already > contending for the lock on the PK row, which seems like enough to cover > every eventuality. > > And even if the lock serves a purpose, KEY SHARE is an odd choice, since the > referencing field is, in general, not a "key" in this sense. Please don't post unrelated questions onto existing mailing list threads. Start a new thread for a new topic. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Deferring some AtStart* allocations?
On Fri, Oct 24, 2014 at 9:17 AM, Andres Freund wrote: >> If >> that subtransaction abouts, AtEOSubXact_Inval() gets called again, >> sees that it has messages (that it inherited from the innermost >> subtransaction), and takes the exact same code-path that it would have >> pre-patch. > > Consider what happens if the innermost transaction commits and the > second innermost one rolls back. AtEOSubXact_Inval() won't do anything > because it doesn't have any entries itself. This is the part I don't understand. After the innermost transaction commits, it *does* have entries itself. This whole block is basically just an optimization: + if (myInfo->parent == NULL || myInfo->parent->my_level < my_level - 1) + { + myInfo->my_level--; + return; + } If we removed that code, then we'd just do this instead: /* Pass up my inval messages to parent */ AppendInvalidationMessages(&myInfo->parent->PriorCmdInvalidMsgs, &myInfo->PriorCmdInvalidMsgs); /* Pending relcache inval becomes parent's problem too */ if (myInfo->RelcacheInitFileInval) myInfo->parent->RelcacheInitFileInval = true; That should be basically equivalent. I mean, we need a bit of surgery to ensure that the parent entry actually exists before we attach stuff to it, but that's mechanical. The whole point here though is that, pre-patch, the parent has an empty entry and we have one with stuff in it. What I think happens in the current code is that we take all of the stuff in our non-empty entry and move it into the parent's empty entry, so that the parent ends up with an entry identical to ours but with a level one less than we had. We could do that here too, manufacturing the empty entry and then moving our stuff into it and then deleting our original entry, but that seems silly; just change the level and call it good. IOW, I don't see how I'm *changing* the logic here at all. Why doesn't whatever problem you're concerned about exist in the current coding, too? 
> Even though there's still > valid cache inval entries containing the innermost xact's version of the > catalog, now not valid anymore. This part doesn't make sense to me either. The invalidation entries don't put data into the caches; they take data out. When we change stuff, we generate invalidation messages. At end of statement, once we've done CommandCounterIncrement(), we execute those invalidation messages locally, so that the next cache reload the updated state. At end of transaction, once we've committed, we send those invalidation messages to the shared queue to be executed by everyone else, so that they also see the updated state. If we abort a transaction or sub-transaction, we re-execute those same invalidation messages so that we flush the new state, forcing the next set of cache reloads to again pull in the old state. The patch isn't changing - or not intentionally anyway - anything about any of that logic. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
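The level-decrement optimization Robert describes can be modeled in a few lines. This is a toy simplification (an integer count stands in for the invalidation message lists), not the patch itself, but it shows why adjusting `my_level` is equivalent to merging into an empty parent entry:

```c
#include <stddef.h>

typedef struct InvalInfo
{
	struct InvalInfo *parent;	/* next-shallower stack entry, if any */
	int			my_level;		/* subtransaction nesting level */
	int			nmessages;		/* stand-in for the message lists */
} InvalInfo;

/*
 * At subxact commit: if the parent level has no entry of its own,
 * adopting the child's entry is just a level decrement -- the same
 * effect as moving every message up one level, without the copying.
 */
static void
at_subcommit(InvalInfo *myInfo, int my_level)
{
	if (myInfo->parent == NULL || myInfo->parent->my_level < my_level - 1)
	{
		myInfo->my_level--;
		return;
	}
	/* Otherwise pass the messages up into the existing parent entry. */
	myInfo->parent->nmessages += myInfo->nmessages;
	myInfo->nmessages = 0;
}
```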
Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X
Michael Paquier writes: > On Fri, Oct 24, 2014 at 8:26 AM, Dave Page wrote: >> On Fri, Oct 24, 2014 at 7:18 AM, Peter Eisentraut wrote: >>> On 10/21/14 1:16 PM, Tom Lane wrote: If you do any Postgres development on OS X, you've probably gotten seriously annoyed by the way that, every single time you reinstall the postmaster executable, you get a dialog box asking whether you'd like to allow it to accept incoming network connections. >>> I used to, but somehow I don't see this anymore. Just to be sure, I >>> made sure the firewall is on, checked that postgres is not in the >>> exception list, rebooted, built postgresql from scratch, ran make check, >>> but no pop-up. >>> >>> I'm on Yosemite. Maybe this was changed. >> I've never seen it on any version of OS X (I've worked my way from >> Panther to Yosemite). There must be more to it... I see it every darn time I've changed the postmaster executable. Maybe there is a difference in security settings? I have the firewall enabled and in Settings->Security->General, "Allow apps downloaded from: Mac App Store and identified developers", which I think is the default. [ experiments... ] Hm, setting that to "Anywhere" doesn't change the results anyway. > FWIW, with firewall at on, I am used to see this annoying popup window when > starting an instance manually, make check never complains though. Ah. pg_regress sets listen_addresses to empty so that no TCP ports are opened, hence no firewall complaints from "make check". However, as soon as you start a normal installation, you get the complaint, as even an open port on 127.0.0.1 is enough to provoke it. Peter, Dave: maybe you have tweaked things to keep listen_addresses empty and rely only on Unix-socket connections? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Trailing comma support in SELECT statements
On Tue, Oct 21, 2014 at 10:16 AM, Tom Lane wrote: > (Of course, I'm not for the feature w.r.t. SQL either. But breaking data > compatibility is just adding an entire new dimension of trouble. > Another dimension of the trouble is breaking the operation of the tools that parse SQL statements for various purposes, e.g. for dependency analysis. This is a misfeature for the benefit of edit-lazy users only. -- Alex
Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
On Fri, Oct 24, 2014 at 3:05 PM, Heikki Linnakangas wrote: > On 10/23/2014 11:09 AM, Heikki Linnakangas wrote: >> >> At least for master, we should consider changing the way the archiving >> works so that we only archive WAL that was generated in the same server. >> I.e. we should never try to archive WAL files belonging to another >> timeline. >> >> I just remembered that we discussed a different problem related to this >> some time ago, at >> >> http://www.postgresql.org/message-id/20131212.110002.204892575.horiguchi.kyot...@lab.ntt.co.jp. >> The conclusion of that was that at promotion, we should not archive the >> last, partial, segment from the old timeline. > > > So, this is what I came up with for master. Does anyone see a problem with > it? Thinking long-term, this is a solid approach, so +1 for it. I just tested the patch and the extra segment files do not show up anymore. Patch looks good as well. -- Michael
Re: [HACKERS] btree_gin and ranges
On 10/22/2014 03:01 PM, Teodor Sigaev wrote: Anyway GIN couldn't be used for ORDER BY clause. which would also be nice to fix... - Heikki
Re: [HACKERS] Deferring some AtStart* allocations?
On 2014-10-23 12:04:40 -0400, Robert Haas wrote: > On Tue, Oct 21, 2014 at 12:00 PM, Andres Freund > wrote: > > On 2014-10-09 15:01:19 -0400, Robert Haas wrote: > >> /* > >> @@ -960,18 +966,38 @@ AtEOXact_Inval(bool isCommit) > > ... > >> + /* > >> + * We create invalidation stack entries lazily, so the > >> parent might > >> + * not have one. Instead of creating one, moving all the > >> data over, > >> + * and then freeing our own, we can just adjust the level of > >> our own > >> + * entry. > >> + */ > >> + if (myInfo->parent == NULL || myInfo->parent->my_level < > >> my_level - 1) > >> + { > >> + myInfo->my_level--; > >> + return; > >> + } > >> + > > > > I think this bit might not be correct. What if the subxact one level up > > aborts? Then it'll miss dealing with these invalidation entries. Or am I > > misunderstanding something? > > One of us is. I think you're asking about a situation where we have a > transaction, and a subtransaction, and within that another > subtransaction. Only the innermost subtransaction has invalidation > messages. At the innermost level, we commit; the above code makes > those messages the responsibility of the outer subtransaction. Exactly. > If > that subtransaction aborts, AtEOSubXact_Inval() gets called again, > sees that it has messages (that it inherited from the innermost > subtransaction), and takes the exact same code-path that it would have > pre-patch. Consider what happens if the innermost transaction commits and the second innermost one rolls back. AtEOSubXact_Inval() won't do anything because it doesn't have any entries itself. Even though there are still valid cache inval entries containing the innermost xact's version of the catalog, which is not valid anymore.
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
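The lazy stack-entry optimization quoted from the patch (relabel the entry's level instead of allocating a parent entry and copying the messages over) can be sketched as follows. This is an illustrative Python model, not the actual C structures; whether the relabeled entry is then handled correctly when a middle subxact later aborts is precisely the point under dispute in this exchange.

```python
# Toy model of the lazy invalidation-stack fold at subxact commit: each
# nesting level gets a stack entry only once it has messages, and at
# subcommit an entry can be handed to the parent by relabeling its level
# instead of creating a parent entry and copying the data over.

class Entry:
    def __init__(self, level, msgs):
        self.level = level
        self.msgs = list(msgs)

def at_subcommit(stack, my_level):
    """Fold the top entry for my_level into its parent, lazily."""
    if not stack or stack[-1].level != my_level:
        return                          # nothing recorded at this level
    top = stack[-1]
    parent = stack[-2] if len(stack) > 1 else None
    if parent is None or parent.level < my_level - 1:
        top.level -= 1                  # parent has no entry: just relabel
    else:
        parent.msgs.extend(top.msgs)    # parent entry exists: merge and free
        stack.pop()

# Only the innermost subxact (level 3) has messages; it commits, and the
# same entry now belongs to level 2 without any copying.
stack = [Entry(3, ["inval A"])]
at_subcommit(stack, 3)
assert stack[0].level == 2 and stack[0].msgs == ["inval A"]
```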
Re: [HACKERS] ARMv5
On Wed, Oct 22, 2014 at 6:31 AM, Александр Глухов wrote: > Hello, I have a problem with PostgreSQL. I need to install PostgreSQL on > ARMv5 with tcp/ip access, but I have no experience in it. I trying to do it > in LTIB and now I need to create a postgresql.spec file. Could you help me > in it? This is offtopic for -hackers. That being said, have you considered compiling postgres directly on the device? merlin
Re: [HACKERS] Function array_agg(array)
2014-10-24 13:58 GMT+02:00 Pavel Stehule : > > > 2014-10-24 11:43 GMT+02:00 Ali Akbar : > >> >> 2014-10-24 16:26 GMT+07:00 Pavel Stehule : >> >>> Hi >>> >>> some in last patch is wrong, I cannot to compile it: >>> >>> arrayfuncs.c: In function ‘accumArrayResult’: >>> arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’ >>>astate->alen = 64; /* arbitrary starting array size */ >>> ^ >>> arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named >>> ‘dvalues’ >>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); >>> ^ >>> arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’ >>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); >>> ^ >>> arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named >>> ‘dnulls’ >>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); >>> ^ >>> arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’ >>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); >>> ^ >>> arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named >>> ‘nelems’ >>>astate->nelems = 0; >>> ^ >>> arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named >>> ‘nelems’ >>>if (astate->nelems >= astate->alen) >>> ^ >>> arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’ >>>if (astate->nelems >= astate->alen) >>>^ >>> arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’ >>> astate->alen *= 2; >>> >> >> Sorry, correct patch attached. >> >> This patch is in patience format (git --patience ..). In previous >> patches, i use context format (git --patience ... | filterdiff >> --format=context), but it turns out that some modification is lost. 
>> > > last version is compileable, but some is still broken > > postgres=# select array_agg(array[i, i+1, i-1]) > from generate_series(1,2) a(i); > ERROR: could not find array type for data type integer[] > I am sorry, it works - I had a problem with broken database I fixed small issue in regress tests and I enhanced tests for varlena types and null values. Regards Pavel > > but array(subselect) works > > postgres=# select array(select a from xx); >array > --- > {{1,2,3},{1,2,3}} > (1 row) > > Regards > > Pavel > > > >> >> -- >> Ali Akbar >> > > diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index 7e5bcd9..f59738a *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** NULL baz(3 rows) *** 12046,12051 --- 12046,12067 + array_agg + +array_agg(anyarray) + + +any + + +the same array type as input type + + input arrays, aggregated into higher-order multidimesional array. Rejects NULL and empty array as input. + + + + + average diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml new file mode 100644 index 2f0680f..8c182a4 *** a/doc/src/sgml/syntax.sgml --- b/doc/src/sgml/syntax.sgml *** SELECT ARRAY(SELECT oid FROM pg_proc WHE *** 2238,2243 --- 2238,2248 array --- {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413} + + SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i)); + array + --- + {{1},{2},{3},{4},{5}} (1 row) The subquery must return a single column. The resulting diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c new file mode 100644 index 41e973b..0261fcb *** a/src/backend/nodes/nodeFuncs.c --- b/src/backend/nodes/nodeFuncs.c *** exprType(const Node *expr) *** 108,119 type = exprType((Node *) tent->expr); if (sublink->subLinkType == ARRAY_SUBLINK) { ! type = get_array_type(type); ! if (!OidIsValid(type)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_OBJECT), ! errmsg("could not find array type for data type %s", ! 
format_type_be(exprType((Node *) tent->expr); } } else if (sublink->subLinkType == MULTIEXPR_SUBLINK) --- 108,123 type = exprType((Node *) tent->expr); if (sublink->subLinkType == ARRAY_SUBLINK) { ! if (!OidIsValid(get_element_type(type))) ! { ! /* not array, so check for its array type */ ! type = get_array_type(type); ! if (!OidIsValid(type)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_OBJECT), ! errmsg("could not find array type for data type %s", ! format_type_be(exprType((Node
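The exprType() change in the diff above can be sketched in Python: with array_agg(anyarray), a subquery column that is already an array keeps its own type (the result simply gains a dimension), and only scalar columns need a lookup of their array type. The catalog tables here are illustrative stand-ins, not PostgreSQL's actual catalogs.

```python
# Sketch of the ARRAY_SUBLINK type-resolution fix: check whether the
# column type is already an array before looking up its array type.

ELEMENT_OF = {"int4[]": "int4"}         # array type -> element type
ARRAY_OF = {"int4": "int4[]"}           # element type -> array type

def array_sublink_type(coltype):
    if coltype in ELEMENT_OF:           # already an array: keep it as-is
        return coltype
    arraytype = ARRAY_OF.get(coltype)   # scalar: look up its array type
    if arraytype is None:
        raise ValueError(
            "could not find array type for data type %s" % coltype)
    return arraytype

assert array_sublink_type("int4") == "int4[]"
assert array_sublink_type("int4[]") == "int4[]"   # the previously failing case
```

Before the fix, the second call corresponds to the reported failure: get_array_type() was applied to int4[] directly, and since no "array of array" type exists, it raised the "could not find array type" error.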
Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
On 10/23/2014 11:09 AM, Heikki Linnakangas wrote: At least for master, we should consider changing the way the archiving works so that we only archive WAL that was generated in the same server. I.e. we should never try to archive WAL files belonging to another timeline. I just remembered that we discussed a different problem related to this some time ago, at http://www.postgresql.org/message-id/20131212.110002.204892575.horiguchi.kyot...@lab.ntt.co.jp. The conclusion of that was that at promotion, we should not archive the last, partial, segment from the old timeline. So, this is what I came up with for master. Does anyone see a problem with it? - Heikki >From 9115276335d37d7d2dab55ff8b0642041c16bfc3 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Fri, 24 Oct 2014 15:39:02 +0300 Subject: [PATCH] Don't try to archive WAL from different timelines. Previously, we would always create a .done file for files restored from the archive, or streamed from a master server. That served two purposes: it allowed the files to be recycled at restartpoints, and ensured that we don't try to re-archive the files after promotion. However, there were some gaps in that approach: if streaming replication was stopped at a segment boundary, no .done file was created. The standby server might also crash just before it has created the .done file. This is a more wholesale solution to the problem. WAL files belonging to a different timeline are never archived, regardless of .done or .ready files. The rule is that the server that generates a WAL file is solely responsible for archiving it. This also changes the long-standing behavior at end of recovery, where we archived the last, partial, WAL segment from the old timeline. We no longer do that; if you copy any files manually to pg_xlog, and want them to be archived, you should copy them manually to the archive, too. (The first segment on the new timeline, which is copied from the last partial one, will be archived as usual.) 
This is the same idea as in commit c9cc7e05c6d82a9781883a016c70d95aa4923122, which was reverted in favor of the .done file approach in commit 1bd42cd70abdbc946ad64c3c8eaefed4bb8b1145. Turns out the .done file approach was not such a good idea after all. The behavior at end-of-recovery was discussed on pgsql-hackers back in February 2014 (20131212.110002.204892575.horiguchi.kyot...@lab.ntt.co.jp), but the plan was never followed through (my bad, I forgot all about). This should fix the issue reported by Jehan-Guillaume de Rorthais, where a standby server sometimes creates .ready files for WAL files streamed from the master. But this is too risky to backpatch - the changes in behavior are quite signicant - so we'll have to fix it with some other approach in back-branches. diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index e77af22..2370467 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -208,7 +208,8 @@ static int LocalXLogInsertAllowed = -1; * When ArchiveRecoveryRequested is set, archive recovery was requested, * ie. recovery.conf file was present. When InArchiveRecovery is set, we are * currently recovering using offline XLOG archives. These variables are only - * valid in the startup process. + * valid in the startup process, but there is a copy of InArchiveRecovery + * in shared memory. * * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're * currently performing crash recovery using only XLOG files in pg_xlog, but @@ -546,9 +547,11 @@ typedef struct XLogCtlData /* * SharedRecoveryInProgress indicates if we're still in crash or archive - * recovery. Protected by info_lck. + * recovery. SharedInArchiveRecovery indicates if we're currently in + * archive recovery. Protected by info_lck. 
*/ bool SharedRecoveryInProgress; + bool SharedInArchiveRecovery; /* * SharedHotStandbyActive indicates if we're still in crash or archive @@ -748,6 +751,7 @@ static MemoryContext walDebugCxt = NULL; #endif static void readRecoveryCommandFile(void); +static void enterArchiveRecovery(void); static void exitArchiveRecovery(TimeLineID endTLI, XLogSegNo endLogSegNo); static bool recoveryStopsBefore(XLogRecord *record); static bool recoveryStopsAfter(XLogRecord *record); @@ -4168,9 +4172,7 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode, { ereport(DEBUG1, (errmsg_internal("reached end of WAL in pg_xlog, entering archive recovery"))); -InArchiveRecovery = true; -if (StandbyModeRequested) - StandbyMode = true; +enterArchiveRecovery(); /* initialize minRecoveryPoint to this record */ LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); @@ -4866,6 +4868,7 @@ XLOGShmemInit(void) */ XLogCtl->XLogCacheBlck = XLOGbuffers - 1; XLogCtl->SharedRecoveryInProgress = true; + XLogCtl->SharedInArchiveRecovery = false; XLogCtl->SharedHotStandbyActive = false;
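The rule this patch introduces, that a server archives only WAL generated on its own timeline, can be sketched as below. This is a hedged illustration; `should_archive` is not an actual PostgreSQL function. It relies only on the standard WAL segment naming scheme: 24 hex digits, of which the first 8 encode the timeline ID.

```python
# Sketch of the archiving rule: skip any WAL segment whose embedded
# timeline ID differs from the server's current timeline, regardless of
# .ready/.done status files.

def segment_timeline(fname):
    """Timeline ID is the first 8 hex digits of a WAL segment name."""
    return int(fname[:8], 16)

def should_archive(fname, current_tli):
    hexdigits = "0123456789ABCDEF"
    if len(fname) != 24 or any(c not in hexdigits for c in fname):
        return False                    # not a plain WAL segment name
    return segment_timeline(fname) == current_tli

# After promotion to timeline 2, segments from timeline 1 are skipped.
assert should_archive("000000020000000000000003", 2)
assert not should_archive("000000010000000000000003", 2)
```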
Re: [HACKERS] Function array_agg(array)
Michael Paquier wrote: > That's not surprising, sometimes filterdiff misses the shot. Really? Wow, that's bad news. I've been using it to submit patches from time to time ... -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Function array_agg(array)
Hi I did some performance tests and it is interesting: it is about 15% faster than original implementation. Regards Pavel 2014-10-24 13:58 GMT+02:00 Pavel Stehule : > > > 2014-10-24 11:43 GMT+02:00 Ali Akbar : > >> >> 2014-10-24 16:26 GMT+07:00 Pavel Stehule : >> >>> Hi >>> >>> some in last patch is wrong, I cannot to compile it: >>> >>> arrayfuncs.c: In function ‘accumArrayResult’: >>> arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’ >>>astate->alen = 64; /* arbitrary starting array size */ >>> ^ >>> arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named >>> ‘dvalues’ >>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); >>> ^ >>> arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’ >>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); >>> ^ >>> arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named >>> ‘dnulls’ >>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); >>> ^ >>> arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’ >>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); >>> ^ >>> arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named >>> ‘nelems’ >>>astate->nelems = 0; >>> ^ >>> arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named >>> ‘nelems’ >>>if (astate->nelems >= astate->alen) >>> ^ >>> arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’ >>>if (astate->nelems >= astate->alen) >>>^ >>> arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’ >>> astate->alen *= 2; >>> >> >> Sorry, correct patch attached. >> >> This patch is in patience format (git --patience ..). In previous >> patches, i use context format (git --patience ... | filterdiff >> --format=context), but it turns out that some modification is lost. 
>> > > last version is compileable, but some is still broken > > postgres=# select array_agg(array[i, i+1, i-1]) > from generate_series(1,2) a(i); > ERROR: could not find array type for data type integer[] > > but array(subselect) works > > postgres=# select array(select a from xx); >array > --- > {{1,2,3},{1,2,3}} > (1 row) > > Regards > > Pavel > > > >> >> -- >> Ali Akbar >> > >
Re: [HACKERS] Function array_agg(array)
2014-10-24 11:43 GMT+02:00 Ali Akbar : > > 2014-10-24 16:26 GMT+07:00 Pavel Stehule : > >> Hi >> >> some in last patch is wrong, I cannot to compile it: >> >> arrayfuncs.c: In function ‘accumArrayResult’: >> arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’ >>astate->alen = 64; /* arbitrary starting array size */ >> ^ >> arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named >> ‘dvalues’ >>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); >> ^ >> arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’ >>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); >> ^ >> arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named ‘dnulls’ >>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); >> ^ >> arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’ >>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); >> ^ >> arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named ‘nelems’ >>astate->nelems = 0; >> ^ >> arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named >> ‘nelems’ >>if (astate->nelems >= astate->alen) >> ^ >> arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’ >>if (astate->nelems >= astate->alen) >>^ >> arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’ >> astate->alen *= 2; >> > > Sorry, correct patch attached. > > This patch is in patience format (git --patience ..). In previous patches, > i use context format (git --patience ... | filterdiff --format=context), > but it turns out that some modification is lost. > last version is compileable, but some is still broken postgres=# select array_agg(array[i, i+1, i-1]) from generate_series(1,2) a(i); ERROR: could not find array type for data type integer[] but array(subselect) works postgres=# select array(select a from xx); array --- {{1,2,3},{1,2,3}} (1 row) Regards Pavel > > -- > Ali Akbar >
Re: [HACKERS] Question about RI checks
On Oct23, 2014, at 17:45 , Kevin Grittner wrote: > Every way I look at it, inside a REPEATABLE READ or SERIALIZABLE > transaction a check for child rows when validating a parent DELETE > should consider both rows which exist according to the transaction > snapshot and according to a "current" snapshot. I've pondered this further, and unfortunately it seems that this isn't sufficient to guarantee true serializability :-( Verifying that both snapshots contain exactly the same rows does not prevent a child row from being inserted and immediately deleted again, not if both actions happen *after* the parent-updating transaction took its snapshot, but *before* it takes the crosscheck snapshot. Let parent(id) and child(id, parent_id) again be two tables with a FK constraint between them, let child be initially empty, and let parent contain a single row (1). Assume PD is a transaction which deletes all rows from parent, CI a transaction which inserts the row (1, 1) into child, and CD a transaction which deletes that row again. Even with the extended cross-checking you propose, we'd still allow the following concurrent schedule: 1. PD takes snapshot 2. CI starts and completes 3. CD starts and completes 4. PD deletes from parent without complaining, since there were no conflicting rows at time (1), and none at time (4). So far, all is well. But add two more tables, A and B, both initially containing one row. Let CI scan A, let PD delete from A and scan B, and let CD delete from B. In the concurrent schedule from above, CI will see the row in A, and PD will delete it, and PD will see the row in B that CD deletes. Note that SSI *will* allow that schedule to occur without raising a serialization error. The only serial schedule which yields the same results for the various queries pertaining A and B is CI -> PD -> CD, i.e. PD has to run *between* CI and CD. But in that serial schedule, PD *should* see an FK violation, since CI has inserted a child which CD hasn't yet deleted.
There is thus *no* serial schedule which yields the same results as the concurrent schedule above for the queries pertaining parent and child, *and* for the queries pertaining A and B, i.e. the concurrent schedule is *not* serializable. Yet even the extended cross-check won't detect this. Attached is an isolationtester spec file which implements this example, and the corresponding out-file which shows that SSI permits the concurrent schedule. Since SSI doesn't concern itself with RI enforcement queries, it would also permit that schedule if we extended the cross-check, I think. (I used REL9_4_STABLE as of today to try this, commit 1cf54b00ba2100083390223a8244430643c1ec07) best regards, Florian Pflug fk-consistency2.spec Description: Binary data fk-consistency2.out Description: Binary data
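The core observation, that a child row inserted and deleted entirely *between* the transaction snapshot and the crosscheck snapshot is invisible to both, can be illustrated with a toy MVCC visibility model. This is purely illustrative and much simpler than PostgreSQL's actual tuple visibility rules.

```python
# Toy snapshot-visibility model: a row carries the xids that inserted
# and (optionally) deleted it; a snapshot is the set of xids considered
# committed at the time the snapshot was taken.

def visible(row, snap):
    xmin, xmax = row
    if xmin not in snap:
        return False                    # inserter not committed yet: unseen
    return xmax is None or xmax not in snap   # deleter committed: gone

child_row = (101, 102)      # inserted by CI (xid 101), deleted by CD (xid 102)

snap_start = set()          # PD's transaction snapshot: neither committed
snap_cross = {101, 102}     # PD's crosscheck snapshot: both committed

# Both snapshots see zero child rows, so the "both snapshots agree"
# check passes even though the child row existed in between.
assert not visible(child_row, snap_start)
assert not visible(child_row, snap_cross)
```

This is why comparing the two row sets cannot, by itself, rule out the non-serializable schedule described above: the history of the row leaves no trace in either snapshot.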
Re: [HACKERS] Function array_agg(array)
On Fri, Oct 24, 2014 at 11:43 AM, Ali Akbar wrote: > > 2014-10-24 16:26 GMT+07:00 Pavel Stehule : > >> Hi >> >> some in last patch is wrong, I cannot to compile it: >> >> arrayfuncs.c: In function 'accumArrayResult': >> arrayfuncs.c:4603:9: error: 'ArrayBuildState' has no member named 'alen' >>astate->alen = 64; /* arbitrary starting array size */ >> ^ >> arrayfuncs.c:4604:9: error: 'ArrayBuildState' has no member named >> 'dvalues' >>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); >> ^ >> arrayfuncs.c:4604:44: error: 'ArrayBuildState' has no member named 'alen' >>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); >> ^ >> arrayfuncs.c:4605:9: error: 'ArrayBuildState' has no member named 'dnulls' >>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); >> ^ >> arrayfuncs.c:4605:42: error: 'ArrayBuildState' has no member named 'alen' >>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); >> ^ >> arrayfuncs.c:4606:9: error: 'ArrayBuildState' has no member named 'nelems' >>astate->nelems = 0; >> ^ >> arrayfuncs.c:4618:13: error: 'ArrayBuildState' has no member named >> 'nelems' >>if (astate->nelems >= astate->alen) >> ^ >> arrayfuncs.c:4618:31: error: 'ArrayBuildState' has no member named 'alen' >>if (astate->nelems >= astate->alen) >>^ >> arrayfuncs.c:4620:10: error: 'ArrayBuildState' has no member named 'alen' >> astate->alen *= 2; >> > > Sorry, correct patch attached. > > This patch is in patience format (git --patience ..). In previous patches, > i use context format (git --patience ... | filterdiff --format=context), > but it turns out that some modification is lost. > That's not surprising, sometimes filterdiff misses the shot. -- Michael
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2
On 2014-10-24 15:59:30 +0530, Amit Kapila wrote: > > > and w.r.t performance it can lead extra > > > function call, few checks and I think in some cases even can > > > acquire/release spinlock. > > > > I fail to see how that could be the case. > > Won't it happen incase first backend sets releaseOK to true and another > backend which tries to wakeup waiters on lock will acquire spinlock > and tries to release the waiters. Sure, that can happen. > > And again, this is code that's > > only executed around a couple syscalls. And the cacheline will be > > touched around there *anyway*. > > Sure, but I think syscalls are required in case we need to wake any > waiter. It won't wake up a waiter if there's none on the list. > > > > And it'd be a pretty pointless > > > > behaviour, leading to useless increased contention. The only time it'd > > > > make sense for X to be woken up is when it gets run faster than the S > > > > processes. > > > > > > Do we get any major benefit by changing the logic of waking up waiters? > > > > Yes. > > I think one downside I could see of new strategy is that the chance of > Exclusive waiter to take more time before getting woked up is increased > as now it will by pass Exclusive waiters in queue. Note that that *already* happens for any *new* shared locker that comes in. It doesn't really make sense to have share lockers queued behind the exclusive locker if others just go in front of it anyway. > > > Code is more readable, but I don't understand why you > > > want to do refactoring as part of this patch which ideally > > > doesn't get any benefit from the same. > > > > I did it first without. But there's required stuff like > > LWLockDequeueSelf(). And I had several bugs because of the list stuff. > > > > And I did separate the conversion into a separate patch? > > Yeah, but the main patch for wait free LW_SHARED also uses > it. 
Well, the only thing it could have done, given that the other patch is a prerequisite, is revert the behaviour? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2
On Wed, Oct 22, 2014 at 8:04 PM, Andres Freund wrote: > On 2014-10-21 19:56:05 +0530, Amit Kapila wrote: > > On Wed, Oct 8, 2014 at 6:17 PM, Andres Freund > > wrote: > > spin_delay_count gives > > how much delay has happened to acquire spinlock which when > > combined with other stats gives the clear situation about > > the contention around aquisation of corresponding LWLock. > > Now if we want to count the spin lock delay for Release call > > as well, then the meaning of the stat is getting changed. > > It might be that new meaning of spin_delay_count stat is more > > useful in some situations, however the older one has its own > > benefits, so I am not sure if changing this as part of this > > patch is the best thing to do. > > In which case does the old definition make sense, where the new one > doesn't? I don't think it exists. > > And changing it here seems to make sense because spinlock contention > fundamentally changes it meaning for lwlocks anyway as in most paths we > don't take a spinlock anymore. On second thought, I think probably you are right here. > > > > Why can't we decrement the nwaiters after waking up? I don't think > > > > there is any major problem even if callers do that themselves, but > > > > in some rare cases LWLockRelease() might spuriously assume that > > > > there are some waiters and tries to call LWLockWakeup(). Although > > > > this doesn't create any problem, keeping the value sane is good unless > > > > there is some problem in doing so. > > > > > > That won't work because then LWLockWakeup() wouldn't be called when > > > necessary - precisely because nwaiters is 0. > > > > > The reason I've done so is that it's otherwise much harder to debug > > > issues where there are backend that have been woken up already, but > > > haven't rerun yet. Without this there's simply no evidence of that > > > state. I can't see this being relevant for performance, so I'd rather > > > have it stay that way. 
> > > > I am not sure what useful information we can get during debugging by not > > doing this in LWLockWakeup() > > It's useful because you can detect backends that have been scheduled to > acquire the lock, but haven't yet. They're otherwise "invisible". > > > and w.r.t performance it can lead extra > > function call, few checks and I think in some cases even can > > acquire/release spinlock. > > I fail to see how that could be the case. Won't it happen in case the first backend sets releaseOK to true and another backend which tries to wake up waiters on the lock acquires the spinlock and tries to release the waiters? > And again, this is code that's > only executed around a couple syscalls. And the cacheline will be > touched around there *anyway*. Sure, but I think syscalls are required in case we need to wake any waiter. > > > > > > In the above code, if the first waiter to be woken up is Exclusive waiter, > > then it will woke that waiter, else shared waiters till it got > > the first exclusive waiter and then first exlusive waiter. > > That would be a bug then. I am not sure of it, but I think it's more important to validate the new waking strategy, as you see benefits by doing so. >Per the comment you quoted "If the front > waiter wants exclusive lock, awaken him only. Otherwise awaken as many > waiters as want shared access.". > > But I don't think it's what's happening. Note that 'proc = > proc->lwWaitLink;' is only executed if 'proc->lwWaitLink->lwWaitMode != > LW_EXCLUSIVE'. Which is the next waiter... > > > > And it'd be a pretty pointless > > > behaviour, leading to useless increased contention. The only time it'd > > > make sense for X to be woken up is when it gets run faster than the S > > > processes. > > > > Do we get any major benefit by changing the logic of waking up waiters? > > Yes.
I think one downside I could see of the new strategy is that an exclusive waiter may take more time before getting woken up, as shared lockers will now bypass exclusive waiters in the queue. I don't have any concrete proof that it can do any harm to performance, so maybe it's okay to have this new mechanism; however, I think it might be helpful if you could add a comment in the code to explain the benefit of skipping exclusive lockers. > > Code is more readable, but I don't understand why you > > want to do refactoring as part of this patch which ideally > > doesn't get any benefit from the same. > > I did it first without. But there's required stuff like > LWLockDequeueSelf(). And I had several bugs because of the list stuff. > > And I did separate the conversion into a separate patch? Yeah, but the main patch for wait free LW_SHARED also uses it. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
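The wakeup policy stated in the comment quoted in the thread ("If the front waiter wants exclusive lock, awaken him only. Otherwise awaken as many waiters as want shared access.") can be sketched as follows. This is an illustrative model of the comment's literal wording, not the actual LWLock C code; whether the C code really implements that wording is exactly what Andres and Amit are debating above.

```python
# Sketch of the commented wakeup policy: wake a lone exclusive waiter at
# the front, or the leading run of shared waiters otherwise.  The queue
# is just a list of lock modes ("X" = exclusive, "S" = shared).

def pick_wakeups(queue):
    if not queue:
        return []
    if queue[0] == "X":                 # exclusive at the front: wake it alone
        return [0]
    woken = []
    for i, mode in enumerate(queue):    # wake the leading run of sharers,
        if mode != "S":                 # stopping at the first exclusive
            break
        woken.append(i)
    return woken

assert pick_wakeups(["X", "S", "S"]) == [0]
assert pick_wakeups(["S", "S", "X", "S"]) == [0, 1]
```

Amit's concern maps onto this sketch directly: if new shared lockers keep being appended and woken ahead of the queued "X", the exclusive waiter's wakeup can be deferred.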
Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback
> >> Sorry, I'm going around in the circle. But I'd like to say again, I > >> don't think this is good idea. It prevents asynchronous > >> pg_receivexlog from fsyncing WAL data and sending feedbacks more > >> frequently at all. They are useful, for example, when we want to > >> monitor the write location of asynchronous pg_receivexlog in almost > >> realtime. But if we adopt the idea, since feedback cannot be sent > >> soon in async mode, pg_stat_replication always returns the > not-up-to-date location. > > > > Why not send a message every 10 seconds when its not sync rep? > > Or even after every write(). It's a tiny amount of network traffic anyway. I understand that sending feedback messages frequently will keep pg_stat_replication up to date. But are there really no users who want to fsync even in async mode? I think people who dislike data loss would like that idea. Thoughts? In either sync or async mode, is returning feedback and executing fsync() just like walreceiver does really such a problem? -- Furuya Osamu
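The policies under discussion (fsync and report the flush position immediately when acting as a sync standby, and send periodic status messages otherwise) might look like the sketch below. This is a hypothetical illustration; `should_send_feedback` and `STATUS_INTERVAL` are invented names, not part of pg_receivexlog.

```python
# Sketch of the two feedback policies debated above: in sync-rep mode,
# report right after each fsync, like walreceiver; in async mode, still
# send a status message at a fixed interval so pg_stat_replication stays
# reasonably current.

STATUS_INTERVAL = 10.0                  # seconds, akin to --status-interval

def should_send_feedback(sync_mode, just_fsynced, now, last_sent):
    if sync_mode and just_fsynced:
        return True                     # report the flushed position at once
    return now - last_sent >= STATUS_INTERVAL

assert should_send_feedback(True, True, 100.0, 99.0)       # sync: immediate
assert not should_send_feedback(False, True, 100.0, 99.0)  # async: throttled
assert should_send_feedback(False, False, 110.0, 99.0)     # interval elapsed
```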
Re: [HACKERS] detect custom-format dumps in psql and emit a useful error
On 2014-10-24 07:18:55 -0300, Alvaro Herrera wrote: > Jeevan Chalke wrote: > > > Regarding Directory Error: > > === > > I strongly against the proposal. This patch changing error message to > > something like this: > > "psql:blah:0: Input path is a directory. Use pg_restore to restore > > directory-format database dumps." > > > > So even though I accidentally provide a directory instead of a sql script > > file when I have NO intention of restoring a dump, above message looks > > weired. Instead current message looks perfectly fine here. i.e. > > "could not read from input file: Is a directory" > > > > psql always expect a file and NOT directory. Also it is not necessarily > > working on restoring a dump. > > Yeah, this patch is a lot more debatable than the other one. I have > pushed the first one without changing the error message. We could just test for toc.dat and then emit the warning... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] detect custom-format dumps in psql and emit a useful error
Jeevan Chalke wrote: > Regarding Directory Error: > === > I strongly against the proposal. This patch changing error message to > something like this: > "psql:blah:0: Input path is a directory. Use pg_restore to restore > directory-format database dumps." > > So even though I accidentally provide a directory instead of a sql script > file when I have NO intention of restoring a dump, above message looks > weired. Instead current message looks perfectly fine here. i.e. > "could not read from input file: Is a directory" > > psql always expect a file and NOT directory. Also it is not necessarily > working on restoring a dump. Yeah, this patch is a lot more debatable than the other one. I have pushed the first one without changing the error message. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] TODO: Helpful hint from psql on createdb
Hi all Following on from my recent patch about adding a hint to psql when you try to use it to apply a custom-format dump: http://www.postgresql.org/message-id/544096e0.5020...@2ndquadrant.com I think we should do the same for the command line tools, like createdb, createuser, etc. I've seen this often enough: postgres=> createdb test postgres-> postgres-> and it should be solvable with the same approach I used for PGDMP. It's only a couple of hours' work at most, but I'm not going to get to it in a hurry, so let's pop it on the TODO for volunteers. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] ARMv5
Hello, I have a problem with PostgreSQL. I need to install PostgreSQL on ARMv5 with TCP/IP access, but I have no experience in it. I am trying to do it in LTIB, and now I need to create a postgresql.spec file. Could you help me with it? With best regards, Alexandr Glukhov
Re: [HACKERS] Function array_agg(array)
2014-10-24 16:26 GMT+07:00 Pavel Stehule : > Hi > > some in last patch is wrong, I cannot to compile it: > > arrayfuncs.c: In function ‘accumArrayResult’: > arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’ >astate->alen = 64; /* arbitrary starting array size */ > ^ > arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named ‘dvalues’ >astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); > ^ > arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’ >astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); > ^ > arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named ‘dnulls’ >astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); > ^ > arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’ >astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); > ^ > arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named ‘nelems’ >astate->nelems = 0; > ^ > arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named ‘nelems’ >if (astate->nelems >= astate->alen) > ^ > arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’ >if (astate->nelems >= astate->alen) >^ > arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’ > astate->alen *= 2; > Sorry, correct patch attached. This patch is in patience format (git --patience ..). In previous patches, i use context format (git --patience ... | filterdiff --format=context), but it turns out that some modification is lost. -- Ali Akbar diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 7e5bcd9..f59738a 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -12046,6 +12046,22 @@ NULL baz(3 rows) +array_agg + + array_agg(anyarray) + + + any + + + the same array type as input type + + input arrays, aggregated into higher-order multidimesional array. Rejects NULL and empty array as input. 
+ + + + + average diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index 2f0680f..8c182a4 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%'); array --- {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413} + +SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i)); + array +--- + {{1},{2},{3},{4},{5}} (1 row) The subquery must return a single column. The resulting diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 41e973b..0261fcb 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -108,12 +108,16 @@ exprType(const Node *expr) type = exprType((Node *) tent->expr); if (sublink->subLinkType == ARRAY_SUBLINK) { - type = get_array_type(type); - if (!OidIsValid(type)) - ereport(ERROR, - (errcode(ERRCODE_UNDEFINED_OBJECT), - errmsg("could not find array type for data type %s", - format_type_be(exprType((Node *) tent->expr); + if (!OidIsValid(get_element_type(type))) + { + /* not array, so check for its array type */ + type = get_array_type(type); + if (!OidIsValid(type)) +ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("could not find array type for data type %s", +format_type_be(exprType((Node *) tent->expr); + } } } else if (sublink->subLinkType == MULTIEXPR_SUBLINK) @@ -139,12 +143,16 @@ exprType(const Node *expr) type = subplan->firstColType; if (subplan->subLinkType == ARRAY_SUBLINK) { - type = get_array_type(type); - if (!OidIsValid(type)) - ereport(ERROR, - (errcode(ERRCODE_UNDEFINED_OBJECT), - errmsg("could not find array type for data type %s", - format_type_be(subplan->firstColType; + if (!OidIsValid(get_element_type(type))) + { + /* not array, so check for its array type */ + type = get_array_type(type); + if (!OidIsValid(type)) +ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("could not find array type for data type %s", + 
format_type_be(subplan->firstColType; + } } } else if (subplan->subLinkType == MULTIEXPR_SUBLINK) diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c index 3e7dc85..8fc8b49 100644 --- a/src/backend/optimizer/plan/subselect.c +++ b
Re: [HACKERS] Function array_agg(array)
Hi some in last patch is wrong, I cannot to compile it: arrayfuncs.c: In function ‘accumArrayResult’: arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’ astate->alen = 64; /* arbitrary starting array size */ ^ arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named ‘dvalues’ astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); ^ arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’ astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum)); ^ arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named ‘dnulls’ astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); ^ arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’ astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool)); ^ arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named ‘nelems’ astate->nelems = 0; ^ arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named ‘nelems’ if (astate->nelems >= astate->alen) ^ arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’ if (astate->nelems >= astate->alen) ^ arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’ astate->alen *= 2; 2014-10-24 11:24 GMT+02:00 Ali Akbar : > 2014-10-24 15:48 GMT+07:00 Pavel Stehule : > >> Hi >> >> it looks well >> >> doc: >> http://www.postgresql.org/docs/9.4/static/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS >> it should be fixed too >> >> Regards >> >> Pavel >> > > doc updated with additional example for array(subselect). patch attached. > > Regards, > -- > Ali Akbar >
Re: [HACKERS] Function array_agg(array)
2014-10-24 15:48 GMT+07:00 Pavel Stehule : > Hi > > it looks well > > doc: > http://www.postgresql.org/docs/9.4/static/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS > it should be fixed too > > Regards > > Pavel > doc updated with additional example for array(subselect). patch attached. Regards, -- Ali Akbar *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 12046,12051 NULL baz(3 rows) --- 12046,12067 + array_agg + +array_agg(anyarray) + + +any + + +the same array type as input type + + input arrays, aggregated into higher-order multidimesional array. Rejects NULL and empty array as input. + + + + + average *** a/doc/src/sgml/syntax.sgml --- b/doc/src/sgml/syntax.sgml *** *** 2238,2243 SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%'); --- 2238,2248 array --- {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413} + + SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i)); + array + --- + {{1},{2},{3},{4},{5}} (1 row) The subquery must return a single column. The resulting *** a/src/backend/nodes/nodeFuncs.c --- b/src/backend/nodes/nodeFuncs.c *** *** 108,119 exprType(const Node *expr) type = exprType((Node *) tent->expr); if (sublink->subLinkType == ARRAY_SUBLINK) { ! type = get_array_type(type); ! if (!OidIsValid(type)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_OBJECT), ! errmsg("could not find array type for data type %s", ! format_type_be(exprType((Node *) tent->expr); } } else if (sublink->subLinkType == MULTIEXPR_SUBLINK) --- 108,123 type = exprType((Node *) tent->expr); if (sublink->subLinkType == ARRAY_SUBLINK) { ! if (!OidIsValid(get_element_type(type))) ! { ! /* not array, so check for its array type */ ! type = get_array_type(type); ! if (!OidIsValid(type)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_OBJECT), ! errmsg("could not find array type for data type %s", ! format_type_be(exprType((Node *) tent->expr); ! 
} } } else if (sublink->subLinkType == MULTIEXPR_SUBLINK) *** *** 139,150 exprType(const Node *expr) type = subplan->firstColType; if (subplan->subLinkType == ARRAY_SUBLINK) { ! type = get_array_type(type); ! if (!OidIsValid(type)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_OBJECT), ! errmsg("could not find array type for data type %s", ! format_type_be(subplan->firstColType; } } else if (subplan->subLinkType == MULTIEXPR_SUBLINK) --- 143,158 type = subplan->firstColType; if (subplan->subLinkType == ARRAY_SUBLINK) { ! if (!OidIsValid(get_element_type(type))) ! { ! /* not array, so check for its array type */ ! type = get_array_type(type); ! if (!OidIsValid(type)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_OBJECT), ! errmsg("could not find array type for data type %s", ! format_type_be(subplan->firstColType; ! } } } else if (subplan->subLinkType == MULTIEXPR_SUBLINK) *** a/src/backend/optimizer/plan/subselect.c --- b/src/backend/optimizer/plan/subselect.c *** *** 668,677 build_subplan(PlannerInfo *root, Plan *plan, PlannerInfo *subroot, Assert(!te->resjunk); Assert(testexpr == NULL); ! arraytype = get_array_type(exprType((Node *) te->expr)); ! if (!OidIsValid(arraytype)) ! elog(ERROR, "could not find array type for datatype %s", ! format_type_be(exprType((Node *) te->expr))); prm = generate_new_param(root, arraytype, exprTypmod((Node *) te->expr), --- 668,683 Assert(!te->resjunk); Assert(testexpr == NULL); ! ! arraytype = exprType((Node *) te->expr); ! if (!OidIsValid(get_element_type(arraytype))) ! { ! /* not array, so get the array type */ ! arraytype = get_array_type(exprType((Node *) te->expr)); ! if (!OidIsValid(arraytype)) ! elog(ERROR, "could not find array type for datatype %s", ! format_type_be(exprType((Node *) te->expr))); ! 
} prm = generate_new_param(root, arraytype, exprTypmod((Node *) te->expr), *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *** *** 16,22 #include "utils/builtins.h" #include "utils/lsyscache.h" - /*--
Re: [HACKERS] Function array_agg(array)
Hi it looks well doc: http://www.postgresql.org/docs/9.4/static/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS it should be fixed too Regards Pavel 2014-10-24 10:24 GMT+02:00 Ali Akbar : > Updated patch attached. > > 2014-10-22 20:51 GMT+07:00 Pavel Stehule : > >> I agree with your proposal. I have a few comments to design: >> > >> 1. patch doesn't hold documentation and regress tests, please append >> it. >> > > i've added some regression tests in arrays.sql and aggregate.sql. > > I've only found the documentation of array_agg. Where is the doc for > array(subselect) defined? > > >> 2. this functionality (multidimensional aggregation) can be interesting >> more times, so maybe some interface like array builder should be >> preferred. >> > We already have accumArrayResult and > makeArrayResult/makeMdArrayResult, haven't we? > > Actually array_agg(anyarray) can be implemented by using > accumArrayResult and makeMdArrayResult, but in the process we will > deconstruct the array data and null bit-flags into > ArrayBuildState->dvalues > and dnulls. And we will reconstruct it through makeMdArrayResult. I think > this way it's not efficient, so i decided to reimplement it and memcpy the > array data and null flags as-is. > aha, so isn't better to fix a performance for accumArrayResult ? >>> >>> Ok, i'll go this route. While reading the code of array(subselect) as >>> you pointed below, i think the easiest way to implement support for both >>> array_agg(anyarray) and array(subselect) is to change accumArrayResult so >>> it operates both with scalar datum(s) and array datums, with performance >>> optimization for the latter. >>> >> > implemented it by modifying ArrayBuildState to ArrayBuildStateArray and > ArrayBuildStateScalar (do you have any idea for better struct naming?) > > >> In other places, i think it's clearer if we just use accumArrayResult and > makeArrayResult/makeMdArrayResult as API for building (multidimensional) > arrays. > > >> 3. 
array_agg was consistent with array(subselect), so it should be >> fixed too >> >> postgres=# select array_agg(a) from test; >>array_agg >> --- >> {{1,2,3,4},{1,2,3,4}} >> (1 row) >> >> postgres=# select array(select a from test); >> ERROR: could not find array type for data type integer[] >> > > I'm pretty new in postgresql development. Can you point out where is > array(subselect) implemented? > where you can start? postgres=# explain select array(select a from test); ERROR: 42704: could not find array type for data type integer[] LOCATION: exprType, nodeFuncs.c:116 look to code ... your magic keyword is a ARRAY_SUBLINK .. search in postgresql sources this keyword >>> >> attention: probably we don't would to allow arrays everywhere. >> > > I've changed the places where i think it's appropriate. > > >> 4. why you use a magic constant (64) there? >> >> + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes); >> + astate->aitems = 64 * nitems; >> >> + astate->nullbitmap = (bits8 *) >> + repalloc(astate->nullbitmap, (astate->aitems + 7) / >> 8); >> > > just follow the arbitrary size choosen in accumArrayResult > (arrayfuncs.c): > astate->alen = 64; /* arbitrary starting array size */ > > it can be any number not too small and too big. Too small, and we will > realloc shortly. Too big, we will end up wasting memory. > you can try to alloc 1KB instead as start -- it is used more times in Postgres. Then a overhead is max 1KB per agg call - what is acceptable. You take this value from accumArrayResult, but it is targeted for shorted scalars - you should to expect so any array will be much larger. >>> > this patch allocates 1KB (1024 bytes) if the ndatabytes is < 512bytes. If > it is larger, it allocates 4 * size. For nullbitmap, it allocates 4 * > number of items in array. > > Regards, > -- > Ali Akbar > >
Re: [HACKERS] make pg_controldata accept "-D dirname"
On Thu, Sep 25, 2014 at 12:38 PM, Heikki Linnakangas < hlinnakan...@vmware.com> wrote: > On 09/24/2014 05:48 PM, Abhijit Menon-Sen wrote: > >> Updated patches attached. >> > > Thanks, applied some version of these. > This set of patches has been applied as b0d81ad but patch 0001 did not make it in, so pg_controldata is not yet able to accept -D: $ pg_controldata -D /foo/path pg_controldata: could not open file "-D/global/pg_control" for reading: No such file or directory $ pg_controldata /foo/path pg_controldata: could not open file "/foo/path/global/pg_control" for reading: No such file or directory -- Michael
Re: [HACKERS] alter user/role CURRENT_USER
Hi, here is the revised patch. Attached files are the following: - 0001-ALTER-ROLE-CURRENT_USER_v2.patch - the patch. - testset.tar.bz2 - test set. Run by typing 'make check' as a superuser of the running PostgreSQL server. It creates "testdb" and some roles. Documents are not edited this time. Considering your comments, I found more points to modify. - CREATE SCHEMA (IF NOT EXISTS) .. AUTHORIZATION ... - ALTER AGGREGATE/COLLATION/etc... OWNER TO - CREATE/ALTER/DROP USER MAPPING FOR SERVER .. GRANT/REVOKE also takes a role as an argument, but CURRENT_USER and the similar keywords seem to be useless for them. Finally, the new patch modifies the following points. In gram.y, - RoleId's are replaced with RoleId_or_curruser in more places. It accepts CURRENT_USER/USER/CURRENT_ROLE/SESSION_USER. - ALTER USER ALL syntax is added. (not changed from the previous patch) - The non-terminal auth_ident now uses RoleId_or_curruser instead of RoleId. This affects CREATE/ALTER/DROP USER MAPPING In user.c, a new function ResolveRoleId() is added and used for all role ID resolutions that correspond to the syntax changes in the parser, such as AlterRole() in user.c. In foreigncmds.c, GetUserOidFromMapping() is removed and ResolveRoleId is used instead. In alter.c and tablecmds.c, all calls to get_role_oid() corresponding to the grammar change were replaced with ResolveRoleId(). The modifications are a bit complicated, so I provided a comprehensive test set.
regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From f55494a49b6d4c7eb32665ea9cc63634f5000c99 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Tue, 9 Sep 2014 19:26:24 +0900 Subject: [PATCH] ALTER ROLE CURRENT_USER --- src/backend/commands/alter.c | 2 +- src/backend/commands/foreigncmds.c | 25 ++-- src/backend/commands/schemacmds.c | 3 +- src/backend/commands/tablecmds.c | 2 +- src/backend/commands/user.c| 56 ++- src/backend/parser/gram.y | 78 ++ src/include/commands/dbcommands.h | 1 + src/include/commands/user.h| 2 + 8 files changed, 96 insertions(+), 73 deletions(-) diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c index c9a9baf..15f254e 100644 --- a/src/backend/commands/alter.c +++ b/src/backend/commands/alter.c @@ -678,7 +678,7 @@ AlterObjectNamespace_internal(Relation rel, Oid objid, Oid nspOid) Oid ExecAlterOwnerStmt(AlterOwnerStmt *stmt) { - Oid newowner = get_role_oid(stmt->newowner, false); + Oid newowner = ResolveRoleId(stmt->newowner, false, NULL); switch (stmt->objectType) { diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c index ab4ed6c..9878fca 100644 --- a/src/backend/commands/foreigncmds.c +++ b/src/backend/commands/foreigncmds.c @@ -27,6 +27,7 @@ #include "catalog/pg_type.h" #include "catalog/pg_user_mapping.h" #include "commands/defrem.h" +#include "commands/user.h" #include "foreign/fdwapi.h" #include "foreign/foreign.h" #include "miscadmin.h" @@ -198,24 +199,6 @@ transformGenericOptions(Oid catalogId, /* - * Convert the user mapping user name to OID - */ -static Oid -GetUserOidFromMapping(const char *username, bool missing_ok) -{ - if (!username) - /* PUBLIC user mapping */ - return InvalidOid; - - if (strcmp(username, "current_user") == 0) - /* map to the owner */ - return GetUserId(); - - /* map to provided user */ - return get_role_oid(username, missing_ok); -} - -/* * Internal workhorse for changing a data wrapper's owner. 
* * Allow this only for superusers; also the new owner must be a @@ -1099,7 +1082,7 @@ CreateUserMapping(CreateUserMappingStmt *stmt) rel = heap_open(UserMappingRelationId, RowExclusiveLock); - useId = GetUserOidFromMapping(stmt->username, false); + useId = ResolveRoleId(stmt->username, false, NULL); /* Check that the server exists. */ srv = GetForeignServerByName(stmt->servername, false); @@ -1194,7 +1177,7 @@ AlterUserMapping(AlterUserMappingStmt *stmt) rel = heap_open(UserMappingRelationId, RowExclusiveLock); - useId = GetUserOidFromMapping(stmt->username, false); + useId = ResolveRoleId(stmt->username, false, NULL); srv = GetForeignServerByName(stmt->servername, false); umId = GetSysCacheOid2(USERMAPPINGUSERSERVER, @@ -1276,7 +1259,7 @@ RemoveUserMapping(DropUserMappingStmt *stmt) Oid umId; ForeignServer *srv; - useId = GetUserOidFromMapping(stmt->username, stmt->missing_ok); + useId = ResolveRoleId(stmt->username, stmt->missing_ok, NULL); srv = GetForeignServerByName(stmt->servername, true); if (stmt->username && !OidIsValid(useId)) diff --git a/src/backend/commands/schemacmds.c b/src/backend/commands/schemacmds.c index 03f5514..4f97f23 100644 --- a/src/backend/commands/schemacmds.c +++ b/src/backend/commands/schemacmds.c @@ -25,6 +25,7 @@ #include "catalog/pg_namespace.h" #include "commands/dbcommands.h" #include "com
Re: [HACKERS] Function array_agg(array)
Updated patch attached. 2014-10-22 20:51 GMT+07:00 Pavel Stehule : > I agree with your proposal. I have a few comments to design: > > 1. patch doesn't hold documentation and regress tests, please append > it. > i've added some regression tests in arrays.sql and aggregate.sql. I've only found the documentation of array_agg. Where is the doc for array(subselect) defined? > 2. this functionality (multidimensional aggregation) can be interesting > more times, so maybe some interface like array builder should be > preferred. > We already have accumArrayResult and makeArrayResult/makeMdArrayResult, haven't we? Actually array_agg(anyarray) can be implemented by using accumArrayResult and makeMdArrayResult, but in the process we will deconstruct the array data and null bit-flags into ArrayBuildState->dvalues and dnulls. And we will reconstruct it through makeMdArrayResult. I think this way it's not efficient, so i decided to reimplement it and memcpy the array data and null flags as-is. >>> >>> aha, so isn't better to fix a performance for accumArrayResult ? >>> >> >> Ok, i'll go this route. While reading the code of array(subselect) as you >> pointed below, i think the easiest way to implement support for both >> array_agg(anyarray) and array(subselect) is to change accumArrayResult so >> it operates both with scalar datum(s) and array datums, with performance >> optimization for the latter. >> > implemented it by modifying ArrayBuildState to ArrayBuildStateArray and ArrayBuildStateScalar (do you have any idea for better struct naming?) > In other places, i think it's clearer if we just use accumArrayResult and makeArrayResult/makeMdArrayResult as API for building (multidimensional) arrays. > 3. 
array_agg was consistent with array(subselect), so it should be > fixed too > > postgres=# select array_agg(a) from test; >array_agg > --- > {{1,2,3,4},{1,2,3,4}} > (1 row) > > postgres=# select array(select a from test); > ERROR: could not find array type for data type integer[] > I'm pretty new in postgresql development. Can you point out where is array(subselect) implemented? >>> >>> where you can start? >>> >>> postgres=# explain select array(select a from test); >>> ERROR: 42704: could not find array type for data type integer[] >>> LOCATION: exprType, nodeFuncs.c:116 >>> >>> look to code ... your magic keyword is a ARRAY_SUBLINK .. search in >>> postgresql sources this keyword >>> >> > attention: probably we don't would to allow arrays everywhere. > I've changed the places where i think it's appropriate. > 4. why you use a magic constant (64) there? > > + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes); > + astate->aitems = 64 * nitems; > > + astate->nullbitmap = (bits8 *) > + repalloc(astate->nullbitmap, (astate->aitems + 7) / > 8); > just follow the arbitrary size choosen in accumArrayResult (arrayfuncs.c): astate->alen = 64; /* arbitrary starting array size */ it can be any number not too small and too big. Too small, and we will realloc shortly. Too big, we will end up wasting memory. >>> >>> you can try to alloc 1KB instead as start -- it is used more times in >>> Postgres. Then a overhead is max 1KB per agg call - what is acceptable. >>> >>> You take this value from accumArrayResult, but it is targeted for >>> shorted scalars - you should to expect so any array will be much larger. >>> >> this patch allocates 1KB (1024 bytes) if the ndatabytes is < 512bytes. If it is larger, it allocates 4 * size. For nullbitmap, it allocates 4 * number of items in array. 
Regards, -- Ali Akbar *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 12046,12051 NULL baz(3 rows) --- 12046,12067 + array_agg + +array_agg(anyarray) + + +any + + +the same array type as input type + + input arrays, aggregated into higher-order multidimesional array. Rejects NULL and empty array as input. + + + + + average *** a/src/backend/nodes/nodeFuncs.c --- b/src/backend/nodes/nodeFuncs.c *** *** 108,119 exprType(const Node *expr) type = exprType((Node *) tent->expr); if (sublink->subLinkType == ARRAY_SUBLINK) { ! type = get_array_type(type); ! if (!OidIsValid(type)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_OBJECT), ! errmsg("could not find array type for data type %s", ! format_type_be(exprType((Node *) tent->expr); } } else if (sublink->subLinkType == MULTIEXPR_SUBLINK) --- 108,123 t
[HACKERS] Re: Getting rid of "accept incoming network connections" prompts on OS X
One of the queries in ri_triggers.c has me a little baffled. For (relatively) obvious reasons, a FK insert triggers a SELECT 1 FROM pk_rel ... FOR KEY SHARE. For not-so-obvious reasons, a PK delete triggers a SELECT 1 FROM fk_rel ... FOR KEY SHARE. I can't see what the lock on fk_rel achieves. Both operations are already contending for the lock on the PK row, which seems like enough to cover every eventuality. And even if the lock serves a purpose, KEY SHARE is an odd choice, since the referencing field is, in general, not a "key" in this sense. - aaa
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2
On Wed, Oct 22, 2014 at 7:12 PM, Andres Freund wrote: > > On 2014-10-22 13:32:07 +0530, Amit Kapila wrote: > > On Tue, Oct 21, 2014 at 7:56 PM, Amit Kapila > > wrote: > > > On Wed, Oct 8, 2014 at 6:17 PM, Andres Freund > > wrote: > > > > > > > > On 2014-06-25 19:06:32 +0530, Amit Kapila wrote: > > > > Today, I have verified all previous comments raised by > > me and looked at the new code, and below are my findings: > > > > >> > > >> 4. > > >> LWLockAcquireCommon() > > >> { > > >> .. > > >> if (!LWLockDequeueSelf(l)) > > >> { > > >> for (;;) > > >> { > > >> PGSemaphoreLock(&proc->sem, false); > > >> if (!proc->lwWaiting) > > >> break; > > >> extraWaits++; > > >> } > > >> lock->releaseOK = true; > > >> .. > > >> } > > >> > > >> Setting releaseOK in above context might not be required because if the > > >> control comes in this part of code, it will not retry to acquire another > > >> time. > > > > > Hm. You're probably right. > > > > You have agreed to fix this comment, but it seems you have forgotten > > to change it. > > After I've thought more about it, it's actually required. This > essentially *is* a retry. Won't it need to be set before the retry? What's the use of setting it when we have got the lock and are not going to retry? > Somebody woke us up, which is where releaseOK is supposed to be set. I think that is true only in the case when we are going to retry again, or at least that seems to be the mechanism currently used in LWLockAcquireCommon. > > > >> 11. > > >> LWLockRelease() > > >> { > > >> .. > > >> PRINT_LWDEBUG("LWLockRelease", lock, mode); > > >> } > > >> > > >> Shouldn't this be at the beginning of the LWLockRelease function rather than > > >> after processing the held_lwlocks array? > > > > > Ok. > > > > You have agreed to fix this comment, but it seems you have forgotten > > to change it. > > > > > Below comment doesn't seem to be addressed? > > > > > LWLockAcquireOrWait() > > > { > > > .. > > > LOG_LWDEBUG("LWLockAcquireOrWait", lockid, mode, "succeeded"); > > > ..
> > > } > > > > > a. such a log is not there in any other LWLock.. variants, > > > if we want to introduce it, then shouldn't it be done at > > > other places as well. > > I think you're placing unnecessarily high consistency constraints on a > debugging feature here. This was just a very minor suggestion to keep the code consistent, which is okay to ignore if you want; I understand that code consistency for this doesn't really matter. > > Below point is yet to be resolved. > > > > > > 12. > > > > #ifdef LWLOCK_DEBUG > > > > lock->owner = MyProc; > > > > #endif > > > > > > > > Shouldn't it be reset in LWLockRelease? > > > > > > That's actually intentional. It's quite useful to know the last owner > > > when debugging lwlock code. > > > > Won't it cause any problem if the last owner process exits? > > No. PGPROCs aren't deallocated or anything. And it's a debugging only > variable. That's right; the problem I was thinking of is wrong information, e.g. if the process holding the Exclusive lock has exited, then a lot of other processes take shared locks and a new Exclusive locker waits on the lock; at that moment, during debugging, we can get wrong information about the lock owner. However, I think you are mainly worried about situations where many backends are waiting for the Exclusive locker, which is probably the most common scenario. > > Can you explain how pg_read_barrier() in below code makes this > > access safe? > > > > LWLockWakeup() > > { > > .. > > + pg_read_barrier(); /* pairs with nwaiters-- */ > > + if (!BOOL_ACCESS_ONCE(lock->releaseOK)) > > .. > > } > > What's the concern you have? Full memory barriers (the atomic > nwaiters--) pair with read memory barriers. IIUC, pairing with nwaiters in LWLockAcquireCommon() ensures that releaseOK is set before again attempting the lock, as the atomic operation provides the necessary barrier. The point I am not getting is what kind of guarantee pg_read_barrier() provides us, or why it is required?
With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X
On Fri, Oct 24, 2014 at 8:26 AM, Dave Page wrote: > On Fri, Oct 24, 2014 at 7:18 AM, Peter Eisentraut wrote: > > On 10/21/14 1:16 PM, Tom Lane wrote: > >> If you do any Postgres development on OS X, you've probably gotten > >> seriously annoyed by the way that, every single time you reinstall the > >> postmaster executable, you get a dialog box asking whether you'd like > >> to allow it to accept incoming network connections. > > > > I used to, but somehow I don't see this anymore. Just to be sure, I > > made sure the firewall is on, checked that postgres is not in the > > exception list, rebooted, built postgresql from scratch, ran make check, > > but no pop-up. > > > > I'm on Yosemite. Maybe this was changed. > > I've never seen it on any version of OS X (I've worked my way from > Panther to Yosemite). There must be more to it... > FWIW, with the firewall on, I am used to seeing this annoying popup window when starting an instance manually; make check never complains though. -- Michael