Re: [HACKERS] PG 10 release notes
On 4/24/17 8:31 PM, Bruce Momjian wrote:
> I have committed the first draft of the Postgres 10 release notes. They
> are current as of two days ago, and I will keep them current. Please
> give me any feedback you have. The only unusual thing is that this
> release has ~180 items while most recent releases have had ~220. The
> pattern I see is that there are more large features in this release
> than previous ones.

Can you change the attribution on "Allow PL/Tcl functions to return composite types and sets" to Karl Lehenbauer? He actually wrote the original patch; I just helped get it through the community (something that FlightAware paid for). I didn't realize at the time that you could change the listed Author in the commitfest.
-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell)            http://jim.nasby.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 4/6/17 9:21 PM, Andres Freund wrote:
> Personally I'm way more excited about what a SPI feature like this
> could do for plpgsql than about what it can do for plpython. If the
> latter is what floats your boat, that's fine; but I want a feature that
> we can build on for other uses, not a hack that we know we need to
> redesign next month.

Yeah, I thought about plpgsql and I can't see any way to make that work through an SPI callback (perhaps just due to my ignorance on things C). I suspect what plpgsql actually wants is a way to tell SPI to start the executor up, a function that pulls individual tuples out of the executor, and then a function to shut the executor down.

> Dislike of the proposed implementation, alternative proposals, and the
> refutation of the "absolutely no way to do more without breaking plpy"
> argument leads me to conclude that this should be returned with
> feedback.

Agreed.
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On Apr 6, 2017, at 9:10 PM, Andres Freund wrote:
>>> Why? We could very well return a somewhat "smarter" object. Returning
>>> rows row-by-row if accessed via iterator, materializes when accessed
>>> via row offset.
>>
>> I completely agree with that. What I don't understand is the objection
>> to speeding up the old access method. Or are you thinking we'd just
>> abandon the old method?
>
> What I'm saying is that we can do that transparently, with the current
> API. And there's no need to materialize anything in plpython; we can
> transparently use the SPI materialized version.

Oh, just switching from a list to an iterator. Ok, I finally get it.
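[Editorial note: the "smarter" result object Andres describes above can be sketched in plain Python. This is an illustration only, not plpy's actual implementation; all names here are hypothetical. Rows stream lazily under iteration, and the full list is materialized only on offset access.]

```python
class LazyResult:
    """Streams rows when iterated; materializes only on indexed access."""

    def __init__(self, row_source):
        # row_source: any iterable producing rows (stand-in for SPI output)
        self._source = iter(row_source)
        self._rows = None          # materialized list, built on demand

    def __iter__(self):
        if self._rows is not None:
            return iter(self._rows)
        return self._source        # stream row-by-row, no materialization

    def _materialize(self):
        if self._rows is None:
            self._rows = list(self._source)

    def __getitem__(self, i):
        self._materialize()        # offset access forces materialization
        return self._rows[i]

    def __len__(self):
        self._materialize()
        return len(self._rows)
```

Because the object satisfies both the iterator protocol and the sequence protocol, existing code that indexes the result keeps working, while code that only loops over it never pays for a full list.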
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 4/6/17 9:04 PM, Andres Freund wrote:
> On 2017-04-06 09:14:43 -0700, Jim Nasby wrote:
>> On 4/6/17 9:04 AM, Peter Eisentraut wrote:
>>> On 4/6/17 03:50, Craig Ringer wrote:
>>>> But otherwise, pending docs changes, I think it's ready for
>>>> committer.
>>>
>>> My opinion is still that this is ultimately the wrong approach. The
>>> right fix for performance issues in PL/Python is to change PL/Python
>>> not to materialize the list of tuples. Now with this change we would
>>> be moving from two result materializations to one, but I think we are
>>> keeping the wrong one.
>>
>> That's an option for future improvement, but I see no way to
>> accomplish that without completely breaking plpy.
>
> Why? We could very well return a somewhat "smarter" object. Returning
> rows row-by-row if accessed via iterator, materializes when accessed via
> row offset.

I completely agree with that. What I don't understand is the objection to speeding up the old access method. Or are you thinking we'd just abandon the old method?
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 4/6/17 8:13 PM, Tom Lane wrote:
>> It's on the pointy end for Pg10, and I thought we'd be fine to include
>> this in pg10 then aim to clean up DestReceiver in early pg11, or even
>> as a post-feature-freeze refactoring fixup in pg10. Should the
>> callback approach be blocked because the API it has to use is a bit
>> ugly?
>
> Given Peter's objections, I don't think this is getting into v10 anyway,
> so we might as well take a bit more time and do it right.

Well, Peter's objection is that we're not going far enough in plpython, but there's absolutely no way to do more without breaking plpy, which seems a non-starter. We should certainly be able to expand the existing API to provide even more benefit, but I see no reason to leave the performance gain this patch provides on the floor just because there's more to be had with a different API.

> Also, I'm entirely -1 on "post-feature-freeze refactoring fixups".
> We're going to have more than enough to do trying to stabilize the
> existing committed code, I fear (cf Robert's pessimistic summary of the
> open-items list, a couple days ago). We don't need to be planning on
> doing new design post-freeze, whether it's painted as mere refactoring
> or not.

Agreed, and I agree that the current patch is a bit of a hack when it comes to DestReceiver (or really, DestReceiver has become an ugly wart over the years, as you pointed out). I'll plan to pick this up again once the dust settles on this commitfest.
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 4/6/17 9:04 AM, Peter Eisentraut wrote:
> On 4/6/17 03:50, Craig Ringer wrote:
>> But otherwise, pending docs changes, I think it's ready for committer.
>
> My opinion is still that this is ultimately the wrong approach. The
> right fix for performance issues in PL/Python is to change PL/Python
> not to materialize the list of tuples. Now with this change we would be
> moving from two result materializations to one, but I think we are
> keeping the wrong one.

That's an option for future improvement, but I see no way to accomplish that without completely breaking plpy. I think the best way to handle this would be to allow plpython functions to define their own callback function, which would be handed a python tuple that was translated from the SPI result tuple. How best to do that without breaking plpy will require some thought though.

In the meantime, I don't think a 27% performance gain is anything to sneeze at, and the SPI changes would be directly usable by pl/r and pl/tcl.
-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell)            http://jim.nasby.net
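[Editorial note: a minimal sketch of what a PL-side consumer of the proposed API might look like. SPI_execute_callback() is the function proposed by the patch in this thread, not released SPI; the my_* names are hypothetical, and this fragment only compiles inside the PostgreSQL backend, so it is a sketch rather than a working example. The DestReceiver callback shape follows src/include/tcop/dest.h.]

```c
#include "postgres.h"
#include "tcop/dest.h"
#include "executor/spi.h"

typedef struct MyReceiver
{
    DestReceiver pub;           /* standard DestReceiver must come first */
    uint64       nrows;         /* per-query state follows */
} MyReceiver;

/* Called once per tuple; translate the slot instead of storing it */
static bool
my_receive_slot(TupleTableSlot *slot, DestReceiver *self)
{
    MyReceiver *r = (MyReceiver *) self;

    r->nrows++;
    /* a PL would convert the slot's values to its own datums here */
    return true;                /* returning false aborts execution */
}

static void my_startup(DestReceiver *self, int operation, TupleDesc typeinfo) {}
static void my_shutdown(DestReceiver *self) {}
static void my_destroy(DestReceiver *self) { pfree(self); }

/* Run a query, streaming tuples through the callback (no tuplestore) */
static void
run_with_callback(const char *sql)
{
    MyReceiver *r = (MyReceiver *) palloc0(sizeof(MyReceiver));

    r->pub.receiveSlot = my_receive_slot;
    r->pub.rStartup = my_startup;
    r->pub.rShutdown = my_shutdown;
    r->pub.rDestroy = my_destroy;
    r->pub.mydest = DestNone;

    SPI_execute_callback(sql, true /* read_only */, 0 /* tcount */,
                         (DestReceiver *) r);
}
```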
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 4/5/17 9:08 PM, Craig Ringer wrote:
> ... which I can't reproduce now. Even though I cleared ccache and "git
> reset -fdx" before I ran the above and got the crash.

Glad to hear that, since I can't repro at all. :)

> Assume it's a local system peculiarity. If I can reproduce again I'll
> dig into it.

Sounds good. Thanks!
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 4/5/17 7:44 PM, Jim Nasby wrote:
> Updated patches attached, but I still need to update the docs.

Attached is a complete series of patches that includes the docs patch. Right now, the docs don't include a concrete example, because adding one would be pretty large if it demonstrated real usage, which presumably means Yet Another Contrib Module strictly for the purpose of demonstrating something. Rather than doing that, ISTM it'd be better to point the user at what plpythonu is doing. Another option would be to have a very simple example that only uses *receiveSlot, but that seems rather pointless to me.
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com

From 0a2ef661f55a047763a43b0eebd7483760e4a427 Mon Sep 17 00:00:00 2001
From: Jim Nasby <jim.na...@bluetreble.com>
Date: Wed, 5 Apr 2017 20:52:39 -0500
Subject: [PATCH 1/3] Add SPI_execute_callback

Instead of placing results in a tuplestore, this method of execution
uses the supplied callback when creating the Portal for a query.
---
 src/backend/executor/spi.c | 80 ++++++++++++++++++++++++++++++++++++------
 src/backend/tcop/dest.c    | 15 +++++++++
 src/include/executor/spi.h |  4 +++
 src/include/tcop/dest.h    |  1 +
 4 files changed, 90 insertions(+), 10 deletions(-)

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index ca547dc6d9..4f6c3011f9 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan);
 static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				 Snapshot snapshot, Snapshot crosscheck_snapshot,
-				 bool read_only, bool fire_triggers, uint64 tcount);
+				 bool read_only, bool fire_triggers, uint64 tcount,
+				 DestReceiver *callback);
 static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
 					Datum *Values, const char *Nulls);
@@ -321,7 +322,35 @@ SPI_execute(const char *src, bool read_only, long tcount)
 	res = _SPI_execute_plan(&plan, NULL,
 							InvalidSnapshot, InvalidSnapshot,
-							read_only, true, tcount);
+							read_only, true, tcount, NULL);
+
+	_SPI_end_call(true);
+	return res;
+}
+
+int
+SPI_execute_callback(const char *src, bool read_only, long tcount,
+					 DestReceiver *callback)
+{
+	_SPI_plan	plan;
+	int			res;
+
+	if (src == NULL || tcount < 0)
+		return SPI_ERROR_ARGUMENT;
+
+	res = _SPI_begin_call(true);
+	if (res < 0)
+		return res;
+
+	memset(&plan, 0, sizeof(_SPI_plan));
+	plan.magic = _SPI_PLAN_MAGIC;
+	plan.cursor_options = 0;
+
+	_SPI_prepare_oneshot_plan(src, &plan);
+
+	res = _SPI_execute_plan(&plan, NULL,
+							InvalidSnapshot, InvalidSnapshot,
+							read_only, true, tcount, callback);
 
 	_SPI_end_call(true);
 	return res;
@@ -355,7 +384,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const char *Nulls,
 							_SPI_convert_params(plan->nargs, plan->argtypes,
 												Values, Nulls),
 							InvalidSnapshot, InvalidSnapshot,
-							read_only, true, tcount);
+							read_only, true, tcount, NULL);
+
+	_SPI_end_call(true);
+	return res;
+}
+
+/* Execute a previously prepared plan with a callback */
+int
+SPI_execute_plan_callback(SPIPlanPtr plan, Datum *Values, const char *Nulls,
+						  bool read_only, long tcount, DestReceiver *callback)
+{
+	int			res;
+
+	if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0)
+		return SPI_ERROR_ARGUMENT;
+
+	if (plan->nargs > 0 && Values == NULL)
+		return SPI_ERROR_PARAM;
+
+	res = _SPI_begin_call(true);
+	if (res < 0)
+		return res;
+
+	res = _SPI_execute_plan(plan,
+							_SPI_convert_params(plan->nargs, plan->argtypes,
+												Values, Nulls),
+							InvalidSnapshot, InvalidSnapshot,
+							read_only, true,
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
ferences has its OID changed then the tupledesc will be invalid. I'm
 * not sure it's worth worrying about that though. */

Updated patches attached, but I still need to update the docs.
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com

From 0a2ef661f55a047763a43b0eebd7483760e4a427 Mon Sep 17 00:00:00 2001
From: Jim Nasby <jim.na...@bluetreble.com>
Date: Wed, 5 Apr 2017 20:52:39 -0500
Subject: [PATCH 1/2] Add SPI_execute_callback

Instead of placing results in a tuplestore, this method of execution
uses the supplied callback when creating the Portal for a query.
---
 src/backend/executor/spi.c | 80 ++++++++++++++++++++++++++++++++++++------
 src/backend/tcop/dest.c    | 15 +++++++++
 src/include/executor/spi.h |  4 +++
 src/include/tcop/dest.h    |  1 +
 4 files changed, 90 insertions(+), 10 deletions(-)

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index ca547dc6d9..4f6c3011f9 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan);
 static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				 Snapshot snapshot, Snapshot crosscheck_snapshot,
-				 bool read_only, bool fire_triggers, uint64 tcount);
+				 bool read_only, bool fire_triggers, uint64 tcount,
+				 DestReceiver *callback);
 static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
 					Datum *Values, const char *Nulls);
@@ -321,7 +322,35 @@ SPI_execute(const char *src, bool read_only, long tcount)
 	res = _SPI_execute_plan(&plan, NULL,
 							InvalidSnapshot, InvalidSnapshot,
-							read_only, true, tcount);
+							read_only, true, tcount, NULL);
+
+	_SPI_end_call(true);
+	return res;
+}
+
+int
+SPI_execute_callback(const char *src, bool read_only, long tcount,
+					 DestReceiver *callback)
+{
+	_SPI_plan	plan;
+	int			res;
+
+	if (src == NULL || tcount < 0)
+		return SPI_ERROR_ARGUMENT;
+
+	res = _SPI_begin_call(true);
+	if (res < 0)
+		return res;
+
+	memset(&plan, 0, sizeof(_SPI_plan));
+	plan.magic = _SPI_PLAN_MAGIC;
+	plan.cursor_options = 0;
+
+	_SPI_prepare_oneshot_plan(src, &plan);
+
+	res = _SPI_execute_plan(&plan, NULL,
+							InvalidSnapshot, InvalidSnapshot,
+							read_only, true, tcount, callback);
 
 	_SPI_end_call(true);
 	return res;
@@ -355,7 +384,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const char *Nulls,
 							_SPI_convert_params(plan->nargs, plan->argtypes,
 												Values, Nulls),
 							InvalidSnapshot, InvalidSnapshot,
-							read_only, true, tcount);
+							read_only, true, tcount, NULL);
+
+	_SPI_end_call(true);
+	return res;
+}
+
+/* Execute a previously prepared plan with a callback */
+int
+SPI_execute_plan_callback(SPIPlanPtr plan, Datum *Values, const char *Nulls,
+						  bool read_only, long tcount, DestReceiver *callback)
+{
+	int			res;
+
+	if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0)
+		return SPI_ERROR_ARGUMENT;
+
+	if (plan->nargs > 0 && Values == NULL)
+		return SPI_ERROR_PARAM;
+
+	res = _SPI_begin_call(true);
+	if (res < 0)
+		return res;
+
+	res = _SPI_execute_plan(plan,
+							_SPI_convert_params(plan->nargs, plan->argtypes,
+												Values, Nulls),
+							InvalidSnapshot, InvalidSnapshot,
+							read_only, true, tcount, callback);
 
 	_SPI_end_call(true);
 	return res;
@@ -384,7 +440,7 @@ SPI_execute_plan_with_paramlist(SPIPlanPtr plan, ParamListInfo params,
 	res = _SPI_execute_plan(plan, params,
 							InvalidSnapshot, InvalidSnapshot,
-							read_only, true, tcount);
+
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 4/4/17 7:44 PM, Craig Ringer wrote:
> The patch crashes in initdb with --enable-cassert builds:

Thanks for the review! I'll get to the rest of it in a bit, but I'm unable to reproduce the initdb failure. I looked at the assert line and I don't see anything obvious either. :/ Can you send your full configure call? uname -a? Mine is:

./configure --with-includes=/opt/local/include --with-libraries=/opt/local/lib --enable-debug --with-libxml --with-tcl --with-perl --with-python --enable-depend --enable-dtrace --enable-tap-tests --prefix=/Users/decibel/pgsql/HEAD/i/i --with-pgport=$PGC_PORT -C --enable-cassert --enable-debug CFLAGS='-ggdb -O0 -fno-omit-frame-pointer'

Darwin decina.local 15.6.0 Darwin Kernel Version 15.6.0: Mon Jan 9 23:07:29 PST 2017; root:xnu-3248.60.11.2.1~1/RELEASE_X86_64 x86_64
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
[HACKERS] Missing increment of vacrelstats->pinskipped_pages
lazy_vacuum_heap() does not count pages that it skips due to not obtaining the buffer cleanup lock. vacuum_pinskipped.patch fixes that. That should be backpatched to 9.5.

vacuum_comment.patch cleans up a comment in lazy_scan_heap().
-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell)            http://jim.nasby.net

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 5b43a66bdc..6f7a5b4818 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -530,9 +530,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	 * safely set for relfrozenxid or relminmxid.
 	 *
 	 * Before entering the main loop, establish the invariant that
-	 * next_unskippable_block is the next block number >= blkno that's not we
-	 * can't skip based on the visibility map, either all-visible for a
-	 * regular scan or all-frozen for an aggressive scan. We set it to
+	 * next_unskippable_block is the next block number >= blkno that we
+	 * can't skip based on the visibility map (either all-visible for a
+	 * regular scan or all-frozen for an aggressive scan). We set it to
 	 * nblocks if there's no such block. We also set up the skipping_blocks
 	 * flag correctly at this stage.
 	 *
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 5b43a66bdc..5a5a4ba48b 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -1404,6 +1404,7 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)
 		if (!ConditionalLockBufferForCleanup(buf))
 		{
 			ReleaseBuffer(buf);
+			vacrelstats->pinskipped_pages++;
 			++tupindex;
 			continue;
 		}
Re: [HACKERS] PL/Python: Add cursor and execute methods to plan object
On 2/25/17 10:27 AM, Peter Eisentraut wrote:
> So I'm also wondering here which style people prefer so I can implement
> it there.

I think the more OO style is definitely better. I expect it would simplify the code as well.
-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell)            http://jim.nasby.net
Re: [HACKERS] Privilege checks on array coercions
On 3/23/17 12:37 PM, Andres Freund wrote:
> On 2017-03-23 15:26:51 -0400, Tom Lane wrote:
>> There is a test in privileges.sql (currently lines 589-625 in
>> privileges.out) that seems to be dependent on the fact that the
>> ArrayCoerceExpr logic doesn't check for EXECUTE privilege on the
>> per-element type coercion function if it's dealing with a NULL input
>> array. ... Does anyone want to defend this privileges test case as
>> testing for some behavior that users expect?
>
> Not me - that seems quite sensible to change.

I'd even argue that existing behavior is a bug.
-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell)            http://jim.nasby.net
Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)
On 3/19/17 2:32 PM, Tom Lane wrote:
> Jim Nasby <jim.na...@openscg.com> writes:
>> Having thought about it, I share Tom's concerns. And now I'm worried
>> about what happens if there are multiple separate OR clauses. I guess
>> those would be handled by separate UNIONs?
>
> As proposed, the patch would try to optimize by splitting each OR
> clause independently, and would choose whichever way gave the best cost
> estimate. I'm not sure it's possible to do better than that, and even
> if it is, I think improving it could be left for later.

Agreed.

> I'd also considered an approach of de-duping on the basis of all
> relation ctids, while allowing a rel's ctid to be returned as NULL from
> a UNION arm in which the rel was eliminated entirely. But that doesn't
> fix it, because in this example the first arm would return (a.ctid,
> NULL) while the second arm would return (NULL, b.ctid), so that the
> UNION step would still fail to detect any duplication. To make it
> work, we'd have to not eliminate the joins, which would pretty much
> defeat the usefulness of the optimization for your original example
> case.

It might still be worthwhile in some circumstances. In your example, if there were these indexes:

a__id ON a(id), a__x ON a(x)
b__id ON b(id), b__y ON b(y)

then it might be faster to nested-loop a__x=42 to b__id=a.id and union that with b__y=43 nested to a__id=b.id. That said, now isn't the time to be adding more complexity.

> So full joins definitely break this whole optimization. I think it's
> okay with left joins though, because when starting from "a left join b"
> it will never be possible to remove "a", so we'll always include a.ctid
> in the UNION de-duping step. If b is removed in some arm, then it must
> be true that we get exactly one left-join output row per a row,
> regardless of the contents of b, in that arm. The argument for the
> patch being okay is essentially that we must get exactly one left-join
> output row per a row, regardless of the contents of b, in *every* arm,
> because the various modified versions of the OR clause can't affect
> that conclusion. In some of the arms we might not remove b, and we
> might even be able to reduce the left join to an inner join, but there
> should still be no more than one join output row per a row. That being
> the case, it should be sufficient to de-dup using a.ctid while ignoring
> b.ctid.

The only case I can think of is: would it be possible (perhaps not today; maybe in the future) for other parts of the query to affect join elimination? I can't conceive of how that might happen, but if it did then it's possible that the elimination would work differently with the UNION than it would with an OR. The comment on join_is_removable() does mention that there are other potentially interesting cases that we can't handle now; it's maybe worth mentioning. Other than that, I can't see any issues with your logic.

> Any clearer yet?

Absolutely. I think it would be very valuable to include that with the initial comment in planunionor.c. Join reduction and removal is already tricky enough to wrap your head around.

Other comments:

+ * is retty mechanical, but we can't do it until we have a RelOptInfo for the

s/retty/pretty/

I suspect that in many systems single-table queries are far more common than CTEs, so maybe it's worth reversing those two tests in split_join_or_clauses().

For the Unique path, it would be nice if the code did what would be necessary to consider a TID hash join, since that means a user could create the appropriate operator and it would just be picked up. Certainly not worth much effort at this point though.

+	/*
+	 * Must not have any volatile functions in FROM or WHERE (see notes at
+	 * head of file).
+	 */
+	if (contain_volatile_functions((Node *) parse->jointree))

Is there by chance anywhere else that needs to check that? Maybe worth adding the info to the Query struct if so.

+ * We insist that all baserels used in the query be plain relations, so

Dumb question... views have already been decomposed at this point, right?

Perhaps less dumb question... what happens if the original query already had setOps? AIUI setOps work has already been done by the time split_join_or_clauses() happens; I just want to check that that's OK.

I'm not sure a GUC is worth it... I suspect that any query with multiple rels and an OR condition is going to be expensive enough that whatever additional plan time there is won't be noticeable.

I've verified that the patch still applies and make check-world is clean.
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
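[Editorial note: the OR-to-UNION transform under discussion can be demonstrated outside PostgreSQL. The sketch below uses SQLite, with rowid standing in for ctid; the schema and values are made up. Note that UNION de-duplicates whole result rows, which is only safe here because the join produces at most one match per row of "a" — exactly the one-output-row-per-a-row condition Tom's argument relies on.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a(id INTEGER, x INTEGER);
    CREATE TABLE b(id INTEGER, y INTEGER);
    INSERT INTO a VALUES (1, 42), (2, 7), (3, 42);
    INSERT INTO b VALUES (1, 43), (2, 43), (3, 9);
""")

# Original form: a join qualified by an OR referencing both rels.
or_form = conn.execute("""
    SELECT a.id FROM a JOIN b ON a.id = b.id
    WHERE a.x = 42 OR b.y = 43
    ORDER BY 1
""").fetchall()

# Transformed form: one UNION arm per OR branch; UNION's de-duplication
# plays the role the patch assigns to de-duping on the surviving ctids.
union_form = conn.execute("""
    SELECT a.id FROM a JOIN b ON a.id = b.id WHERE a.x = 42
    UNION
    SELECT a.id FROM a JOIN b ON a.id = b.id WHERE b.y = 43
    ORDER BY 1
""").fetchall()

assert or_form == union_form
```

The payoff in PostgreSQL is that each UNION arm has a simple, indexable restriction instead of a join-spanning OR, so each arm can use a different index.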
Re: [HACKERS] PinBuffer() no longer makes use of strategy
On 3/16/17 12:48 PM, David Steele wrote:
> This patch looks pretty straight forward and applies cleanly and
> compiles at cccbdde. It's not a straight revert, though, so still seems
> to need review. Jim, do you know when you'll have a chance to look at
> that?

Yes. Compiles and passes for me as well.

One minor point: previously the code did

if (buf->usage_count < BM_MAX_USAGE_COUNT)

but now it does

if (BUF_STATE_GET_USAGECOUNT(buf_state) != BM_MAX_USAGE_COUNT)

Being prone to paranoia, I prefer the first, but I've seen both styles in the code so I don't know if it's worth futzing with.

Marked as RFC.
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
[HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)
On 3/16/17 11:54 AM, David Steele wrote:
> On 2/14/17 4:03 PM, Tom Lane wrote:
>> Jim Nasby <jim.na...@bluetreble.com> writes:
>>> On 2/14/17 1:18 PM, Tom Lane wrote:
>>>> One point that could use further review is whether the
>>>> de-duplication algorithm is actually correct. I'm only about 95%
>>>> convinced by the argument I wrote in planunionor.c's header comment.
>>>
>>> I'll put some thought into it and see if I can find any holes. Are
>>> you only worried about the removal of "useless" rels or is there
>>> more?
>>
>> Well, the key point is whether it's really OK to de-dup on the basis
>> of only the CTIDs that are not eliminated in any UNION arm. I was
>> feeling fairly good about that until I thought of the full-join-to-
>> left-join-to-no-join conversion issue mentioned in the comment. Now
>> I'm wondering if there are other holes; or maybe I'm wrong about that
>> one and it's not necessary to be afraid of full joins.
>
> This patch applies cleanly (with offsets) and compiles at cccbdde.
> Jim, have you had time to think about this? Any insights?

Having thought about it, I share Tom's concerns. And now I'm worried about what happens if there are multiple separate OR clauses. I guess those would be handled by separate UNIONs?

I'm also finding it a bit hard to follow the comment Tom mentioned. I'm pretty sure I understand what's going on until this part:

+ * The identical proof can be expected to apply
+ * in other arms, except in an arm that references that rel in its version
+ * of the OR clause. But in such an arm, we have effectively added a
+ * restriction clause to what is known in other arms, which means that the
+ * set of rows output by that rel can't increase compared to other arms.

AIUI, this is describing a case something like this:

SELECT child.blah
FROM child LEFT JOIN parent USING (parent_id)
WHERE child.foo AND (child.baz=1 OR child.baz=2)

given that parent.parent_id is unique. Except that for these concerns to apply, there would need to be a complex OR somewhere in here that sometimes referenced parent and sometimes didn't, such as

WHERE child.foo AND (child.baz=1 OR parent.foo=3)

But I'm not following the logic here (very possibly because I'm wrong about what I said above):

+ * Therefore the situation in such an arm must be that including the rel
+ * could result in either zero or one output row, rather than exactly one
+ * output row as in other arms. So we still don't need to consider it for
+ * de-duplication.

I'm definitely not certain about the rest of it. Tom, could you expand the description some, especially with some examples?
-- 
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
[HACKERS] Split conditions on relations
I've got a customer that is running a pretty expensive function as part of a WHERE clause. With or without the function, the table the function references is the innermost of a series of nested loops. Without the function things are very fast, but adding the function increases the cost of the index scan on that table by a factor of ~80x. It also falsely skews the row estimate further down, causing a bad shift to materialization in another part of the query, but that's a different problem.

Wrapping the majority of the query in an OFFSET 0 with the function call on the outside makes things fast again. It'd be nice if function execution could be delayed to a higher level of a query based on the cost.
-- 
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com
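[Editorial note: a hypothetical illustration of the OFFSET 0 workaround described above. The table and function names are made up; OFFSET 0 in a subquery acts as a planner optimization fence in PostgreSQL, preventing quals from being pushed down into the subquery.]

```sql
-- Without the fence, expensive_fn() gets pushed down into the innermost
-- index scan; with it, the function runs only on rows surviving the joins.
SELECT *
FROM (
    SELECT t1.*, t2.col
    FROM t1
    JOIN t2 USING (id)
    WHERE t1.something = 5
    OFFSET 0   -- optimization fence
) ss
WHERE expensive_fn(ss.col);
```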
Re: [HACKERS] How to get the 'ctid' from a record type?
On 3/10/17 10:31 PM, Eric Ridge wrote:
> What about this? Is the tuple currently being evaluated (I suppose in
> the case of a sequential scan) available in the context of a function
> call?

AFAIK that *very* specific case would work, because the executor would be handing you the raw tuple. Not a great bet to make though. Also, there should be a macro somewhere that will tell you whether you have a full tuple or not. You'd want to make sure to check that and throw an error if you weren't handed a full tuple.
-- 
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com
[HACKERS] INSERT INTO arr2(array[1].d, array[2].d)
Over in [1], I was very surprised to discover $SUBJECT[2]. I looked in the docs, and they clearly indicate that INSERT accepts "column names". What's the best way to describe this? "column expression"? "field expression"?

1: https://www.postgresql.org/message-id/20170311005810.kuccp7t5t5jhe...@alap3.anarazel.de
2:
CREATE TABLE arr(d int[]);
CREATE TABLE arr2(arr arr);
INSERT INTO arr2(arr[1].d, arr[2].d) VALUES(ARRAY[1,2],ARRAY[3,4]) RETURNING *;
┌───────────────────────────────┐
│              arr              │
├───────────────────────────────┤
│ {"(\"{1,2}\")","(\"{3,4}\")"} │
└───────────────────────────────┘
-- 
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com
Re: [HACKERS] Index usage for elem-contained-by-const-range clauses
On 3/10/17 8:29 AM, Alexander Korotkov wrote:
> That's a cool idea. But I would say more. Sometimes it's useful to
> transform "intcol between x and y" into "intcol <@ 'x,y'::int4range".
> btree_gin supports "intcol between x and y" as overlap of "intcol >= x"
> and "intcol <= y". That is very inefficient. But if this clause were
> transformed into "intcol <@ 'x,y'::int4range", btree_gin could handle
> it very efficiently.

That would certainly be nice as well, but IMHO it's outside the scope of this patch to accomplish that. BTW, while we're wishing for things... something else that would be nice is if there was a way to do these kinds of transforms without hacking the backend...

> Also, I noticed that the patch doesn't have regression tests.

BTW, those tests need to pay special attention to inclusive vs exclusive bounds.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
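[Editorial note: the rewrite Alexander describes, sketched against a hypothetical table t(intcol int). BETWEEN is inclusive on both ends, and '[5,10]'::int4range is inclusive on both bounds (PostgreSQL canonicalizes it to [5,11)), so the two predicates match the same rows — which is exactly why the inclusive-vs-exclusive bounds mentioned above need test coverage.]

```sql
-- Two equivalent predicates; the second is a single range-containment
-- test that range-aware index support could handle directly.
SELECT * FROM t WHERE intcol BETWEEN 5 AND 10;
SELECT * FROM t WHERE intcol <@ '[5,10]'::int4range;
```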
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/11/17 2:06 PM, Tom Lane wrote:
> Jim Nasby <jim.na...@openscg.com> writes:
>> It's actually a lot harder to mess up providing a git repo link than
>> manually submitting patches to the mailing list.
>
> Yeah, we've heard that proposal before. We're still not doing it
> though. Insisting on patches being actually submitted to the mailing
> list is important for archival and possibly legal reasons. If someone
> sends in a link to $random-repo, once that site goes away there's no
> way to determine exactly what was submitted.

The full proposal was that the commitfest app have the ability to generate and post the patch for you, assuming that the smoke test passes.
-- 
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/10/17 6:06 PM, Peter Eisentraut wrote: On 3/10/17 19:00, Jim Nasby wrote: Maybe instead of having the commitfest app try and divine patches from the list it should be able to send patches to the list from a specified git repo/branch. Anyone that provides that info would have tests run automagically, patches sent, etc. Anyone who doesn't can just keep using the old process. Those people who know what they're doing will presumably run all those checks before they submit a patch. It's those people who send in patches that don't apply cleanly or fail the tests that would benefit from this system. But if they're that careless, then they also won't take care to use this particular system correctly. It's actually a lot harder to mess up providing a git repo link than manually submitting patches to the mailing list. For most patches, it's also a hell of a lot faster to just submit a repo URL rather than dealing with patch files. Having this also means that reviewers can focus more on what the patch is actually doing instead of mechanical crap best left to a machine. Of course, *you* work on changes that are far more complex than any newbie will, and it wouldn't surprise me if such a feature wouldn't help you or other senior hackers at all. But AFAICT it wouldn't get in your way either. It would remove yet another burden for new hackers. Anyway, this is well off topic for the original thread... -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/10/17 5:57 PM, Peter Eisentraut wrote: On 3/10/17 14:53, Jim Nasby wrote: The biggest win we'd get from something like Travis would be if the commitfest monitored for new patch files coming in for monitored threads and it created a new branch, applied the patches, and if they applied without error commit the branch and push to let Travis do it's thing. We wouldn't want that running in the main git repo, but it should be fine in a fork that's dedicated to that purpose. This has been discussed several times before, e.g., https://www.postgresql.org/message-id/54dd2413.8030...@gmx.net Maybe instead of having the commitfest app try and divine patches from the list it should be able to send patches to the list from a specified git repo/branch. Anyone that provides that info would have tests run automagically, patches sent, etc. Anyone who doesn't can just keep using the old process. -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/10/17 2:18 PM, Magnus Hagander wrote: But if you can put together something that picks up the individual patches out of the mail threads in the CF app and keeps branch-tips in a git repo up-to-date with those, including feeding the results back into the app, then go for it :) Seems like an ideal project for someone not on -hackers... do we have a list of "How you can help Postgres besides hacking database code" anywhere? -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/10/17 2:05 PM, Magnus Hagander wrote: Travis specifically would not help us with this, due to the dependency on github, but something that knows how to run "patch ... && configure && make && make check" in a container would. Who's updating https://github.com/postgres/postgres/ right now? Presumably that script would be the basis for this... I'm unsure what would be easiest -- have something drive a "throwaway github repo" off the data in the CF app and try to pull things from there, or to just spawn containers and run it directly without travis. I'd be a bit nervous about creating our own container solution and opening that to automatically deploying patches. Travis (and other tools) already have that problem solved (or at least if they get hacked it's on them to clean up and not us :) Plus it'd be a heck of a lot more work on our side to set all that stuff up. The bigger issue with those is the usual -- how do you handle patches that have dependencies on each other, because they're always going to show up as broken individually. I guess we could tell people doing those to just push a git branch on github and register that one in the CF app (which does have some very basic support for tracking that, but I doubt anybody uses it today). If people use git format-patch it should JustWork(tm). Specifying a specific repo is another option. Even if we can't make it work for really complicated patches it might still be a win. If the travis build failed, commitfest could notify the author. It could also rebase master into each branch on a daily basis so authors would know very quickly if something got committed that broke their patch. It could at least verify that the patch still applies, yes. If the rebase was pushed to github and travis was set up, travis would then test the changes as well.
-- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
Re: [HACKERS] ANALYZE command progress checker
On 3/10/17 1:06 PM, Andres Freund wrote: Hi, On 2017-03-10 02:11:18 -0600, Jim Nasby wrote: Perhaps instead of adding more clutter to \dvS we could just have a SRF for now. I don't see that as clutter, it's useful information, and keeping it discoverable is good, not bad. If we keep adding status reporting commands at some point it's going to get unwieldy. Though, if they were in their own schema... At over 2800 rows currently, you're not going to notice one more addition to \dfS. I think it's hard to design a good SRF for this. Because the fields for different types of progress are different / empty, you can't just trivially return them as rows. You'd have to do some EAV like 'command, field_name1, field_value1, ...' type of thing - not particularly pretty / easy to use. Oh, I wasn't suggesting a single SRF for everything. Hopefully users will eventually figure out a good formula to drive a "progress bar" for each type of monitor, which is what you really want anyway (at least 99% of the time). If we got there we could have a single view that gave the % complete for every command that was providing feedback. If someone wanted details they could hit the individual SRF. -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/10/17 1:09 PM, Peter Eisentraut wrote: On 3/10/17 03:27, Jim Nasby wrote: Perhaps https://travis-ci.org/ or something similar could be used for this. That avoids any issues about random code. That doesn't achieve any platform coverage, which is the main point here. I don't think platform coverage is the first thing to worry about with patches, nor with ongoing development. The biggest win we'd get from something like Travis would be if the commitfest monitored for new patch files coming in for monitored threads and it created a new branch, applied the patches, and, if they applied without error, committed the branch and pushed to let Travis do its thing. We wouldn't want that running in the main git repo, but it should be fine in a fork that's dedicated to that purpose. If the travis build failed, commitfest could notify the author. It could also rebase master into each branch on a daily basis so authors would know very quickly if something got committed that broke their patch. Obviously that doesn't remove the need for manual testing or the buildfarm, but it would at least let everyone know that the patch passed a smoke test. -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
Re: [HACKERS] Report the number of skipped frozen pages by manual VACUUM
On 3/10/17 5:08 AM, Masahiko Sawada wrote: BTW, I think there's already a function that handles the pluralization for you. IIRC it's one of the things you can add to an ereport() call. What is the function name? A quick `git grep plural` shows errdetail_plural and errmsg_plural. -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/7/17 9:52 PM, Magnus Hagander wrote: There have also on and off been discussions about building arbitrary patches as they are sent to the mailinglists. Doing that without any committer (or other trusted party) as a filter is a completely different challenge of course, given that it basically amounts to downloading and running random code off the internet. Perhaps https://travis-ci.org/ or something similar could be used for this. That avoids any issues about random code. -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE command progress checker
On 3/6/17 12:49 AM, Michael Paquier wrote: On Sat, Mar 4, 2017 at 5:33 AM, David Steele <da...@pgmasters.net> wrote: I think the idea of a general progress view is very valuable and there are a ton of operations it could be used for: full table scans, index rebuilds, vacuum, copy, etc. However, I feel that this proposal is not flexible enough and comes too late in the release cycle to allow development into something that could be committed. Well, each command really has its own requirements in terms of data to store, so we either finish with a bunch of small tables that anyone could query and join as they wish or a somewhat unique table that is bloated with all the information, with a set of views on top of it to query all the information. For extensibility's sake of each command (for example imagine that REINDEX could be extended with a CONCURRENTLY option and multiple phases), I would think that having a table per command type would not be that bad. Well, the ideal scenario is that someone uses the raw data to come up with a good way to just provide ye olde 0-100% progress bar. At that point a single view would do the trick. Perhaps instead of adding more clutter to \dvS we could just have a SRF for now. At over 2800 rows currently, you're not going to notice one more addition to \dfS. -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 3/8/17 9:34 AM, Andreas Karlsson wrote: Also, if by any chance you think (or use any software that thinks) that OIDs for system objects are a stable identifier, this will be the first case where that ceases to be true. If the system is shut down or crashes or the session is killed, you'll be left with stray objects with names that you've never typed into the system. I'm sure you're going to say "don't worry, none of that is any big deal" and maybe you're right. Hm, I cannot think of any real life scenario where this will be an issue based on my personal experience with PostgreSQL, but if you can think of one please provide it. I will try to ponder some more on this myself. The case I currently have is to allow tracking database objects similar to (but not the same) as how we track the objects that belong to an extension[1]. That currently depends on event triggers to keep names updated if they're changed, as well as making use of the reg* types. If an event trigger fired as part of the index rename (essentially treating it like an ALTER INDEX) then I should be able to work around that. The ultimate reason for doing this is to provide something similar to extensions (create a bunch of database objects that are all bound together), but also similar to classes in OO languages (so you can have multiple instances).[2] Admittedly, this is pretty off the beaten path and I certainly wouldn't hold up the patch because of it. I am hoping that it'd be fairly easy to fire an event trigger as if someone had just renamed the index. 1: https://github.com/decibel/object_reference 2: https://github.com/decibel/pg_classy -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adding support for Default partition in partitioning
On 3/7/17 10:30 AM, Keith Fiske wrote:
> I'm all for this feature and had suggested it back in the original thread to add partitioning to 10.

FWIW, I was working with a system just today that has an overflow partition.

> I agree that adding a new partition should not move any data out of the default.

+1

> It's easy enough to set up a monitor to watch for data existing in the default. Perhaps also adding a column to pg_partitioned_table that contains the oid of the default partition so it's easier to identify from a system catalog perspective and make that monitoring easier.

I agree that there should be a way to identify the default partition.

> I don't even see a need for it to fail either and not quite sure how that would even work? If they can't add a necessary child due to data being in the default, how can they ever get it out?

Yeah, was wondering that as well...

-- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
Re: [HACKERS] Report the number of skipped frozen pages by manual VACUUM
On 3/6/17 8:34 PM, Masahiko Sawada wrote: I don't think it can say "1 frozen pages" because the number of skipped pages according to visibility map is always more than 32 (SKIP_PAGES_THRESHOLD). That's just an artifact of how the VM currently works. I'm not a fan of cross dependencies like that unless there's a pretty good reason. BTW, I think there's already a function that handles the pluralization for you. IIRC it's one of the things you can add to an ereport() call. -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Faster methods for getting SPI results
On 3/2/17 8:03 AM, Peter Eisentraut wrote: On 12/20/16 23:14, Jim Nasby wrote: I've been looking at the performance of SPI calls within plpython. There's a roughly 1.5x difference from equivalent python code just in pulling data out of the SPI tuplestore. Some of that is due to an inefficiency in how plpython is creating result dictionaries, but fixing that is ultimately a dead-end: if you're dealing with a lot of results in python, you want a tuple of arrays, not an array of tuples. There is nothing that requires us to materialize the results into an actual list of actual rows. We could wrap the SPI_tuptable into a Python object and implement __getitem__ or __iter__ to emulate sequence or mapping access. Would it be possible to have that just pull tuples directly from the executor? The overhead of populating the tuplestore just to drain it again can become quite significant, and AFAICT it's completely unnecessary. Unfortunately, I think adding support for that would be even more invasive, which is why I haven't attempted it. On the flip side, I believe that kind of an interface would be usable by plpgsql, whereas the DestReceiver approach is not (AFAICT). -- Jim Nasby, Chief Data Architect, OpenSCG -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 3/4/17 11:49 AM, Peter Eisentraut wrote: I wonder whether we should emphasize this change by assigning DO_REFRESH_MATVIEW a higher number, like 100? Since there wasn't any interest in that idea, I have committed Jim's patch as is. Thanks. Something else that seems somewhat useful would be to have the sort defined by an array of the ENUM values in the correct order, and then have the code do the mechanical map generation. I'm guessing the only reasonable way to make that work would be to have some kind of a last item indicator value, so you know how many values were in the ENUM. Maybe there's a better way to do that... -- Jim Nasby, Chief Data Architect, OpenSCG -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] objsubid vs subobjid
On 3/1/17 9:24 AM, Peter Eisentraut wrote: On 3/1/17 09:51, Alvaro Herrera wrote: Peter Eisentraut wrote: On 2/22/17 19:35, Jim Nasby wrote: pg_get_object_address() currently returns a field called subobjid, while pg_depend calls that objsubid. I'm guessing that wasn't on purpose (especially because internally the function uses objsubid), and it'd be nice to fix it. I'm in favor of changing it, but it could theoretically break someone's code. Yes, it was an oversight. +1 for changing. OK done. BTW, did you backpatch as well? The function was added in 9.5. Presumably we wouldn't normally do that, but if we think this is unused enough maybe it's worth it. -- Jim Nasby, Chief Data Architect, OpenSCG -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 2/28/17 9:42 PM, Jim Nasby wrote: I'll post a plpython patch that doesn't add the output format control. I've attached the results of that. Unfortunately the speed improvement is only 27% at this point (with 999 tuples). Presumably that's because it's constructing a brand new dictionary from scratch for each tuple. I found a couple bugs. New patches attached. -- Jim Nasby, Chief Data Architect, OpenSCG From 116b6a45b0146e42f1faa130d78e9362950c18c3 Mon Sep 17 00:00:00 2001 From: Jim Nasby <jim.na...@bluetreble.com> Date: Wed, 1 Mar 2017 15:45:51 -0600 Subject: [PATCH 1/2] Add SPI_execute_callback() and callback-based DestReceiver. Instead of placing results in a tuplestore, this method of execution uses the supplied callback when creating the Portal for a query. --- src/backend/executor/spi.c | 79 -- src/backend/tcop/dest.c| 11 +++ src/include/executor/spi.h | 4 +++ src/include/tcop/dest.h| 1 + 4 files changed, 85 insertions(+), 10 deletions(-) diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 55f97b14e6..ffeba679da 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan); static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, Snapshot snapshot, Snapshot crosscheck_snapshot, - bool read_only, bool fire_triggers, uint64 tcount); + bool read_only, bool fire_triggers, uint64 tcount, + DestReceiver *callback); static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes, Datum *Values, const char *Nulls); @@ -320,7 +321,34 @@ SPI_execute(const char *src, bool read_only, long tcount) res = _SPI_execute_plan(, NULL, InvalidSnapshot, InvalidSnapshot, - read_only, true, tcount); + read_only, true, tcount, NULL); + + _SPI_end_call(true); + return res; +} +int +SPI_execute_callback(const char *src, bool read_only, long tcount, + DestReceiver *callback) +{ + _SPI_plan plan; + int res; + + if (src == NULL || tcount < 0) + 
return SPI_ERROR_ARGUMENT; + + res = _SPI_begin_call(true); + if (res < 0) + return res; + + memset(, 0, sizeof(_SPI_plan)); + plan.magic = _SPI_PLAN_MAGIC; + plan.cursor_options = 0; + + _SPI_prepare_oneshot_plan(src, ); + + res = _SPI_execute_plan(, NULL, + InvalidSnapshot, InvalidSnapshot, + read_only, true, tcount, callback); _SPI_end_call(true); return res; @@ -354,7 +382,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const char *Nulls, _SPI_convert_params(plan->nargs, plan->argtypes, Values, Nulls), InvalidSnapshot, InvalidSnapshot, - read_only, true, tcount); + read_only, true, tcount, NULL); + + _SPI_end_call(true); + return res; +} + +/* Execute a previously prepared plan with a callback Destination */ +int +SPI_execute_plan_callback(SPIPlanPtr plan, Datum *Values, const char *Nulls, +bool read_only, long tcount, DestReceiver *callback) +{ + int res; + + if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0) + return SPI_ERROR_ARGUMENT; + + if (plan->nargs > 0 && Values == NULL) + return SPI_ERROR_PARAM; + + res = _SPI_begin_call(true); + if (res < 0) + return res; + + res = _SPI_execute_plan(plan, + _SPI_convert_params(plan->nargs, plan->argtypes, + Values, Nulls), + InvalidSnapshot, InvalidSnapshot, + read_only, true, tcount, callback); _SPI_end_call(true); return res; @@ -383,7 +438,7 @@ SPI_execute_plan_with_paramlist(SPIPlanPtr plan, ParamListInfo params, res = _SPI_execute_plan(plan, params,
Re: [HACKERS] Two questions about Postgres parser
On 2/27/17 10:37 AM, Tom Lane wrote: 2. Implicit user defined type casts are not applied for COALESCE operator: That has nothing to do with whether the cast is user-defined. It has to do with not wanting to automatically unify types across type-category boundaries (in this case, numeric vs. composite categories). That's per step 4 here: https://www.postgresql.org/docs/devel/static/typeconv-union-case.html and it's not an easy thing to get rid of because if you're considering more than one type category then the heuristic about preferring "preferred types" breaks down --- how do you know which category's preferred type to prefer? FWIW, while working on a variant type I wished there was a way to preempt built-in type resolution when dealing with a particular type. I was specifically interested in function calls, which IIRC is handled by a single function and a helper. Exporting those two and providing a hook would have done the trick in my case. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 2/28/17 11:21 AM, Andreas Karlsson wrote: The only downside I can see to this approach is that we will no longer be able to reindex catalog tables concurrently, but in return it should be easier to confirm that this approach can be made to work. Another downside is any stored regclass fields will become invalid. Admittedly that's a pretty unusual use case, but it'd be nice if there was at least a way to let users fix things during the rename phase (perhaps via an event trigger). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Disallowing multiple queries per PQexec()
On 2/28/17 2:45 PM, Andres Freund wrote: So if you don't want to allow multiple statements, use PQexecParams et al. That does leave most application authors out in the cold though, since they're using a higher level connection manager. If the maintenance burden isn't terribly high it would be nice to allow disabling multiple statements via a GUC. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] I propose killing PL/Tcl's "modules" infrastructure
On 2/27/17 2:42 PM, Tom Lane wrote: + SET pltcl.start_proc = 'no_such_function'; + select tcl_int4add(1, 2); + ERROR: function no_such_function() does not exist Can the error message be more explicit somehow? Otherwise people will be quite confused as to where no_such_function() is coming from. BTW, I'd think this functionality would be valuable for every PL. Maybe it's worth adding formal support for it to pg_language et al. and leaving it up to each language to decide whether it's supported or not? Multiple init functions might be useful too, similar to how we support multiple hook functions (though presumably a field of regproc[] is a better way to handle that...) I'm also wondering if there'd be value to supporting code that runs on each function invocation. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional
On 2/27/17 6:25 PM, David Steele wrote: The purpose of this patch is to make waiting for archive optional, with the default being the current behavior, i.e., to wait for all WAL to be archived. This functionality is already used internally by pg_basebackup, so the only real change is to expose it through the pg_stop_backup() function. Do the docs mention anywhere how to monitor WAL archiving to know if you've got all the necessary WAL? Perhaps a function to do that would be worth adding. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC 2017
On 2/27/17 4:52 PM, Thomas Munro wrote: By the way, that page claims that PostgreSQL runs on Irix and Tru64, which hasn't been true for a few years. There could be a GSoC project to add support for those back in... ;P -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 1/24/17 10:43 PM, Jim Nasby wrote: I strongly suggest making this design effort a separate thread, and focusing on the SPI improvements that give "free" no-user-action performance boosts here. Fair enough. I posted the SPI portion of that yesterday. That should be useful for pl/R and possibly pl/perl. pl/tcl could make use of it, but it would end up executing arbitrary tcl code in the middle of portal execution, which doesn't strike me as a great idea. Unfortunately, I don't think plpgsql could make much use of this for similar reasons. I'll post a plpython patch that doesn't add the output format control. I've attached the results of that. Unfortunately the speed improvement is only 27% at this point (with 999 tuples). Presumably that's because it's constructing a brand new dictionary from scratch for each tuple. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) >From 7ef3e944c1ee8266d70fafae080afc6beb492102 Mon Sep 17 00:00:00 2001 From: Jim Nasby <jim.na...@bluetreble.com> Date: Wed, 25 Jan 2017 12:57:40 -0600 Subject: [PATCH 1/2] Add SPI_execute_callback() and callback-based DestReceiver. Instead of placing results in a tuplestore, this method of execution uses the supplied callback when creating the Portal for a query. 
--- src/backend/executor/spi.c | 76 -- src/backend/tcop/dest.c| 11 +++ src/include/executor/spi.h | 4 +++ src/include/tcop/dest.h| 1 + 4 files changed, 83 insertions(+), 9 deletions(-) diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 55f97b14e6..d55e06509f 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan); static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, Snapshot snapshot, Snapshot crosscheck_snapshot, - bool read_only, bool fire_triggers, uint64 tcount); + bool read_only, bool fire_triggers, uint64 tcount, + DestReceiver *callback); static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes, Datum *Values, const char *Nulls); @@ -320,7 +321,34 @@ SPI_execute(const char *src, bool read_only, long tcount) res = _SPI_execute_plan(, NULL, InvalidSnapshot, InvalidSnapshot, - read_only, true, tcount); + read_only, true, tcount, NULL); + + _SPI_end_call(true); + return res; +} +int +SPI_execute_callback(const char *src, bool read_only, long tcount, + DestReceiver *callback) +{ + _SPI_plan plan; + int res; + + if (src == NULL || tcount < 0) + return SPI_ERROR_ARGUMENT; + + res = _SPI_begin_call(true); + if (res < 0) + return res; + + memset(, 0, sizeof(_SPI_plan)); + plan.magic = _SPI_PLAN_MAGIC; + plan.cursor_options = 0; + + _SPI_prepare_oneshot_plan(src, ); + + res = _SPI_execute_plan(, NULL, + InvalidSnapshot, InvalidSnapshot, + read_only, true, tcount, callback); _SPI_end_call(true); return res; @@ -354,7 +382,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const char *Nulls, _SPI_convert_params(plan->nargs, plan->argtypes, Values, Nulls), InvalidSnapshot, InvalidSnapshot, - read_only, true, tcount); + read_only, true, tcount, NULL); + + _SPI_end_call(true); + return res; +} + +/* Execute a previously prepared plan with a callback Destination */ +int +SPI_execute_plan_callback(SPIPlanPtr plan, 
Datum *Values, const char *Nulls, +bool read_only, long tcount, DestReceiver *callback) +{ + int res; + + if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0) + return SPI_ERROR_ARGUMENT; + + if (plan->nargs > 0 && Values == NULL) + return SPI_ERROR_PARAM; + + res = _SPI_begin_call(true); + if (res < 0) +
Re: [HACKERS] increasing the default WAL segment size
On 2/24/17 6:30 AM, Kuntal Ghosh wrote: * You're considering any WAL file with a power of 2 as valid. Suppose, the correct WAL seg size is 64mb. For some reason, the server generated a 16mb invalid WAL file(maybe it crashed while creating the WAL file). Your code seems to treat this as a valid file which I think is incorrect. Do you agree with that? Detecting correct WAL size based on the size of a random WAL file seems like a really bad idea to me. I also don't see the reason for #2... or is that how initdb writes out the correct control file? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] Temporal query processing with range types
On 2/24/17 6:40 AM, Peter Moser wrote: Do you think it is better to remove the syntax for ranges expressed in different columns? It's not that hard to construct a range type on-the-fly from 2 columns, so (without having looked at the patch or really followed the thread) I would think the answer is yes. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] bytea_output output of base64
On 2/24/17 7:44 AM, Kenneth Marshall wrote: Like David suggests, if you want compact, run it through lz4/gzip/lzop... for a much better size return. Speaking of which: any bytea where you care about this is likely to live in an already-compressed state in TOAST. ISTM it would be valuable if we had a way to just spit out the raw compressed data (or a text-safe version of that), at least for COPY's purposes... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] FYI: git worktrees as replacement for "rsync the CVSROOT"
On 2/24/17 10:24 AM, Tom Lane wrote: Andrew Dunstan <andrew.duns...@2ndquadrant.com> writes: On 02/24/2017 02:36 AM, Craig Ringer wrote: On 16 January 2017 at 05:01, Jim Nasby <jim.na...@bluetreble.com> wrote: git worktree add ../9.6 REL9_6_STABLE Does this do anything different from the git contrib script git-new-workdir that I have been using for quite a long while? I think it's basically a more formally supported version of the contrib script. They may have fixed some of the hackier aspects of the contrib script --- I mind in particular the fact that you need to disable git's auto-gc activity when you use git-new-workdir, but I don't see any such restriction in the git-worktree man page. Haven't tried to switch over myself, but maybe I will at some point.

One thing to be aware of that I discovered: with worktrees you may not have two checkouts of the same branch, something that is possible with what's currently documented on the wiki. Since the only pain in the wiki workflow is setting up a new branch (which I've scripted, attached) I've pretty much given up on using worktrees. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)

#!/bin/sh
if [ $# -ne 1 ]; then
    echo Error
    exit 1
fi
branch=REL`echo $1 | tr . _`_STABLE
mkdir i/$1
git clone --reference postgresql.git -b $branch git://git.postgresql.org/git/postgresql.git $1
cd $1
ln -s ../i/$1 i
cd .git/info
ln -sf ../../../git-info-exclude exclude
Re: [HACKERS] GUC for cleanup indexes threshold.
On 2/24/17 11:26 AM, Robert Haas wrote: I think we need to come up with some set of tests to figure out what actually works well in practice here. Theories are a good starting point, but good vacuum behavior is really important, and a patch that changes it ought to be backed up by at least some experimental evidence. I think something else worth considering is that if we had some method of mapping heap TIDs back to indexes then a lot (all?) of these problems would go away. 10+ years ago the idea of keeping such a mapping would probably be untenable, but with relation forks and how much cheaper storage is, maybe that's no longer the case. For btree I think this could be done by keeping a second btree ordered by ctid that points either to index entries or even just to whole index pages. At ~20 bytes per entry, even a 1B row index would take ~20GB. Page splits are obviously a big issue. Maybe it's safe to update the ctid map for every item that gets moved when a split happens. Would a ctid map work for other indexes as well? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
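The ~20GB figure above is straightforward arithmetic; a trivial standalone check of the numbers quoted (the 20-byte per-entry cost is the rough figure from the text, not a measured value):

```python
ENTRY_BYTES = 20                 # rough per-entry cost assumed above
rows = 1_000_000_000             # a 1-billion-row index

total_bytes = rows * ENTRY_BYTES
print(total_bytes)               # 20000000000 bytes, i.e. ~20GB (~18.6 GiB)
```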
Re: [HACKERS] case_preservation_and_insensitivity = on
On 2/24/17 12:28 AM, Robert Haas wrote: On Thu, Feb 23, 2017 at 6:59 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: I think these are straw-man arguments, really. Consider the actual use case for such a feature: it's for porting some application that was not written against Postgres to begin with. I'm not sure that's totally true. I think at least some requests for this feature are intended at satisfying somebody's sense of aesthetics. If I had $1 for every time I had to chase someone away from using camelcase I'd be able to sponsor a key at the next conference. And honestly I'd actually like to be able to use camelcase and still get easy-to-read output from \d & co. IOW, this is definitely NOT driven just by porting efforts. I think the only reason we don't hear more requests about it is people (grudgingly) just muddle on without it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] case_preservation_and_insensitivity = on
On 2/24/17 11:34 AM, Joel Jacobson wrote:

SELECT SomeCol, OtherCol, FooCol, BarCol, MyCol, ExtraCol, LastCol
INTO _SomeCol, _OtherCol, _FooCol, _BarCol, _MyCol, _ExtraCol, _LastCol
FROM Foo
WHERE Bar = 'Baz';

This is to avoid typos that are then visually easy to spot, thanks to all chars being aligned.

Why not just use a record or the table composite? I'll commonly do stuff like:

DECLARE
    r record;
BEGIN
    SELECT INTO STRICT r
        blah, foo, bar, baz
    FROM pirate;

    IF r.blah THEN
        RAISE 'Yaaar!';
    END IF;
    ...

(Well, to be honest I always try to write pirate apps in plR... ;P) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Checksums by default?
On 2/24/17 12:30 PM, Tomas Vondra wrote: In any case, we can't just build x86-64 packages with compile-time SSE4.1 checks. Dumb question... since we're already discussing llvm for the executor, would that potentially be an option here? AIUI that also opens the possibility of using the GPU as well. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog
On 2/23/17 9:01 PM, Michael Paquier wrote: An idea here would be to add in the long header of the segment a timestamp of when it was created. This is inherent to only the server generating the WAL. ISTM it'd be reasonable (maybe even wise) for WAL files to contain info about the first and last LSN, commit xid, timestamps, etc. That could be made performance wise with an archive command. With pg_receivexlog you could make use of the end-segment command to scan the completely written segment for this data before moving on to the next one. At least it gives an argument for having such a command. David Steele mentioned that he could make use of such a thing. BTW, I'm not opposed to an end-segment command; I'm just saying I don't think having it would really help users very much. Thanks. Yes that's hard to come up here with something that would satisfy enough users without giving much maintenance penalty. Yeah, I think it'd be a decent (though hopefully not huge) amount of work. As I see it, we got away for years with no replication, but eventually realized that we were really leaving a hole in our capabilities by not having built-in binary rep. I think we're nearing a similar point with handling PITR backups. People have written some great tools to help with this, but at some point (PG 11? 13?) there should probably be some strong included tools. I suspect that a huge improvement on the internal tools could be had for 1/2 or less the effort that's been spent on all the external ones. Of course, much of that is because the external tools have helped prove out what works and what doesn't. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Idea on how to simplify comparing two sets
On 2/23/17 3:33 PM, Corey Huinker wrote: I've had to do it with temp tables any time the environment is different between control/experiment, which is the case when you're developing a drop-in replacement for an SRF or view or whatever. FWIW I'd recommend temp views, to give the planner the most latitude. Less load from (presumably) pointless copying too. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog
On 2/23/17 8:52 PM, Michael Paquier wrote: OK, I forgot a bit about this past discussion. So let's say that we have a command, why not also allow users to use at will a marker %f to indicate the file name just completed? One use case here is to scan the file for the oldest and/or newest timestamps of the segment just finished to do some retention policy with something else in charge of the cleanup. Why not provide % replacements that contain that info? pg_receivexlog has a much better shot at doing that correctly than some random user tool... (%f could certainly be useful for other things) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog
On 2/23/17 8:47 PM, Michael Paquier wrote: Anything else than measured in bytes either requires a lookup at the file timestamp, which is not reliable with noatime or a lookup at WAL itself to decide when is the commit timestamp that matches the oldest point in time of the backup policy. An indication that it'd be nice to have a better way to store this information as part of a base backup, or the archived WAL files. That could be made performance wise with an archive command. With pg_receivexlog you could make use of the end-segment command to scan the completely written segment for this data before moving on to the next one. At least it gives an argument for having such a command. David Steele mentioned that he could make use of such a thing. BTW, I'm not opposed to an end-segment command; I'm just saying I don't think having it would really help users very much. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] bytea_output output of base64
On 2/23/17 8:22 PM, Bruce Momjian wrote: I was just curious because it seems more compact than hex and many exchange formats use it, like SSL certificates and keys. I know you can encode() but I thought it might help make pg_dump output smaller. It undoubtedly would make pg_dump smaller, though I'm not sure how much that's worth since if you care at all about that you'll gzip it. But, the other thing it might do is speed up COPY, especially on input. Some performance tests of that might be interesting. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
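The size relationships at stake here are easy to measure without a database at hand. A standalone Python sketch (zlib stands in for whatever compressor a dump would actually be piped through; the sample data is arbitrary):

```python
import base64
import zlib

raw = bytes(range(256)) * 64            # 16 KiB of sample binary data

hex_len = len(raw.hex())                # roughly what bytea 'hex' output costs
b64_len = len(base64.b64encode(raw))    # base64 alternative being discussed
gz_len = len(zlib.compress(raw))        # compressing the raw bytes directly

# hex doubles the data, base64 adds ~33%, and for compressible data
# a general-purpose compressor on the raw bytes beats both.
print(hex_len, b64_len, gz_len)
```

Which supports both points in the thread: base64 is a real win over hex for dump size, but if you are going to compress the dump anyway the encoding choice matters much less.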
Re: [HACKERS] Documentation improvements for partitioning
On 2/23/17 8:36 PM, Robert Haas wrote: We're 4 or 5 days away from the start of the last CommitFest. We have time to fix bugs and improve documentation and maybe tweak things here and there, but 3 and 4 are significant development projects. There isn't time to do that stuff right now and get it right. It might be possible to provide some temporary work-arounds for some of this, which would be nice. But I agree that there's definitely not enough time to implement *good* solutions to even just automatic index creation, let alone somehow handling uniqueness. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 1/23/17 9:23 PM, Jim Nasby wrote: I think the last step here is to figure out how to support switching between the current behavior and the "columnar" behavior of a dict of lists.

I've thought more about this... instead of trying to switch from the current situation of one choice of how results are returned to two choices, I think it'd be better to just expose the API that the new Destination type provides to SPI. Specifically, execute a python function during Portal startup, and a different function for receiving tuples. There'd be an optional 3rd function for Portal shutdown.

The startup function would be handed details of the resultset it was about to receive, as a list that contained python tuples with the results of SPI_fname, _gettype, _gettypeid. This function would return a callback version number and a python object that would be kept in the DestReceiver.

The receiver function would get the object created by the startup function, as well as a python tuple of the TupleTableSlot that had gone through type conversion. It would need to add the value to the object from the startup function. It would return true or false, just like a Portal receiver function does.

The shutdown function would receive the object that's been passed around. It would be able to do any post-processing. Whatever it returned is what would be handed back to python as the results of the query.

The version number returned by the startup function allows for future improvements to this facility. One idea there is allowing the startup function to control how Datums get mapped into python objects.

In order to support all of this without breaking backwards compatibility or forking a new API, plpy.execute would accept a kwdict, to avoid conflicting with the arbitrary number of arguments that can currently be accepted. We'd look in the kwdict for a key called "portal_functions" pointing at a 2 or 3 element tuple of the startup, receive and shutdown functions.
plpy would pre-define a tuple that provides the current behavior, and that's what would be used by default. In the future, we might add a way to control the default. Comments? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] PinBuffer() no longer makes use of strategy
On 2/4/17 1:47 PM, Alexander Korotkov wrote: I'm tempted to put the old logic back, but more because this likely was unintentional, not because I think it's clearly better. +1 Yes, it was an unintentional change. So we should put the old logic back unless we have proof that this change makes it better. Patch is attached. I tried to make some comments, but probably they are not enough. Added to CF: https://commitfest.postgresql.org/13/1029/ -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Idea on how to simplify comparing two sets
On 2/7/17 12:03 PM, Tom Lane wrote: That said I'm not sure how much we want to go down this road on our own. It'd be nice to have when it's needed but it's not something that gets much visibility on these lists to suggest a large pent-up demand. Yeah, if this isn't in the standard and not in other databases either, that would seem to suggest that it's not a big requirement. FWIW I've found myself needing the precursor to this (give me the full diff) at least a couple times, and it's quite a PITA on anything but a trivial relation. It's also not possible to make this easier via an SRF because you don't know in advance what the result set looks like. So the best I've ever come up with is a file that can be included in psql that depends on having set two psql variables to the names of relations that can be queried (and if you need more than a relation you need to create a temp view). I've wondered about the possibility of allowing PLs the ability to dynamically define their return type based on their arguments. That would allow for an SRF to handle this case, and would be significantly more flexible than trying to do that using pseudotypes. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
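The "full diff" semantics being described — rows present in only one of the two relations, keeping duplicate multiplicities, i.e. what running EXCEPT ALL in both directions would give you — can be modeled in a few lines of standalone Python (the row data here is made up for illustration):

```python
from collections import Counter

def full_diff(a, b):
    """Return (rows only in a, rows only in b), respecting duplicates."""
    ca, cb = Counter(a), Counter(b)
    return list((ca - cb).elements()), list((cb - ca).elements())

old = [(1, "x"), (2, "y"), (2, "y"), (3, "z")]
new = [(1, "x"), (2, "y"), (4, "w")]
only_old, only_new = full_diff(old, new)
print(only_old)   # [(2, 'y'), (3, 'z')]  -- one of the duplicate (2, 'y') rows
print(only_new)   # [(4, 'w')]
```

The pain point in SQL is exactly what the message says: the output row type depends on the relations being compared, so this can't be wrapped up as a single SRF without dynamic return types.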
Re: [HACKERS] Add checklist item for psql completion to commitfest review
On 2/15/17 9:49 AM, Robert Haas wrote: On Tue, Feb 14, 2017 at 1:13 PM, Jim Nasby <jim.na...@bluetreble.com> wrote: After seeing Yet Another Missing Psql Tab Completion it occurred to me... why not add a checklist item to the commitfest review page? I realize most regular contributors don't use the form, but a fair number of people do. I don't see how it could hurt. Another possible idea is a git hook that checks to see if the psql completion code has been touched if any of the grammar has been. That could certainly trigger false positives so it'd need to be easy to over-ride, but AFAIK that could be done via a special phrase in the commit message. In the past, our usual policy has been that tab completion isn't a hard requirement for a patch implementing a new feature. It often gets added after the fact. I think that policy has worked fine, but it's not a bad thing for people to include tab completion in the original patch either, if they have the brain space for it. I've never messed with completion so I don't know how hard it is, but my impression is that it gets added after the fact not because of any intentional decisions but because people simply forget about it. ISTM it would be more efficient of community resources to deal with completion in the original patch, unless there's some reason not to. IOW, no, don't make it a hard requirement, but don't omit it simply through forgetfulness. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
On 2/19/17 5:27 AM, Robert Haas wrote: (1) a multi-batch hash join, (2) a nested loop, and (3) a merge join. (2) is easy to implement but will generate a ton of random I/O if the table is not resident in RAM. (3) is most suitable for very large tables but takes more work to code, and is also likely to be a lot slower for small tables than a hash or nestloop-based approach. As I understand it, #3 is already in place for validate_index(). I think you'd just need a different callback that checks the heap key. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog
On 2/23/17 10:10 AM, Magnus Hagander wrote: Wouldn't this one, along with some other scenarios, be better provided by the "run command at end of segment" function that we've talked about before? And then that external command could implement whatever aging logic would be appropriate for the environment? That kind of API led to difficulties with archiving direct from the database, so I'm not sure it's the best way to go. ISTM what's really needed is a good way for users to handle retention for both WAL as well as base backups. A tool that did that would need to understand what WAL is required to safely restore a base backup. It should be possible for users to have a separate retention policy for just base backups as well as backups that support full PITR. You'd also need an easy way to deal with date ranges (so you can do things like "delete all backups more than 1 year old"). Perhaps a good starting point would be a tool that lets you list what base backups you have, what WAL those backups need, when the backups were taken, etc. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] A typo in mcxt.c
On 2/23/17 6:38 AM, Thomas Munro wrote: I'm not so confident, but the "'tis" seems to me to be a typo of "it's". That is an archaic way of contracting the same words differently: Given the number of non-native English speakers we have, it's probably worth changing it... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] case_preservation_and_insensitivity = on
On 2/20/17 3:30 AM, Joel Jacobson wrote: Also, I think the --lowercase-uniqueness feature would be useful by itself even without the --case-preserving feature, since that might be a good way to enforce a good design of new databases, as a mix of "users" and "Users" is probably considered ugly by many system designers. FWIW, I don't think --lowercase-uniqueness is a good name. --case-insensitive-unique would be better. In addition to that, it'd be interesting to allow for a user-supplied name validation function that can throw an error if it sees something it doesn't like (such as a name that contains spaces, or one that's longer than NAMEDATALEN). I suspect it'd be pretty hard to add that though. BTW, keep in mind that what you're suggesting here means changing *every* catalog that contains a name field. A query against info_schema will show you that that's most of them. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
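The check itself is cheap; the hard part, as noted, is applying it across every catalog that has a name column. A standalone Python sketch of the case-insensitive uniqueness rule being discussed (function and variable names invented for illustration):

```python
def check_case_insensitive_unique(existing_names, new_name):
    """Reject a new identifier that collides with an existing one when
    compared case-insensitively (e.g. "users" vs "Users")."""
    folded = {name.lower(): name for name in existing_names}
    clash = folded.get(new_name.lower())
    if clash is not None and clash != new_name:
        raise ValueError(
            f'name "{new_name}" conflicts case-insensitively with "{clash}"')

check_case_insensitive_unique({"users", "orders"}, "invoices")  # ok
check_case_insensitive_unique({"users", "orders"}, "users")     # ok: same name
try:
    check_case_insensitive_unique({"users", "orders"}, "Users")
except ValueError as e:
    print(e)
```

(An exact-duplicate name falls through here on purpose; that case is already a plain duplicate-name error today.)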
Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure
On 2/20/17 11:22 AM, David Christensen wrote: - If an entire cluster is going to be considered as checksummed, then even databases that don't allow connections would need to get enabled. Yeah, the workaround for now would be to require “datallowconn" to be set to ’t' for all databases before proceeding, unless there’s a way to connect to those databases internally that bypasses that check. Open to ideas, though for a first pass seems like the “datallowconn” approach is the least amount of work. The problem with ignoring datallowconn is any database where that's false is fair game for CREATE DATABASE. I think just enforcing that everything's connectable is good enough for now. I like the idea of revalidation, but I'd suggest leaving that off of the first pass. Yeah, agreed. It might be easier on a first pass to look at supporting per-database checksums (in this case, essentially treating shared catalogs as their own database). All normal backends do per-database stuff (such as setting current_database) during startup anyway. That doesn't really help for things like recovery and replication though. :/ And there's still the question of SLRUs (or are those not checksum'd today??). So you’re suggesting that the data_checksums GUC get set per-database context, so once it’s fully enabled in the specific database it treats it as in enforcing state, even if the rest of the cluster hasn’t completed? Hmm, might think on that a bit, but it seems pretty straightforward. Something like that, yeah. What issues do you see affecting replication and recovery specifically (other than the entire cluster not being complete)? Since the checksum changes are WAL logged, seems you be no worse the wear on a standby if you had to change things. I'm specifically worried about the entire cluster not being complete. That makes it harder for replicas to know what blocks they can and can't verify the checksum on. 
That *might* still be simpler than trying to handle converting the entire cluster in one shot. If it's not simpler I certainly wouldn't do it right now. BTW, it occurs to me that this is related to the problem we have with trying to make changes that break page binary compatibility. If we had a method for handling that it would probably be useful for enabling checksums as well. You'd essentially treat an un-checksum'd page as if it was an "old page version". The biggest problem there is dealing with the potential that the new page needs to be larger than the old one was, but maybe there's some useful progress to be had in this area before tackling the "page too small" problem. I agree it’s very similar; my issue is I don’t want to have to postpone handling a specific case for some future infrastructure. Yeah, I was just mentioning it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
[HACKERS] objsubid vs subobjid
pg_get_object_address() currently returns a field called subobjid, while pg_depend calls that objsubid. I'm guessing that wasn't on purpose (especially because internally the function uses objsubid), and it'd be nice to fix it. Attached does that, as well as updating the input naming on the other functions for consistency. I stopped short of changing the instances of subobjid in the C code to reduce backpatch issues, but maybe that should be done too... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)

diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 05652e86c2..5233089d87 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3096,13 +3096,13 @@ DESCR("get transaction Id and commit timestamp of latest transaction commit");
 DATA(insert OID = 3537 ( pg_describe_object PGNSP PGUID 12 1 0 0 0 f f f f t f s s 3 0 25 "26 26 23" _null_ _null_ _null_ _null_ _null_ pg_describe_object _null_ _null_ _null_ ));
 DESCR("get identification of SQL object");
-DATA(insert OID = 3839 ( pg_identify_object PGNSP PGUID 12 1 0 0 0 f f f f t f s s 3 0 2249 "26 26 23" "{26,26,23,25,25,25,25}" "{i,i,i,o,o,o,o}" "{classid,objid,subobjid,type,schema,name,identity}" _null_ _null_ pg_identify_object _null_ _null_ _null_ ));
+DATA(insert OID = 3839 ( pg_identify_object PGNSP PGUID 12 1 0 0 0 f f f f t f s s 3 0 2249 "26 26 23" "{26,26,23,25,25,25,25}" "{i,i,i,o,o,o,o}" "{classid,objid,objsubid,type,schema,name,identity}" _null_ _null_ pg_identify_object _null_ _null_ _null_ ));
 DESCR("get machine-parseable identification of SQL object");
-DATA(insert OID = 3382 ( pg_identify_object_as_address PGNSP PGUID 12 1 0 0 0 f f f f t f s s 3 0 2249 "26 26 23" "{26,26,23,25,1009,1009}" "{i,i,i,o,o,o}" "{classid,objid,subobjid,type,object_names,object_args}" _null_ _null_ pg_identify_object_as_address _null_ _null_ _null_ ));
+DATA(insert OID = 3382 ( pg_identify_object_as_address PGNSP PGUID 12 1 0 0 0 f f f f t f s s 3 0 2249 "26 26 23" "{26,26,23,25,1009,1009}" "{i,i,i,o,o,o}" "{classid,objid,objsubid,type,object_names,object_args}" _null_ _null_ pg_identify_object_as_address _null_ _null_ _null_ ));
 DESCR("get identification of SQL object for pg_get_object_address()");
-DATA(insert OID = 3954 ( pg_get_object_address PGNSP PGUID 12 1 0 0 0 f f f f t f s s 3 0 2249 "25 1009 1009" "{25,1009,1009,26,26,23}" "{i,i,i,o,o,o}" "{type,name,args,classid,objid,subobjid}" _null_ _null_ pg_get_object_address _null_ _null_ _null_ ));
+DATA(insert OID = 3954 ( pg_get_object_address PGNSP PGUID 12 1 0 0 0 f f f f t f s s 3 0 2249 "25 1009 1009" "{25,1009,1009,26,26,23}" "{i,i,i,o,o,o}" "{type,name,args,classid,objid,objsubid}" _null_ _null_ pg_get_object_address _null_ _null_ _null_ ));
 DESCR("get OID-based object address from name/args arrays");
 DATA(insert OID = 2079 ( pg_table_is_visible PGNSP PGUID 12 10 0 0 0 f f f f t f s s 1 0 16 "26" _null_ _null_ _null_ _null_ _null_ pg_table_is_visible _null_ _null_ _null_ ));

diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out
index ec5ada97ad..08f9826c9e 100644
--- a/src/test/regress/expected/object_address.out
+++ b/src/test/regress/expected/object_address.out
@@ -401,14 +401,14 @@ WITH objects (type, name, args) AS (VALUES
     ('publication relation', '{addr_nsp, gentable}', '{addr_pub}'),
     ('subscription', '{addr_sub}', '{}')
     )
-SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
+SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.objsubid)).*,
     -- test roundtrip through pg_identify_object_as_address
-    ROW(pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)) =
-    ROW(pg_identify_object(addr2.classid, addr2.objid, addr2.subobjid))
+    ROW(pg_identify_object(addr1.classid, addr1.objid, addr1.objsubid)) =
+    ROW(pg_identify_object(addr2.classid, addr2.objid, addr2.objsubid))
     FROM objects, pg_get_object_address(type, name, args) addr1,
-        pg_identify_object_as_address(classid, objid, subobjid) ioa(typ,nms,args),
+        pg_identify_object_as_address(classid, objid, objsubid) ioa(typ,nms,args),
        pg_ge
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 5:38 PM, Michael Banck wrote:
> diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
> index ea643397ba..708a47f3cb 100644
> --- a/src/bin/pg_dump/pg_dump_sort.c
> +++ b/src/bin/pg_dump/pg_dump_sort.c
> @@ -26,6 +26,9 @@ static const char *modulename = gettext_noop("sorter");
>   * Sort priority for database object types.
>   * Objects are sorted by type, and within a type by name.
>   *
> + * Because materialized views can potentially reference system views,
> + * DO_REFRESH_MATVIEW should always be the last thing on the list.
> + *
> I think this comment is overly specific: any materialized view that references a view or table in a different schema (pg_catalog or not) will likely not refresh on pg_restore AIUI, so singling out system views doesn't look right to me.

This isn't a matter of excluded schemas. The problem is that if you had a matview that referenced a system view for something that was restored after DO_REFRESH_MATVIEW (such as subscriptions), then the view would be inaccurate after the restore.

Stephen, hopefully that answers your question as well. :)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump does not refresh matviews from extensions
On 2/21/17 2:05 PM, Stephen Frost wrote:
> As for $SUBJECT, I feel like it really depends, doesn't it? If the extension creates the matview w/ no data in it, and doesn't mark it as a config table, should we really refresh it? On the other hand, if the extension creates the matview and either refreshes it, or something else refreshes it later, then perhaps we should do so too, to get us back to the same state.

I didn't think to test marking the matview as dumpable. If I do that then a refresh item does get created, and it's actually based on whether the view contains any data. We should at least document that.

Now that I know that, I guess I'm kinda on the fence about doing it automatically, because AFAIK there'd be no way to override that automatic behavior. I can't really conceive of any reason you wouldn't want the refresh, but since it's not happening today...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 12:29 PM, Peter Eisentraut wrote:
> On 2/22/17 10:14, Jim Nasby wrote:
>> CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription;
>> SELECT 0
>>
>> IOW, you can create matviews that depend on any other table/view/matview, but right now if the matview includes certain items it will mysteriously end up empty post-restore.
> Yes, by that logic matview refresh should always be last.

Patches for head attached. RLS was the first item added after DO_REFRESH_MATVIEW, which was added in 9.5. So if we want to treat this as a bug, they'd need to be patched as well, which is a simple matter of swapping 33 and 34.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX

diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index ea643397ba..708a47f3cb 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -26,6 +26,9 @@ static const char *modulename = gettext_noop("sorter");
  * Sort priority for database object types.
  * Objects are sorted by type, and within a type by name.
  *
+ * Because materialized views can potentially reference system views,
+ * DO_REFRESH_MATVIEW should always be the last thing on the list.
+ *
  * NOTE: object-type priorities must match the section assignments made in
  * pg_dump.c; that is, PRE_DATA objects must sort before DO_PRE_DATA_BOUNDARY,
  * POST_DATA objects must sort after DO_POST_DATA_BOUNDARY, and DATA objects
@@ -70,11 +73,11 @@ static const int dbObjectTypePriority[] =
 	22,			/* DO_PRE_DATA_BOUNDARY */
 	26,			/* DO_POST_DATA_BOUNDARY */
 	33,			/* DO_EVENT_TRIGGER */
-	34,			/* DO_REFRESH_MATVIEW */
-	35,			/* DO_POLICY */
-	36,			/* DO_PUBLICATION */
-	37,			/* DO_PUBLICATION_REL */
-	38			/* DO_SUBSCRIPTION */
+	38,			/* DO_REFRESH_MATVIEW */
+	34,			/* DO_POLICY */
+	35,			/* DO_PUBLICATION */
+	36,			/* DO_PUBLICATION_REL */
+	37			/* DO_SUBSCRIPTION */
 };

 static DumpId preDataBoundId;
Re: [HACKERS] Replication vs. float timestamps is a disaster
On 2/22/17 9:12 AM, Andres Freund wrote:
>> That would allow an in-place upgrade of a really large cluster. A user would still need to modify their code to use the new type. Put another way: add ability for pg_upgrade to change the type of a field. There might be other uses for that as well.
> Type oids are unfortunately embedded into composite and array type data - we can do such changes for columns themselves, but it doesn't work if there's any array/composite members containing the to-be-changed type that are used as columns.

Only in the catalog though, not the datums, right? I would think you could just change the oid in the catalog the same as you would for a table column.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] mat views stats
On 2/22/17 7:56 AM, Peter Eisentraut wrote:
>> What behavior would we like by default?
> Refreshing a materialized view is a pretty expensive operation, so I think scheduling an analyze quite aggressively right afterwards is often what you want. I think sending a stats message with the number of inserted rows could make sense.

+1 on both counts. And if sane analyze behavior does depend on the stats changes then there's no real advantage to a separate patch.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] GRANT EXECUTE ON FUNCTION foo() TO bar();
On 2/22/17 2:51 AM, Pavel Stehule wrote:
> The solution based on rights is elegant, but at this moment I cannot see all the possible impacts on performance - because it means a new check for any call of any function. Maybe checking the call stack can be good enough - I have no idea how common a use case it is.

I think the simple solution to that is not to use proacl for this purpose but to add an oidvector to pg_proc that is a list of allowed callers. If the vector is kept sorted then it's a simple binary search.

BTW, I agree that this feature would be useful, as would PRIVATE, but they're two separate features.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 8:00 AM, Peter Eisentraut wrote:
>> Actually, I think matviews really need to be the absolute last thing. What if you had a matview that referenced publications or subscriptions? I'm guessing that would be broken right now.
> I'm not sure what you have in mind here. Publications and subscriptions don't interact with materialized views, so the relative order doesn't really matter.

CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription;
SELECT 0

IOW, you can create matviews that depend on any other table/view/matview, but right now if the matview includes certain items it will mysteriously end up empty post-restore.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Replication vs. float timestamps is a disaster
On 2/22/17 7:56 AM, Andres Freund wrote:
> On 2017-02-22 08:43:28 -0500, Tom Lane wrote:
>> Andres Freund <and...@anarazel.de> writes:
>>> On 2017-02-22 00:10:35 -0600, Jim Nasby wrote:
>>>> I wonder if a separate "floatstamp" data type might fit the bill there. It might not be completely seamless, but it would be binary compatible.
>>> I don't really see what'd that solve.
>> Seems to me this is a different name for what I already tried in <27694.1487456...@sss.pgh.pa.us>. It would be much better than doing nothing, IMO, but it would still leave lots of opportunities for mistakes.
> It sounded more like Jim suggested a full blown SQL type, given that he replied to my concern about the possible need for a deprecation period due to pg_upgrade concerns. To be useful for that, we'd need a good chunk of magic, so all existing uses of timestamp[tz] are replaced with floattimestamp[tz], duplicate some code, add implicit casts, and accept that composites/arrays won't be fixed. That sounds like a fair amount of work to me, and we'd still have no way to remove the code without causing pain.

Right, but I was thinking more in line with just providing the type (as an extension, perhaps not even in core) and making it possible for pg_upgrade to switch fields over to that type. That would allow an in-place upgrade of a really large cluster. A user would still need to modify their code to use the new type.

Put another way: add ability for pg_upgrade to change the type of a field. There might be other uses for that as well.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Replication vs. float timestamps is a disaster
On 2/20/17 5:04 AM, Andres Freund wrote:
> On 2017-02-20 11:58:12 +0100, Petr Jelinek wrote:
>> That being said, I did wonder myself if we should just deprecate float timestamps as well.
> I think we need a proper deprecation period for that, given that the conversion away will be painful for pg_upgrade using people with big clusters. So I think we should fix this regardless... :(

I wonder if a separate "floatstamp" data type might fit the bill there. It might not be completely seamless, but it would be binary compatible.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Replication vs. float timestamps is a disaster
On 2/21/17 4:52 PM, James Cloos wrote:
>>>>> "TL" == Tom Lane <t...@sss.pgh.pa.us> writes:
> TL> The question to be asked is whether there is still anybody out there
> TL> using float timestamps.
>
> Gentoo's ebuild includes:
>
>     $(use_enable !pg_legacytimestamp integer-datetimes) \

FWIW, last time I looked it was also an option in FreeBSD's ports, though I think it's defaulted to int since forever ago (like, 7.4 era).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/21/17 4:25 PM, Peter Eisentraut wrote:
> On 2/21/17 14:58, Jim Nasby wrote:
>> AFAICT in older versions only object types that absolutely had to wait for DO_POST_DATA_BOUNDARY would do so. More recently though, objects are being added after that (presumably because it's easier than renumbering everything in dbObjectTypePriority).
> Is there any specific assignment that you have concerns about?

Originally, no, but reviewing the list again I'm kind of wondering about DO_DEFAULT_ACL, especially since the acl code in pg_dump looks at defaults as part of what removes the need to explicitly dump permissions. I'm also wondering if DO_POLICY could potentially affect matviews?

Actually, I think matviews really need to be the absolute last thing. What if you had a matview that referenced publications or subscriptions? I'm guessing that would be broken right now.

>> Is this change a good or bad idea? Should there be an official guide for where new things go?
> The comment above dbObjectTypePriority explains it, doesn't it?

Not really; it just makes reference to needing to be in sync with pg_dump.c. My concern is that clearly people went to lengths in the past to put everything possible before DO_PRE_DATA_BOUNDARY (i.e., text search and FDW) but most recently added stuff has gone after DO_POST_DATA_BOUNDARY, even though there's no reason it couldn't be pre-data. That's certainly a change, and I suspect it's not intentional (other than it's obviously less work to stick stuff at the end, but that could be fixed by having an array of the actual enum values and just having pg_dump sort that when it starts).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] mat views stats
On 2/21/17 4:22 PM, Peter Eisentraut wrote:
>> Attached is a patch to trigger autovacuum based on a matview refresh along with a system view pg_stat_all_matviews to show information more meaningful for materialized views.
> It might be easier to include materialized views into pg_stat_*_tables.

Certainly easier, but I don't think it'd be better. Matviews really aren't the same thing as tables. Off-hand (without reviewing the patch), update and delete counts certainly wouldn't make any sense. "Insert" counts might, in as much as it's how many rows have been added by refreshes. You'd want a refresh count too.

> I think these should be two separate patches. We might want to backpatch the first one.

+1; definitely sounds like a bug to me.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Adding new output parameter of pg_stat_statements to identify operation of the query.
On 2/19/17 10:02 PM, Tom Lane wrote:
> Jim Nasby <jim.na...@bluetreble.com> writes:
>> Something that needs to be considered with doing this in pg_stat_statements is that a query that's reported there can contain multiple SQL statements. I don't remember offhand if all statements get parsed as a whole before anything else happens; if that's the case then you could potentially have an array in pg_stat_statements indicating what the command tags are.
> I think that's been addressed as of 83f2061dd. My own concern here is that pg_stat_statements shared hashtable entries (pgssEntry) are currently 200 bytes, if I counted correctly. It's hard to see how to implement this feature without adding COMPLETION_TAG_BUFSIZE (64 bytes) to that, which is kind of a large percentage bump for a feature request that AFAIR nobody else has ever made.

AFAIK the only variable part of any tag is the rowcount from SELECT (if that's even part of the tag?)... so couldn't tags be switched over to an enum, at least internally?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
[HACKERS] Change in "policy" on dump ordering?
AFAICT in older versions only object types that absolutely had to wait for DO_POST_DATA_BOUNDARY would do so. More recently though, objects are being added after that (presumably because it's easier than renumbering everything in dbObjectTypePriority).

Is this change a good or bad idea? Should there be an official guide for where new things go?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
[HACKERS] pg_dump does not refresh matviews from extensions
I think $SUBJECT is a bug. While it would be unusual for an extension to have a matview, it's still allowed, and as it stands right now that view won't be refreshed at the end of a restore, unlike other matviews.

I started looking into a patch for this, but I'm not sure of the best way to handle it. One possibility is to glom this in with the code that handles extension config tables, but that doesn't feel right since matviews aren't really config. Would the best option be to change selectDumpableTable()? I suspect that'd have to use the same "dumpobj" logic that processExtensionTables() uses.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] GUC for cleanup indexes threshold.
On 2/19/17 7:56 PM, Masahiko Sawada wrote:
> The half-dead pages are never cleaned up if the ratio of pages containing garbage is always lower than the threshold. Also in the gin index the pending list is never cleared, which becomes a big problem. I guess that we should take action for each type of index.

What worries me is that each AM is going to have a different notion of what needs to happen to support this. That indicates that trying to handle this at the vacuum level is not a good idea.

I think it would be wiser to add support for skipping scans to the AM API instead. That also means you don't have to add support for this to every index type to start with.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Adding new output parameter of pg_stat_statements to identify operation of the query.
On 2/19/17 6:34 PM, Tsunakawa, Takayuki wrote:
>> We have done the job and are willing to post a patch. I sent one through my work mail, but it seems that my mail didn't reach the mailing list, so I'm trying again using my personal mail account.
> A view for counting the number of executions per operation type is being developed for PostgreSQL 10, which is expected to be released this year. https://commitfest.postgresql.org/13/790/ Would this fit your need? If not, what's the benefit of getting the operation type via pg_stat_statements?

Something that needs to be considered with doing this in pg_stat_statements is that a query that's reported there can contain multiple SQL statements. I don't remember offhand if all statements get parsed as a whole before anything else happens; if that's the case then you could potentially have an array in pg_stat_statements indicating what the command tags are.

Short of that, I'm not sure it would be a good idea to only support a single tag being visible at a time; it would be certain to induce users to create code that's going to be buggy as soon as someone starts using multiple statements.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure
On 2/19/17 11:02 AM, David Christensen wrote:
> My design notes for the patch were submitted to the list with little comment; see: https://www.postgresql.org/message-id/1E6E64E9-634B-43F4-8AA2-CD85AD92D2F8%40endpoint.com
> I have since added the WAL logging of checksum states, however I’d be glad to take feedback on the other proposed approaches (particularly the system catalog changes + the concept of a checksum cycle).

A couple notes:

- AFAIK unlogged tables get checksummed today if checksums are enabled; the same should hold true if someone enables checksums on the whole cluster.
- Shared relations should be handled as well; you don't mention them.
- If an entire cluster is going to be considered as checksummed, then even databases that don't allow connections would need to get enabled.

I like the idea of revalidation, but I'd suggest leaving that off of the first pass.

It might be easier on a first pass to look at supporting per-database checksums (in this case, essentially treating shared catalogs as their own database). All normal backends do per-database stuff (such as setting current_database) during startup anyway. That doesn't really help for things like recovery and replication though. :/ And there's still the question of SLRUs (or are those not checksum'd today??).

BTW, it occurs to me that this is related to the problem we have with trying to make changes that break page binary compatibility. If we had a method for handling that it would probably be useful for enabling checksums as well. You'd essentially treat an un-checksum'd page as if it was an "old page version". The biggest problem there is dealing with the potential that the new page needs to be larger than the old one was, but maybe there's some useful progress to be had in this area before tackling the "page too small" problem.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] case_preservation_and_insensitivity = on
On 2/19/17 4:51 PM, Joel Jacobson wrote:
>> But once you've already decided to have a hard-and-fast rule that the names must be unique after lower-casing, there's no obvious benefit to rejecting queries that mention the same name with different case.
> Exactly, that trade-off is necessary, otherwise such queries would be ambiguous.
>
> I think a good general philosophy for the PostgreSQL project would be to try to look at how to meet the needs of new users of new projects in a way that doesn't impair things for existing users, by accepting that the new users might have to live with some trade-offs for their new feature to be possible to implement. In this case the trade-off is to not be able to create objects of different casing with the same lowercase names, a trade-off that I personally think would not be a problem for most projects, since it seems unlikely you would have both a "users" table and a "Users" table in the same database.

There's a serious problem with that, though: there certainly *could* be existing users that depend on the difference between "Users" and users, and there's no way we can just leave them out in the cold. Even if the project decided that "Users" vs. users is stupid and that we should deprecate it, I think the odds of also deciding to tell existing users to re-write their apps are zero. So no matter how this is designed, there has to be some way for existing users to be able to continue relying on "Users" and users being different.

AFAICT that rules out any chance of this being a GUC, because you can't take a GUC into consideration when creating a unique index. What would work is an initdb option that controls this: when ignoring case for uniqueness is disabled, your new column would simply be left as NULL. With some extra effort you could probably allow changing that on a running database as well, just not with something as easy to change as a GUC.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] pg_get_object_address() doesn't support composites
On 2/18/17 4:26 PM, Jim Nasby wrote:
> On 2/17/17 9:53 PM, Alvaro Herrera wrote:
>> Jim Nasby wrote:
>>> See below. ISTM that pg_get_object_address should support everything pg_identify_object_as_address can output, no? I'm guessing the answer here is to have pg_identify_object_as_address complain if you ask it for something that's not mappable.
>> Yes, I think we should just reject the case in pg_identify_object_as_address.
> Attached patch does that, and tests for it. Note that there were some unsupported types that were not being tested before. I've added a comment requesting people update the test if they add more types.

While testing a view on pg_depend, I discovered this has the unfortunate side-effect of meaning you can no longer use pg_identify_object_as_address against pg_depend.ref*. Using it against pg_depend was already problematic though, because it throws an error on the pinned objects if you try and hand it classid, objid or objsubid. So maybe it's OK.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Reporting xmin from VACUUMs
On 2/19/17 3:43 AM, Robert Haas wrote:
> This is the kind of information that you really want to see once per autovac, though, not just the most recent autovac or some kind of cumulative total. Knowing that I've done 301 index scans in my last 300 vacuums is not nearly as useful as knowing which autovacuum did 2 index scans and what exactly was going on with that vacuum. So I'm not sure including this sort of thing in the stats files would be very useful, or at least you'd want to think carefully about how to do it.

Well, counters would be better than nothing I think, but I agree with your concern. Really, that's a problem for the entire stats system to varying degrees.

> As far as bloating the stats file is concerned, the big problem there is that our current design for the stats file requires rewriting the entire thing any time we want to update even a single byte of data. We could fix that by splitting up the files more so that they are smaller and faster to rewrite, but we could also fix it by coming up with a way of rewriting just one part of a file instead of the whole thing, or we could think about storing it in DSM so that you don't have to rewrite anything at all. I think that last option is worth some serious study now that we have DSA, but it's currently not very high on my personal priority list.

Hmm... so basically replace the temporary file with DSM? Something else I think would be useful is a way to subscribe to stats updates.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
[HACKERS] Deprecate floating point timestamps.
Over in the "Keeping pg_recvlogical's "feTimestamp" separate from TimestampTz" thread...

On 2/17/17 12:15 PM, Tom Lane wrote:
> I am not sure that it was really a good design to pretend that the replication protocol is independent of --disable-integer-datetimes when the underlying WAL stream most certainly isn't.

Ok, I'll open the can of worms...

- Should replication be changed to obey --disable-integer-datetimes?
- Should we consider formally deprecating FP timestamps, starting with no longer supporting SR?

While #2 may sound rather severe, I'm wondering if a different datatype for timestamps stored as floats would ease that pain.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] pg_get_object_address() doesn't support composites
On 2/17/17 9:53 PM, Alvaro Herrera wrote: Jim Nasby wrote: See below. ISTM that pg_get_object_address should support everything pg_identify_object_as_address can output, no? I'm guessing the answer here is to have pg_identify_object_as_address complain if you ask it for something that's not mappable. Yes, I think we should just reject the case in pg_identify_object_as_address. Attached patch does that, and tests for it. Note that there were some unsupported types that were not being tested before. I've added a comment requesting people update the test if they add more types.
-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)

diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index 2a38792ed6..27ac6ca79a 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -488,7 +488,8 @@ static const ObjectPropertyType ObjectProperty[] =
 * do not have corresponding values in the output enum. The user of this map
 * must be careful to test for invalid values being returned.
 *
- * To ease maintenance, this follows the order of getObjectTypeDescription.
+ * To ease maintenance, this follows the order of getObjectTypeDescription. If
+ * you add anything here please update test/regress/sql/object_address.sql.
 */
 static const struct object_type_map
 {
@@ -3634,6 +3635,7 @@ pg_identify_object_as_address(PG_FUNCTION_ARGS)
 	Oid			objid = PG_GETARG_OID(1);
 	int32		subobjid = PG_GETARG_INT32(2);
 	ObjectAddress address;
+	char	   *type;
 	char	   *identity;
 	List	   *names;
 	List	   *args;
@@ -3646,6 +3648,13 @@ pg_identify_object_as_address(PG_FUNCTION_ARGS)
 	address.objectId = objid;
 	address.objectSubId = subobjid;
 
+	/* Verify pg_get_object_address() would be able to do something with this type */
+	type = getObjectTypeDescription(&address);
+	if (read_objtype_from_string(type) < 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("unsupported object type \"%s\"", type)));
+
 	/*
 	 * Construct a tuple descriptor for the result row. This must match this
 	 * function's pg_proc entry!
@@ -3661,7 +3670,7 @@ pg_identify_object_as_address(PG_FUNCTION_ARGS)
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	/* object type */
-	values[0] = CStringGetTextDatum(getObjectTypeDescription(&address));
+	values[0] = CStringGetTextDatum(type);
 	nulls[0] = false;
 
 	/* object identity */
diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out
index ec5ada97ad..4e99068425 100644
--- a/src/test/regress/expected/object_address.out
+++ b/src/test/regress/expected/object_address.out
@@ -50,8 +50,9 @@ DO $$
 DECLARE
 	objtype text;
 BEGIN
-	FOR objtype IN VALUES ('toast table'), ('index column'), ('sequence column'),
-		('toast table column'), ('view column'), ('materialized view column')
+	FOR objtype IN VALUES ('toast table'), ('composite type'), ('index column'),
+		('sequence column'), ('toast table column'), ('view column'),
+		('materialized view column'), ('composite type column')
 	LOOP
 		BEGIN
 			PERFORM pg_get_object_address(objtype, '{one}', '{}');
@@ -62,11 +63,52 @@ BEGIN
 END;
 $$;
 WARNING: error for toast table: unsupported object type "toast table"
+WARNING: error for composite type: unsupported object type "composite type"
 WARNING: error for index column: unsupported object type "index column"
 WARNING: error for sequence column: unsupported object type "sequence column"
 WARNING: error for toast table column: unsupported object type "toast table column"
 WARNING: error for view column: unsupported object type "view column"
 WARNING: error for materialized view column: unsupported object type "materialized view column"
+WARNING: error for composite type column: unsupported object type "composite type column"
+DO $$
+DECLARE
+	toastid oid;
+	classid oid;
+	objid oid;
+	objsubid int;
+	objtype text;
+BEGIN
+	SELECT INTO STRICT toastid
+			reltoastrelid
+		FROM pg_class
+		WHERE oid = 'addr_nsp.gentable'::regclass
+	;
+	FOR classid, objid, objsubid, objtype IN VALUES
+		(1259, toastid, 0, 'toast table'),
+		(1259, 'addr_nsp.gencomptype'::regclass, 0, 'composite type'),
+		(1259, 'addr_nsp.gentable_pkey'::regclass, 1, 'index column'),
+
Re: [HACKERS] "SQL sentence"?
On 2/17/17 10:46 PM, Alvaro Herrera wrote: Sure. We have the extension that turned the command into JSON. It's still an unfinished patch, sadly, even though Alex Shulgin spent a lot of effort trying to get it finished. It is still missing a nontrivial amount of work, but within reach ISTM. You're speaking of https://www.postgresql.org/message-id/CACACo5Q_UXYwF117LBhjZ3xaMPyrgqnqE%3DmXvRhEfjJ51aCfwQ%40mail.gmail.com ? Can you reply to that to restart discussion? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] "SQL sentence"?
On 2/17/17 10:46 PM, Alvaro Herrera wrote: Jim Nasby wrote: On 2/17/17 10:19 PM, Alvaro Herrera wrote: (FWIW, I'm wondering because I was just looking to see why there's no details for things like altering a column in a table.) Do you mean you want to have access to the details of the alter table operations being executed? There's no structured data for that; you need to write a C function to examine the pg_ddl_command opaque column. Yeah. It doesn't seem unreasonable for a user to want to get at that info. Sure. We have the extension that turned the command into JSON. It's still an unfinished patch, sadly, even though Alex Shulgin spent a lot of effort trying to get it finished. It is still missing a nontrivial amount of work, but within reach ISTM. I wonder if there's some generic way we could turn C structs into JSON. IIRC there are other places where that ability would be handy. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
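A generic struct-to-JSON walker is easy to prototype once field names and types are available as metadata; the hard part for pg_ddl_command is that the backend's Node structs (and especially their unions) would need that metadata exposed somehow. A toy sketch of the idea using Python's ctypes, with entirely hypothetical struct names:

```python
import ctypes
import json

class AlterColumnCmd(ctypes.Structure):
    # Hypothetical sub-command struct, purely to exercise the walker.
    _fields_ = [("colno", ctypes.c_int),
                ("newtypmod", ctypes.c_int)]

class AlterTableCmd(ctypes.Structure):
    _fields_ = [("relid", ctypes.c_uint),
                ("nsubcmds", ctypes.c_int),
                ("subcmd", AlterColumnCmd)]

def struct_to_dict(obj):
    """Walk the declared fields generically; recurse into nested structs."""
    out = {}
    for name, _ctype in obj._fields_:
        val = getattr(obj, name)
        out[name] = struct_to_dict(val) if isinstance(val, ctypes.Structure) else val
    return out

cmd = AlterTableCmd(relid=16384, nsubcmds=1,
                    subcmd=AlterColumnCmd(colno=2, newtypmod=64))
print(json.dumps(struct_to_dict(cmd)))
# {"relid": 16384, "nsubcmds": 1, "subcmd": {"colno": 2, "newtypmod": 64}}
```

Unions are the sticking point: a walker like this can't know which union member is currently valid without a discriminant field, which is exactly why pg_ddl_command currently needs hand-written C to unpack.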
Re: [HACKERS] pg_get_object_address() doesn't support composites
On 2/17/17 9:53 PM, Alvaro Herrera wrote: Another way to think about this problem is an approach Peter E suggested not long ago, which was to change the objname/objargs representation more completely. Hrm, I didn't see that. What was the idea? BTW, I do find it odd (and might eventually find it irritating) that some objnames squash schema and name into a single element. Not sure that's worth fixing at this point, though. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] "SQL sentence"?
On 2/17/17 10:19 PM, Alvaro Herrera wrote: (FWIW, I'm wondering because I was just looking to see why there's no details for things like altering a column in a table.) Do you mean you want to have access to the details of the alter table operations being executed? There's no structured data for that; you need to write a C function to examine the pg_ddl_command opaque column. Yeah. It doesn't seem unreasonable for a user to want to get at that info. Could the opaque column be mapped to a composite? I guess that'd be a bit of a pain due to the union. :/ And I have a suspicion that alterTable.subcmds is a list of CollectedCommand, making things more fun. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Official adoption of PGXN
On 2/14/17 2:39 PM, Andres Freund wrote: One part of this would need to be having a designated committee of the Postgres community pick a set of "blessed" extensions for packagers to package. Right now, contrib serves that purpose (badly). One of the reasons we haven't dealt with the extension distribution problem is that nobody wanted to take on the issue of picking a list of blessed extensions. > I don't see the trust problem being solved by them being blessed unless they're part of the regularly scheduled postgres back-branch releases. Which essentially requires them to be in core, or increase the release maintenance/management cost further. I don't see why we'd have to touch blessed PGXN extensions any time there's a back-branch release. We don't do that with what's in core today. If we did want to do that (say, so that the extension version always matched the PG version), it wouldn't be hard to automate. Obviously if there's a bug in an extension we'd want to do something, but tying that to the release would be completely optional. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
[HACKERS] "SQL sentence"?
I'm confused by this: "pg_event_trigger_ddl_commands returns one row for each base command executed; some commands that are a single SQL sentence may return more than one row." What is a "SQL sentence"? (FWIW, I'm wondering because I was just looking to see why there's no details for things like altering a column in a table.) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
[HACKERS] pg_get_object_address() doesn't support composites
See below. ISTM that pg_get_object_address should support everything pg_identify_object_as_address can output, no? I'm guessing the answer here is to have pg_identify_object_as_address complain if you ask it for something that's not mappable.

~@decina.local/5621# CREATE TYPE comp AS (a int, b int);
CREATE TYPE
~@decina.local/5621# select * from pg_identify_object_as_address(1259,'comp'::regclass, 0);
      type      | object_names  | object_args
----------------+---------------+-------------
 composite type | {public,comp} | {}
(1 row)

~@decina.local/5621# select * from pg_get_object_address('composite type', '{public,comp}', '{}');
ERROR: unsupported object type "composite type"
~@decina.local/5621#
-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On 2/14/17 5:18 PM, Robert Haas wrote:
> On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
>> I'd rather have a --quiet mode instead. If you're running it by hand, you're likely to omit the switch, whereas when writing the cron job you're going to notice the lack of a switch even before you let the job run once.
> Well, that might've been a better way to design it, but changing it now would break backward compatibility and I'm not really sure that's a good idea.

Meh... it's really only going to affect cron jobs or scripts, which are easy enough to fix, and you're not going to have that many of them (or if you do you certainly have an automated way to push the update).

> Even if it is, it's a separate concern from whether or not in the less-quiet mode we should point out that we're waiting for a checkpoint on the server side.

Well, --quiet was suggested because of confusion from pg_basebackup twiddling its thumbs...
-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Does having a NULL column automatically exclude the table from the tupleDesc cache?
On 2/15/17 1:37 PM, Ryan Murphy wrote: attcacheoff can only be set positive for fields preceding any varlena (typlen<0, but including the first such) or nullable values. I don't know how much faster it is with the cache; you can measure it if your curiosity is strong enough -- just set the first column to nullable. Thanks! Maybe I'll do some benchmarks. You'll probably want to do those at a C level, bypassing the executor. I would guess that executor overhead will completely swamp the effect of the cache in most cases. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
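To restate the caching rule above in executable form: an attribute's offset can be precomputed only while everything before it is fixed-width and cannot be NULL; the first varlena column itself still gets a cached offset, but nothing after it does, because its length varies per tuple. A toy model (plain Python, ignoring alignment padding; not the actual attcacheoff code):

```python
# (name, byte_len or None for varlena, nullable)
columns = [("id",    8,    False),
           ("flags", 4,    False),
           ("note",  None, False),   # varlena, e.g. text
           ("score", 8,    False)]   # offset depends on len(note): not cacheable

def cacheable_offsets(cols):
    offsets, off = {}, 0
    for name, width, nullable in cols:
        if nullable:
            break                    # a possible NULL makes this and later offsets dynamic
        offsets[name] = off          # fixed-width run (or the first varlena): cacheable
        if width is None:
            break                    # varlena length varies per tuple; stop here
        off += width
    return offsets

print(cacheable_offsets(columns))
# {'id': 0, 'flags': 8, 'note': 12}

# Making the *first* column nullable kills offset caching entirely,
# which is why it was suggested above as a way to benchmark the cache:
print(cacheable_offsets([("id", 8, True)] + columns[1:]))
# {}
```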