[HACKERS] close_ps, NULLs, and DirectFunctionCall
(From IRC discussion with Andreas about some sqlsmith errors) Commit 278148907a9 changed close_ps in geo_ops.c to return SQL NULL in the event that a valid result point was not found (rather than crashing or failing an assert). But close_ps is called with DirectFunctionCall from other close_* functions, and indirectly from dist_* functions. So while the specific case of (point ## lseg) returns a NULL with no error in this case, for other data types, or for certain dist_* calls, rather than a null result you get this: postgres=# select point(1,2) ## '((0,0),(NaN,NaN))'::box; ERROR: function 0x9c5de0 returned NULL postgres=# select point(1,2) <-> '((0,0),(NaN,NaN))'::box; ERROR: function 0x9c5de0 returned NULL This seems ... unhelpful (though it's at least better than crashing) and inconsistent. Does this need fixing, and if so how? The most consistent fix would seem to be to make all the affected functions return NULL, which probably requires factoring out a bunch of close_*_internal functions and replacing the DirectFunctionCall usage. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inadequate infrastructure for NextValueExpr
> "Thomas" == Thomas Munrowrites: >> [...] >> T_NamedTuplestoreScan can be produced by outfuncs.c with tagname >> NAMEDTUPLESTORESCAN but that tagname is not recognized by readfuncs.c >> [...] >> >> That revealed a defect in commit >> 18ce3a4ab22d2984f8540ab480979c851dae5338 which I think should be >> corrected with something like the attached, though I'm not sure if >> it's possible to reach it. Thomas> Adding Kevin and Andrew G. Thoughts on whether this is a Thomas> defect that should be corrected with something like Thomas> read-namedtuplestorescan.patch? It's a defect that should probably be corrected for consistency, though at present it looks like it's not actually possible to reach the code. The patch looks good to me though I've not tested it. Kevin, you want to take it? Or shall I deal with it? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY vs. transition tables
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: >>>>> "Thomas" == Thomas Munro <thomas.mu...@enterprisedb.com> writes: Thomas> Here it is. Added to open items. Andrew> On it. Committed. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY vs. transition tables
> "Thomas" == Thomas Munrowrites: Thomas> Here it is. Added to open items. On it. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] transition table behavior with inheritance appears broken
Commits pushed. Unless I broke the buildfarm again (which I'll check up on later), or some new issue arises with the fixes, this should close all 3 related items for transition tables. -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] transition table behavior with inheritance appears broken
> "Noah" == Noah Mischwrites: Noah> IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is Noah> long past due for your status update. Please reacquaint yourself Noah> with the policy on open item ownership[1] and then reply Noah> immediately. If I do not hear from you by 2017-06-28 06:00 UTC, Noah> I will transfer this item to release management team ownership Noah> without further notice. Sorry for the lack of updates. I need to sleep now, but I will send a proper status update by 1800 UTC (1900 BST) today (28th). -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] transition table behavior with inheritance appears broken
> "Noah" == Noah Mischwrites: Noah> This PostgreSQL 10 open item is past due for your status update. Noah> Kindly send a status update within 24 hours, oops, sorry! I forgot to include a date in the last one, and in fact a personal matter delayed things anyway. I expect to have this wrapped up by 23:59 BST on the 24th. -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table
> "Thomas" == Thomas Munrowrites: Thomas> Thanks both for the review. New version of patch #2 attached. I'm looking to commit this soon; if anyone has any further comment now would be a good time to speak up. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transition tables vs ON CONFLICT
> "Thomas" == Thomas Munrowrites: Thomas> That accidentally removed a comment that I wanted to keep. Thomas> Here is a better version. I plan to commit this soon; if anyone has any comment to make, now would be a good time. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] transition table behavior with inheritance appears broken
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: Andrew> Unfortunately I've been delayed over the past couple of days, Andrew> but I have Thomas' latest patchset in hand and will be working Andrew> on it over the rest of the week. Status update by 23:59 BST on Andrew> Sun 18th, by which time I hope to have everything finalized Andrew> (all three issues, not just the inheritance one). I have, I believe, completed my review of the patchset. My conclusion is that the fix appears to be sound and I haven't been able to find any further issues with it; so I think Thomas's patches should be committed as-is. Unless anyone objects I will do this within the next few days. (Any preferences for whether it should be one commit or 3 separate ones?) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] transition table behavior with inheritance appears broken
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: Andrew> I will post a further status update before 23:59 BST on 14th Andrew> Jun. Unfortunately I've been delayed over the past couple of days, but I have Thomas' latest patchset in hand and will be working on it over the rest of the week. Status update by 23:59 BST on Sun 18th, by which time I hope to have everything finalized (all three issues, not just the inheritance one). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] transition table behavior with inheritance appears broken
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: Andrew> I have it; I will post a status update before 23:59 BST on 11 Andrew> Jun. This is that status update. I am still studying Thomas' latest patch set; as I mentioned in another message, I've confirmed a memory leak, and I expect further work may be needed in some other areas as well, but I think we're still making progress towards fixing it and I will work with Thomas on it. I will post a further status update before 23:59 BST on 14th Jun. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table
> "Robert" == Robert Haaswrites: Robert> I don't see a reason why MakeTransitionCaptureState needs to Robert> force the tuplestores into TopTransactionContext or make them Robert> owned by TopTransactionResourceOwner. Nor do I, and I'm pretty sure it's leaking memory wholesale within a transaction if you have aborted subxacts. e.g. a few iterations of this, on a table with an appropriate trigger: savepoint s1; insert into foo select i, case when i < 10 then repeat('a',100) else repeat('a',1/(i-10)) end from generate_series(1,10) i; rollback to s1; This is a bit contrived of course but it shows that there's missing cleanup somewhere, either in the form of poor choice of context or missing code in the subxact cleanup. Robert> I mean, it was like that before, but afterTriggers is a global Robert> variable and, potentially, there could still be a pointer Robert> accessible through it to a tuplestore linked from it even after Robert> the corresponding subtransaction aborted, possibly causing some Robert> cleanup code to trip and fall over. But that can't be used to Robert> justify this case, because the TransitionCaptureState is only Robert> reached through the PlanState tree; if that goes away, how is Robert> anybody going to accidentally follow a pointer to the Robert> now-absent tuplestore? For per-row triggers with transition tables, a pointer to the transition capture state ends up in the shared-data record in the event queue? -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] transition table behavior with inheritance appears broken
> "Robert" == Robert Haaswrites: Robert> So, Andrew, are you running with this, or should I keep looking Robert> into it? I have it; I will post a status update before 23:59 BST on 11 Jun. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table
> "Robert" == Robert Haaswrites: Robert> unless some other committer volunteers. (Of course, anyone Robert> could step in to do the work, as Thomas already has to a Robert> considerable degree, but without a committer involved it Robert> doesn't fix the problem.) I can probably take this on if needed. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table
> "Thomas" == Thomas Munrowrites: Thomas> So, afterTriggers.query_stack is used to handle the reentrancy Thomas> that results from triggers running further statements that Thomas> might fire triggers. It isn't used for dealing with extra Thomas> ModifyTable nodes that can appear in a plan because of wCTEs. Thomas> Could it also be used for that purpose? I think that would Thomas> only work if it is the case that each ModifyTable node begin Thomas> and then runs to completion (ie no interleaving of wCTE Thomas> execution) and then its AS trigger fires, which I'm not sure Thomas> about. There's a problem with this which I didn't see anyone mention (though I've not read the whole history); existing users of wCTEs rely on the fact that no AFTER triggers are run until _all_ modifications are completed. If you change that, then you can't use wCTEs to insert into tables with FK checks without knowing the order in which the individual wCTEs are executed, which is currently a bit vague and non-obvious (which doesn't cause issues at the moment only because nothing actually depends on the order). for example: create table foo (id integer primary key); create table foo_bar (foo_id integer references foo, bar_id integer); with i1 as (insert into foo values (1)) insert into foo_bar values (1,2),(1,3); This would fail the FK check if each insert did its own trigger firing, since the foo_bar insert is actually run _first_. Firing triggers earlier than they currently are would thus break existing working queries. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assertion failure in REL9_5_STABLE
> "Pavan" == Pavan Deolaseewrites: Pavan> I am attaching a patch that throws a similar ERROR during Pavan> planning even for 9.5. AFAICS in presence of grouping sets, we Pavan> always decide to use sort-based implementation for grouping, but Pavan> do not check if the columns support ordering or not. Hmmm. This is obviously my fault somewhere... the patch looks good, I'll deal with it shortly. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans
> "Alexander" == Alexander Kuzmenkovwrites: Alexander> Structurally, the patch consists of two major parts: a Alexander> specialized executor node Why? It strikes me that the significant fact here is not that we're doing count(*), but that we don't need any columns from the bitmap heap scan result. Rather than creating a whole new node, can't the existing bitmap heapscan be taught to skip fetching the actual table page in cases where it's all-visible, not lossy, and no columns are needed? (this would also have the advantage of getting parallelism for free) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ongoing issues with representation of empty arrays
> "Tom" == Tom Lanewrites: >> First is contrib/intarray, _AGAIN_ (see past bugs such as #7730): >> ... >> I plan to fix this one properly, unless anyone has any objections. Tom> Just to clarify, what do you think is "properly"? I would say, that any time an intarray function returns an empty result it should be the standard 0-dimensional representation that every other array operation uses. The intarray functions all seem already able to take such values as inputs. Also there should be regression tests for this (none of intarray's existing tests have any empty arrays at all). >> Second is aclitem[], past bug #8395 which was not really resolved; empty >> ACLs are actually 1-dim arrays of length 0, and all the ACL functions >> insist on that, which means that you can't call aclexplode('{}') for >> example: >> It's much less clear what to do about this one. Thoughts? Tom> My instinct is to be conservative in what you send and liberal in Tom> what you accept. In this context that would probably mean fixing Tom> aclitem-related functions to accept both 0-dimensional and Tom> 1-dimensional-0-length inputs. Tom> (Actually, is there a reason to be picky about the input Tom> dimensionality at all, rather than just iterating over whatever Tom> the array contents are?) Currently there is this: #define ACL_NUM(ACL)(ARR_DIMS(ACL)[0]) which is obviously wrong for dimensions other than exactly 1. I don't _think_ there's any great obstacle to fixing this; the only code that would care about number of dimensions would be allocacl, and since there'd be no obvious reason to preserve the shape of an aclitem[] anywhere, that could just do 0 or 1 dimensions. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Ongoing issues with representation of empty arrays
The distinction between the standard representation of '{}' as an array with zero dimensions and nonstandard representations as a 1-dimensional array with zero elements has come up in a couple of contexts on the IRC channel recently. First is contrib/intarray, _AGAIN_ (see past bugs such as #7730): select array_dims(('{1,2}'::integer[] & '{3}')); array_dims [1:0] (1 row) regression=# select ('{1,2}'::integer[] & '{3}') = '{}'; ?column? -- f (1 row) Worse, the fact that the fix for #7730 (commit c155f654) only did a very partial job means that it's now inconsistent: regression=# select (a - b), (a & c), (a - b) = (a & c) from (values (array[1,2],array[1,2],array[3])) v(a,b,c); ?column? | ?column? | ?column? --+--+-- {} | {} | f (1 row) I plan to fix this one properly, unless anyone has any objections. Second is aclitem[], past bug #8395 which was not really resolved; empty ACLs are actually 1-dim arrays of length 0, and all the ACL functions insist on that, which means that you can't call aclexplode('{}') for example: https://www.postgresql.org/message-id/flat/CA%2BTgmoZdDpTJDUVsgzRhoCctidUqLDyO8bdYwgLD5p8DwHtMcQ%40mail.gmail.com It's much less clear what to do about this one. Thoughts? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Planner crash on foreign table join
> "Thomas" == Thomas Munrowrites: >> SomeType *x = (SomeType *) lfirst(l); >> >> (in my code I tend to omit the (SomeType *), which I dislike because >> it adds no real protection) Thomas> Just BTW, without that cast it's not compilable as C++, so I'm Thomas> guessing that Peter E will finish up putting it back in Thomas> wherever you leave it out... There's north of 150 other examples (just grep for '= lfirst' in the source). Some were even committed by Peter E :-) In the discussion with Andres the same point came up for palloc, for which I suggested we add something along the lines of: #define palloc_object(_type_) (_type_ *) palloc(sizeof(_type_)) #define palloc_array(_type_, n) (_type_ *) palloc((n) * sizeof(_type_)) palloc() without a cast is even more common than lfirst() without one, and something like half of those (and 80%+ of the pallocs that do have a cast) are palloc(sizeof(...)) or palloc(something * sizeof(...)). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Malformed Array Literal in PL/pgSQL Exception Block
> "David" == David E Wheelerwrites: >> If you change this to EXCEPTION WHEN division_by_zero THEN, the >> reported error becomes: >> >> ERROR: malformed array literal: "foo" >> LINE 1: SELECT things || 'foo' David> So the issue stands, yes? Tom's response has the explanation of why it fails (everywhere, not just in the exception block): parse analysis prefers to match the (array || array) form of the operator when given input of (array || unknown). Just cast the 'foo' to the array element type. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Malformed Array Literal in PL/pgSQL Exception Block
> "David" == David E Wheelerwrites: David> And it works great, including in PL/pgSQL functions, except in David> an exception block. When I run this: David> BEGIN; David> CREATE OR REPLACE FUNCTION foo( David> ) RETURNS BOOLEAN IMMUTABLE LANGUAGE PLPGSQL AS $$ David> DECLARE David> things TEXT[] := '{}'; David> BEGIN David> things := things || 'foo'; David> RAISE division_by_zero; This "raise" statement is not reached, because the previous line raises the "malformed array literal" error. David> EXCEPTION WHEN OTHERS THEN If you change this to EXCEPTION WHEN division_by_zero THEN, the reported error becomes: ERROR: malformed array literal: "foo" LINE 1: SELECT things || 'foo' -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Planner crash on foreign table join
> "Tom" == Tom Lanewrites: Tom> Experimentation shows that actually, the standard regression tests Tom> provide dozens of opportunities for find_relation_from_clauses to Tom> fail on non-RestrictInfo input. However, it lacks any IsA check, In a discussion with Andres on the hash grouping sets review thread, I proposed that we should have something of the form #define lfirst_node(_type_, l) (castNode(_type_,lfirst(l))) to replace the current idiom of foreach(l, blah) { SomeType *x = (SomeType *) lfirst(l); (in my code I tend to omit the (SomeType *), which I dislike because it adds no real protection) by foreach(l, blah) { SomeType *x = lfirst_node(SomeType, l); in order to get that IsA check in there in a convenient way. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Planner crash on foreign table join
> "Andreas" == Andreas Seltenreichwrites: Andreas> Hi, Andreas> testing master at f0e44021df with a loopback postgres_fdw Andreas> installed, I see lots of crashes on queries joining foreign Andreas> tables with various expressions. Below is a reduced recipe Andreas> for the regression database and a backtrace. Commit ac2b095088 assumes that clauselist_selectivity is being passed a list of RelOptInfo, but postgres_fdw is passing it a list of bare clauses. One of them is wrong :-) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Instead of DROP function use UPDATE pg_proc in an upgrade extension script
> "Vicky" == Vicky Vergarawrites: Vicky> UPDATE pg_proc SET [...] Vicky> So, I want to know how "safe" can you consider the second Vicky> method, and what kind of other objects do I need to test besides Vicky> views. Speaking from personal experience (I did this in the upgrade script for ip4r, in a much simpler case than yours, and broke it badly), it's not at all safe. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unable to build doc on latest head
> "Peter" == Peter Eisentrautwrites: > On 4/3/17 02:44, Ashutosh Bapat wrote: >> [1] says that id.attribute is supported in stylesheets version >> 1.77.1. Do I need to update stylesheets version? How do I do it? >> Any help will be appreciated. Peter> The oldest version among the ones listed at Peter> http://docbook.sourceforge.net/release/xsl/ that works for me is Peter> 1.77.0. Which one do you have? BTW, the version in freebsd ports is 1.76.1, and the download stats on docbook.sourceforge.net suggests that this one is still pretty widely used despite its age. I filed a PR on the port, but who knows when anything will happen. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Andres" == Andres Freundwrites: Andres> a) cast result of lfirst/lnext/whatnot. Again, what we need here is something like #define lfirst_node(_type_, l) (castNode(_type_, lfirst(l))) etc. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Andres" == Andres Freundwrites: Andres> We usually cast the result of palloc. >> Rough count in the backend has ~400 without casts to ~1350 with, so >> this doesn't seem to have been consistently enforced. Andres> Yea, but we're still trying. Well, a lot of the uncasted ones are in fairly new code, from quite a number of different committers. So if this is a big deal, why don't we already have #define palloc_array(etype,ecount) (((etype) *) palloc((ecount) * sizeof(etype))) #define palloc_object(otype) (((otype) *) palloc(sizeof(otype))) or something of that ilk? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Mark" == Mark Dilgerwrites: Mark> Is there a performance test case where this patch should shine Mark> brightest? I'd like to load a schema with lots of data, and run Mark> a grouping sets query, both before and after applying the patch, Mark> to see what the performance advantage is. The area which I think is most important for performance is the handling of small cubes; without this patch, a 2d cube needs 2 full sorts, a 3d one needs 3, and a 4d one needs 6. In many real-world data sets these would all hash entirely in memory. So here's a very simple example (deliberately using integers for grouping to minimize the advantage; grouping by text columns in a non-C locale would show a much greater speedup for the patch): create table sales ( id serial, product_id integer, store_id integer, customer_id integer, qty integer); -- random integer function create function d(integer) returns integer language sql as $f$ select floor(random()*$1)::integer + 1; $f$; -- 10 million random rows insert into sales (product_id,store_id,customer_id,qty) select d(20), d(6), d(10), d(100) from generate_series(1,1000); -- example 2d cube: select product_id, store_id, count(*), sum(qty) from sales group by cube(product_id, store_id); -- example 3d cube: select product_id, store_id, customer_id, count(*), sum(qty) from sales group by cube(product_id, store_id, customer_id); -- example 4d cube with a computed column: select product_id, store_id, customer_id, (qty / 10), count(*), sum(qty) from sales group by cube(product_id, store_id, customer_id, (qty / 10)); On my machine, the 2d cube is about 3.6 seconds with the patch, and about 8 seconds without it; the 4d is about 18 seconds with the patch and about 32 seconds without it (all with work_mem=1GB, compiled with -O2 and assertions off). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Mark" == Mark Dilgerwrites: Mark> You define DiscreteKnapsack to take integer weights and double Mark> values, and perform the usual Dynamic Programming algorithm to Mark> solve. But the only place you call this, you pass in NULL for Mark> the values, indicating that all the values are equal. I'm Mark> confused why in this degenerate case you bother with the DP at Mark> all. Can't you just load the knapsack from lightest to heaviest Mark> after sorting, reducing the runtime complexity to O(nlogn)? It's Mark> been a long day. Sorry if I am missing something obvious. That's actually a very good question. (Though in passing I should point out that the runtime cost is going to be negligible in all practical cases. Even an 8-way cube (256 grouping sets) has only 70 rollups, and a 4-way cube has only 6; the limit of 4096 grouping sets was only chosen because it was a nice round number that was larger than comparable limits in other database products. Other stuff in the grouping sets code has worse time bounds; the bipartite-match code used to minimize the number of rollups is potentially O(n^2.5) in the number of grouping sets.) Thinking about it, it's not at all obvious what we should be optimizing for here in the absence of accurate sort costs. Right now what matters is saving as many sort steps as possible, since that's a saving likely to be many orders of magnitude greater than the differences between two sorts. The one heuristic that might be useful is to prefer the smaller estimated size if other factors are equal, just on memory usage grounds, but even that seems a bit tenuous. I'm inclined to leave things as they are in the current patch, and maybe revisit it during beta if we get any relevant feedback from people actually using grouping sets? Mark> I'm guessing you intend to extend the code at some later date to Mark> have different values for items. Is that right? Well, as I mentioned in a reply to Andres, we probably should eventually optimize for greatest reduction in cost, and the code as it stands would allow that to be added easily, but that would require having more accurate cost info than we have now. cost_sort doesn't take into consideration the number or types of sort columns or the costs of their comparison functions, so all sorts of the same input data are costed equally. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Andres" == Andres Freundwrites: >> - Assert(newphase == 0 || newphase == aggstate->current_phase + 1); >> + Assert(newphase <= 1 || newphase == aggstate->current_phase + 1); Andres> I think this somewhere in the file header needs an expanded Andres> explanation about what these "special" phases mean. I could move most of the block comment for ExecInitAgg up into the file header; would that be better? Andres> I've not yet read up on the thread, or the whole patch, but an Andres> explanation of the memory management for all those tables would Andres> be good somewhere (maybe I've just not read that far). I can expand on that. Andres> "mixed types" sounds a bit weird. changing to "if there are both types present" >> + values = palloc((1 + max_weight) * sizeof(double)); >> + sets = palloc((1 + max_weight) * sizeof(Bitmapset *)); Andres> We usually cast the result of palloc. Rough count in the backend has ~400 without casts to ~1350 with, so this doesn't seem to have been consistently enforced. >> +static void >> +_outGroupingSetData(StringInfo str, const GroupingSetData *node) >> +{ >> + WRITE_NODE_TYPE("GSDATA"); >> + >> + WRITE_NODE_FIELD(set); >> + WRITE_FLOAT_FIELD(numGroups, "%.0f"); >> +} Andres> .0f? Copied from every other node that uses "%.0f" to write out a numGroups value. >> +static grouping_sets_data * >> +preprocess_grouping_sets(PlannerInfo *root) >> +{ Andres> It'd not hurt to add a comment as to what this function is Andres> actually doing. Sure. Andres> I hope there's tests for both these branches. A number of tests for both unsortable and unhashable cases are in the regression test. >> + /* >> +* Is it hashable? We pretend empty sets are hashable even >> though we >> +* actually force them not to be hashed later. But don't bother >> if >> +* there's nothing but empty sets. >> +*/ Andres> Why? If there's no non-empty grouping set then there's nothing to hash (the hash code assumes at least one column). I'll put that in the comment. >> @@ -3214,6 +3407,11 @@ get_number_of_groups(PlannerInfo *root, >> * estimate_hashagg_tablesize >> * estimate the number of bytes that a hash aggregate hashtable will >> * require based on the agg_costs, path width and dNumGroups. >> + * >> + * XXX this may be over-estimating the size now that hashagg knows to omit >> + * unneeded columns from the hashtable. Also for mixed-mode grouping sets, >> + * grouping columns not in the hashed set are counted here even though >> hashagg >> + * won't store them. Is this a problem? >> */ Andres> Hm. This seems like it could move us to use sorting, even if Andres> not actually warranted. This is largely a pre-existing issue that this patch doesn't try and fix. That would be an issue for a separate patch, I think. I merely added the comment to point it out. Andres> What's the newline after the comment about? Old style habits die hard. >> + RelOptInfo *grouped_rel, >> + Path *path, >> + bool is_sorted, >> + bool can_hash, >> + PathTarget *target, >> + grouping_sets_data *gd, >> + const AggClauseCosts >> *agg_costs, >> + double dNumGroups) >> +{ >> + Query *parse = root->parse; >> + >> + /* >> +* If we're not being offered sorted input, then only consider plans >> that >> +* can be done entirely by hashing. >> +* >> +* We can hash everything if it looks like it'll fit in work_mem. But if >> +* the input is actually sorted despite not being advertised as such, we >> +* prefer to make use of that in order to use less memory. >> +* If none of the grouping sets are sortable, then ignore the work_mem >> +* limit and generate a path anyway, since otherwise we'll just fail. >> +*/ >> + if (!is_sorted) >> + { >> + List *new_rollups = NIL; >> + RollupData *unhashable_rollup = NULL; >> + List *sets_data; >> + List *empty_sets_data = NIL; >> + List *empty_sets = NIL; >> + ListCell *lc; >> + ListCell *l_start = list_head(gd->rollups); >> + AggStrategy strat = AGG_HASHED; >> + Sizehashsize; >> + double exclude_groups = 0.0; >> + >> + Assert(can_hash); >> + >> + if (pathkeys_contained_in(root->group_pathkeys, path->pathkeys)) >> + { >> + unhashable_rollup = lfirst(l_start);
Re: [HACKERS] Hash support for grouping sets
> "Andres" == Andres Freundwrites: Andres> Changes to advance_aggregates() are, in my experience, quite Andres> likely to have performance effects. This needs some Andres> performance tests. [...] Andres> Looks like it could all be noise, but it seems worthwhile to Andres> look into some. Trying to sort out signal from noise when dealing with performance impacts of no more than a few percent is _extremely hard_ these days. Remember this, from a couple of years back? http://tinyurl.com/op9qg8a That's a graph of performance results from tests where the only change being made was adding padding bytes to a function that was never called during the test. Performance differences on the order of 3% were being introduced by doing nothing more than changing the actual memory locations of functions. My latest round of tests on this patch shows a similar effect. Here's one representative test timing (a select count(*) with no grouping sets involved): master: 5727ms gsets_hash: 5949ms gsets_hash + 500 padding bytes: 5680ms Since the effect of padding is going to vary over time as other, unrelated, patches happen, I think I can safely claim that the performance of the patch at least overlaps the noise range of the performance of current code. To get a more definitive result, it would be necessary to run at least some dozens of tests, with different padding sizes, and determine whether the average changes detectably between the two versions. I will go ahead and do this, out of sheer curiosity if nothing else, but the preliminary results suggest there's probably nothing worth holding up the patch for. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
[snip] This thread seems to have gone quiet - is it time for me to just go ahead and commit the thing anyway? Anyone else want to weigh in? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Composite IS NULL again, this time with plpgsql
This came up recently on irc: create type t1 as (a integer, b integer); create type t2 as (p t1, q t1); create function null_t2() returns t2 language sql as $f$ select null::t2; $f$; Now consider the following plpgsql: declare v t2; begin v := null_t2(); raise info 'v is null = %', v is null; -- shows 'f' end; The value of the plpgsql variable v always behaves as if it had been assigned row(row(null,null),row(null,null)) -- which, as we thrashed out in detail sometime last year, isn't the same as row(null,null) or just null and isn't considered null by IS NULL. So there's no non-ugly way to preserve the nullity or otherwise of the function result. The best I could come up with in answer to the original problem (how to detect in plpgsql whether a composite value returned by a function was actually null) was this: declare v t2; v_null boolean; begin select x into v from null_t2() as x where not (x is null); v_null := not found; raise info 'v is null = %', v_null; end; which lacks a certain obviousness (and you have to remember to use not (x is null) rather than x is not null). This obviously happens because plpgsql is storing the variable as an expanded list of fields rather than as a single composite datum, and rebuilding the datum value when it needs to be passed to SQL for evaluation. Without an "isnull" flag for each composite subvalue, this can't regenerate the original datum closely enough to give the same value on an isnull test. What to do about this? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Mark" == Mark Dilgerwrites: Mark> Hi Andrew, Mark> Reviewing the patch a bit more, I find it hard to understand the Mark> comment about passing -1 as a flag for finalize_aggregates. Any Mark> chance you can spend a bit more time word-smithing that code Mark> comment? Actually, ignore that prior response, I'll refactor it a bit to remove the need for the comment at all. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Mark" == Mark Dilgerwrites: Mark> Hi Andrew, Mark> Reviewing the patch a bit more, I find it hard to understand the Mark> comment about passing -1 as a flag for finalize_aggregates. Any Mark> chance you can spend a bit more time word-smithing that code Mark> comment? Sure. How does this look for wording (I'll incorporate it into an updated patch later): /* * Compute the final value of all aggregates for one group. * * This function handles only one grouping set at a time. But in the hash * case, it's the caller's responsibility to have selected the set already, and * we pass in -1 as currentSet here to flag that; this also changes how we * handle the indexing into AggStatePerGroup as explained below. * * Results are stored in the output econtext aggvalues/aggnulls. */ static void finalize_aggregates(AggState *aggstate, AggStatePerAgg peraggs, AggStatePerGroup pergroup, int currentSet) { ExprContext *econtext = aggstate->ss.ps.ps_ExprContext; Datum *aggvalues = econtext->ecxt_aggvalues; bool *aggnulls = econtext->ecxt_aggnulls; int aggno; int transno; /* * If currentSet >= 0, then we're doing sorted grouping, and pergroup is an * array of size numTrans*numSets which we have to index into using * currentSet in addition to transno. The caller may not have selected the * set, so we do that. * * If currentSet < 0, then we're doing hashed grouping, and pergroup is an * array of only numTrans items (since for hashed grouping, each grouping * set is in a separate hashtable). We rely on the caller having done * select_current_set, and we fudge currentSet to 0 in order to make the * same indexing calculations work as for the grouping case. */ if (currentSet >= 0) select_current_set(aggstate, currentSet, false); else currentSet = 0; -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Mark" == Mark Dilgerwrites: Mark> On linux/gcc the patch generates a warning in nodeAgg.c that is Mark> fairly easy to fix. Using -Werror to make catching the error Mark> easier, I get: what gcc version is this exactly? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash support for grouping sets
> "Thom" == Thom Brownwrites: Thom> This doesn't apply cleanly to latest master. Could you please Thom> post a rebased patch? Sure. -- Andrew (irc:RhodiumToad) diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index c9e0a3e..480a07e 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -996,6 +996,10 @@ ExplainNode(PlanState *planstate, List *ancestors, pname = "HashAggregate"; strategy = "Hashed"; break; + case AGG_MIXED: + pname = "MixedAggregate"; + strategy = "Mixed"; + break; default: pname = "Aggregate ???"; strategy = "???"; @@ -1924,6 +1928,19 @@ show_grouping_set_keys(PlanState *planstate, ListCell *lc; List *gsets = aggnode->groupingSets; AttrNumber *keycols = aggnode->grpColIdx; + const char *keyname; + const char *keysetname; + + if (aggnode->aggstrategy == AGG_HASHED || aggnode->aggstrategy == AGG_MIXED) + { + keyname = "Hash Key"; + keysetname = "Hash Keys"; + } + else + { + keyname = "Group Key"; + keysetname = "Group Keys"; + } ExplainOpenGroup("Grouping Set", NULL, true, es); @@ -1938,7 +1955,7 @@ show_grouping_set_keys(PlanState *planstate, es->indent++; } - ExplainOpenGroup("Group Keys", "Group Keys", false, es); + ExplainOpenGroup(keysetname, keysetname, false, es); foreach(lc, gsets) { @@ -1962,12 +1979,12 @@ show_grouping_set_keys(PlanState *planstate, } if (!result && es->format == EXPLAIN_FORMAT_TEXT) - ExplainPropertyText("Group Key", "()", es); + ExplainPropertyText(keyname, "()", es); else - ExplainPropertyListNested("Group Key", result, es); + ExplainPropertyListNested(keyname, result, es); } - ExplainCloseGroup("Group Keys", "Group Keys", false, es); + ExplainCloseGroup(keysetname, keysetname, false, es); if (sortnode && es->format == EXPLAIN_FORMAT_TEXT) es->indent--; diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index aa08152..f059560 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -122,12 +122,19 @@ * specific). * * Where more complex grouping sets are used, we break them down into - * "phases", where each phase has a different sort order. During each - * phase but the last, the input tuples are additionally stored in a - * tuplesort which is keyed to the next phase's sort order; during each - * phase but the first, the input tuples are drawn from the previously - * sorted data. (The sorting of the data for the first phase is handled by - * the planner, as it might be satisfied by underlying nodes.) + * "phases", where each phase has a different sort order (except phase 0 + * which is reserved for hashing). During each phase but the last, the + * input tuples are additionally stored in a tuplesort which is keyed to the + * next phase's sort order; during each phase but the first, the input + * tuples are drawn from the previously sorted data. (The sorting of the + * data for the first phase is handled by the planner, as it might be + * satisfied by underlying nodes.) + * + * Hashing can be mixed with sorted grouping. To do this, we have an + * AGG_MIXED strategy that populates the hashtables during the first sorted + * phase, and switches to reading them out after completing all sort phases. + * We can also support AGG_HASHED with multiple hash tables and no sorting + * at all. * * From the perspective of aggregate transition and final functions, the * only issue regarding grouping sets is this: a single call site (flinfo) @@ -139,8 +146,6 @@ * sensitive to the grouping set for which the aggregate function is * currently being called. * - * TODO: AGG_HASHED doesn't support multiple grouping sets yet. - * * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * @@ -432,6 +437,7 @@ typedef struct AggStatePerGroupData */ typedef struct AggStatePerPhaseData { + AggStrategy aggstrategy; /* strategy for this phase */ int numsets; /* number of grouping sets (or 0) */ int *gset_lengths; /* lengths of grouping sets */ Bitmapset **grouped_cols; /* column groupings for rollup */ @@ -440,7 +446,31 @@ typedef struct AggStatePerPhaseData Sort *sortnode; /* Sort node for input ordering for phase */ } AggStatePerPhaseData; +/* + * AggStatePerHashData - per-hashtable state + * + * When doing grouping sets with hashing, we have one of these for each + * grouping set. (When doing hashing without grouping sets, we have just one of + * them.) + */ + +typedef struct AggStatePerHashData +{ + TupleHashTable hashtable; /* hash table with one entry per group */ + TupleHashIterator hashiter; /* for iterating through hash table */ + TupleTableSlot *hashslot; /* slot for loading hash table */ + FmgrInfo *hashfunctions; /*
Re: [HACKERS] smallint out of range EXECUTEing prepared statement
> "Justin" == Justin Pryzbywrites: Justin> Is this expected behavior ? Justin> ts=# SELECT * FROM t WHERE site_id=32768 LIMIT 1; Justin> (0 rows) Justin> ts=# PREPARE x AS SELECT * FROM t WHERE site_id=$1 LIMIT 1; Justin> PREPARE Justin> ts=# EXECUTE x(32768); Justin> ERROR: smallint out of range If column "site_id" is of type smallint, then parse analysis will deduce a type of smallint for $1, which is otherwise of unknown type. So the prepared statement "x" then has one parameter of type smallint. Passing 32768 for that parameter therefore fails with the expected error. Justin> ts=# PREPARE y AS SELECT * FROM t WHERE site_id::int=$1 LIMIT 1; Justin> PREPARE Now $1 is of type integer, not smallint, because parse analysis sees (integer = unknown) and deduces the type from that. (a better way would be WHERE site_id = $1::integer, which would allow index usage on site_id, unlike your example) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Couple of issues with prepared FETCH commands
> "Robert" == Robert Haaswrites: >> But the problem that actually came up is this: if you do the >> PQprepare before the named cursor has actually been opened, then >> everything works _up until_ the first event, such as a change to >> search_path, that forces a revalidation; and at that point it fails >> with the "must not change result type" error _even if_ the cursor >> always has exactly the same result type. This happens because the >> initial prepare actually stored NULL for plansource->resultDesc, >> since the cursor name wasn't found, while on the revalidate, when >> the cursor obviously does exist, it gets the actual result type. >> >> It seems a bit of a "gotcha" to have it fail in this case when the >> result type isn't actually being checked in other cases. Robert> To me, that sounds like a bug. So what's the appropriate fix? My suggestion would be to suppress the result type check entirely for utility statements; EXPLAIN and SHOW always return the same thing anyway, and both FETCH and EXECUTE are subject to the issue described. This would mean conceding that the result descriptor of a prepared FETCH or EXECUTE might change (i.e. a Describe of the statement might not be useful, though a Describe of an opened portal would be ok). I think this would result in the most obviously correct behavior from the client point of view. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence data type
> "Daniel" == Daniel Veritewrites: Daniel> Consider the case of a table with a SERIAL column which later Daniel> has to become a BIGINT due to growth. Currently a user would Daniel> just alter the column's type and does need to do anything with Daniel> the sequence. Daniel> With the patch, it becomes a problem because Daniel> - ALTER SEQUENCE seqname MAXVALUE new_value Daniel> will fail because new_value is beyond the range of INT4. Daniel> - ALTER SEQUENCE seqname TYPE BIGINT Daniel> does not exist (yet?) Daniel> - DROP SEQUENCE seqname (with the idea of recreating the Daniel> sequence immediately after) will be rejected because the table Daniel> depends on the sequence. Daniel> What should a user do to upgrade the SERIAL column? Something along the lines of: begin; alter table tablename alter column id drop default; alter sequence tablename_id_seq owned by none; create sequence tablename_id_seq2 as bigint owned by tablename.id; select setval('tablename_id_seq2', last_value, is_called) from tablename_id_seq; drop sequence tablename_id_seq; alter table tablename alter column id type bigint; alter table tablename alter column id set default nextval('tablename_id_seq2'); commit; Not impossible, but not at all obvious and quite involved. (And -1 for this feature unless this issue is addressed.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Couple of issues with prepared FETCH commands
(This came up on IRC, but I'm not sure to what extent it should be considered a "bug") If you do PQprepare(conn, "myfetch", "FETCH ALL FROM mycursor", ...); then the results are unpredictable in two ways: Firstly, nothing causes the plancache entry to be revalidated just because "mycursor" got opened with a different query, so the result type can change between uses. This could be considered a "caveat user" case, though, and I can't find anything that actually breaks. But the problem that actually came up is this: if you do the PQprepare before the named cursor has actually been opened, then everything works _up until_ the first event, such as a change to search_path, that forces a revalidation; and at that point it fails with the "must not change result type" error _even if_ the cursor always has exactly the same result type. This happens because the initial prepare actually stored NULL for plansource->resultDesc, since the cursor name wasn't found, while on the revalidate, when the cursor obviously does exist, it gets the actual result type. It seems a bit of a "gotcha" to have it fail in this case when the result type isn't actually being checked in other cases. (In the reported case, search_path was actually changing due to the creation of a temp table, so there was a certain amount of spooky-action-at-a-distance to figure out in order to locate the problem.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Teaching query_planner to handle multiple sort orders?
> "Tom" == Tom Lanewrites: >> Of course there is one good solution, which is to have query_planner >> take a set of acceptable output sort orders rather than just a >> single one. >> How wild an idea is this? Tom> It's been on my to-do list for years, see e.g. Tom> https://postgr.es/m/5702.1480896...@sss.pgh.pa.us Tom> The idea wouldn't have been very helpful before the upper planner Tom> got path-ified, but now the only stumbling block is a shortage of Tom> round tuits. I'll take a shot at it, then. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Teaching query_planner to handle multiple sort orders?
So in the grouping sets patch post, I said: >> There is one current weakness which I don't see a good solution for: >> the planner code still has to pick a single value for group_pathkeys >> before planning the input path. This means that we sometimes can't >> choose a minimal set of sorts, because we may have chosen a sort >> order for a grouping set that should have been hashed, Of course there is one good solution, which is to have query_planner take a set of acceptable output sort orders rather than just a single one. How wild an idea is this? I guess a possible hazard is an increase in the number of possible mergejoin paths to consider. What other possible issues are there? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Hash support for grouping sets
Herewith a patch for doing grouping sets via hashing or mixed hashing and sorting. The principal objective is to pick whatever combination of grouping sets has an estimated size that fits in work_mem, and minimizes the number of sorting passes we need to do over the data, and hash those. (Yes, this is a knapsack problem.) This is achieved by adding another strategy to the Agg node; AGG_MIXED means that both hashing and (sorted or plain) grouping happens. In addition, support for multiple hash tables in AGG_HASHED mode is added. Sample plans look like this: explain select two,ten from onek group by cube(two,ten); QUERY PLAN -- MixedAggregate (cost=0.00..89.33 rows=33 width=8) Hash Key: two, ten Hash Key: two Hash Key: ten Group Key: () -> Seq Scan on onek (cost=0.00..79.00 rows=1000 width=8) explain select two,ten from onek group by two, cube(ten,twenty); QUERY PLAN --- HashAggregate (cost=86.50..100.62 rows=162 width=12) Hash Key: two, ten, twenty Hash Key: two, ten Hash Key: two Hash Key: two, twenty -> Seq Scan on onek (cost=0.00..79.00 rows=1000 width=12) set work_mem='64kB'; explain select count(*) from tenk1 group by grouping sets (unique1,thousand,hundred,ten,two); QUERY PLAN MixedAggregate (cost=1535.39..3023.89 rows=2 width=28) Hash Key: two Hash Key: ten Hash Key: hundred Group Key: unique1 Sort Key: thousand Group Key: thousand -> Sort (cost=1535.39..1560.39 rows=1 width=20) Sort Key: unique1 -> Seq Scan on tenk1 (cost=0.00..458.00 rows=1 width=20) (10 rows) Columns with unhashable or unsortable data types are handled appropriately, though obviously every individual grouping set must end up containing compatible column types. There is one current weakness which I don't see a good solution for: the planner code still has to pick a single value for group_pathkeys before planning the input path. This means that we sometimes can't choose a minimal set of sorts, because we may have chosen a sort order for a grouping set that should have been hashed, for example: explain select count(*) from tenk1 group by grouping sets (two,unique1,thousand,hundred,ten); QUERY PLAN MixedAggregate (cost=1535.39..4126.28 rows=2 width=28) Hash Key: ten Hash Key: hundred Group Key: two Sort Key: unique1 Group Key: unique1 Sort Key: thousand Group Key: thousand -> Sort (cost=1535.39..1560.39 rows=1 width=20) Sort Key: two -> Seq Scan on tenk1 (cost=0.00..458.00 rows=1 width=20) (3 sorts, vs. 2 in the previous example that listed the same grouping sets in different order) Current status of the work: probably still needs cleanup, maybe some better regression tests, and Andres has expressed some concern over the performance impact of additional complexity in advance_aggregates; it would be useful if this could be tested. But it should be reviewable as-is. -- Andrew (irc:RhodiumToad) diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index c762fb0..6c787ea 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -972,6 +972,10 @@ ExplainNode(PlanState *planstate, List *ancestors, pname = "HashAggregate"; strategy = "Hashed"; break; + case AGG_MIXED: + pname = "MixedAggregate"; + strategy = "Mixed"; + break; default: pname = "Aggregate ???"; strategy = "???"; @@ -1900,6 +1904,19 @@ show_grouping_set_keys(PlanState *planstate, ListCell *lc; List *gsets = aggnode->groupingSets; AttrNumber *keycols = aggnode->grpColIdx; + const char *keyname; + const char *keysetname; + + if (aggnode->aggstrategy == AGG_HASHED || aggnode->aggstrategy == AGG_MIXED) + { + keyname = "Hash Key"; + keysetname = "Hash Keys"; + } + else + { + keyname = "Group Key"; + keysetname = "Group Keys"; + } ExplainOpenGroup("Grouping Set", NULL, true, es); @@ -1914,7 +1931,7 @@ show_grouping_set_keys(PlanState *planstate, es->indent++; } - ExplainOpenGroup("Group Keys", "Group Keys", false, es); + ExplainOpenGroup(keysetname, keysetname, false, es); foreach(lc, gsets) { @@ -1938,12 +1955,12 @@ show_grouping_set_keys(PlanState *planstate, } if (!result && es->format == EXPLAIN_FORMAT_TEXT) - ExplainPropertyText("Group Key", "()", es); + ExplainPropertyText(keyname, "()", es); else - ExplainPropertyListNested("Group Key", result, es); +
Re: [HACKERS] Strange result with LATERAL query
> "Tom" == Tom Lanewrites: Tom> Hm, I was just working on inserting something of the sort into Tom> ExecInitAgg. But I guess we could do it in the planner too. Will Tom> run with your approach. Tom> I think it's a bit too stupid as-is, though. We don't need to Tom> recalculate for Params in aggdirectargs, do we? In theory we would need to. But in practice we don't, because we don't allow ordered aggs in AGG_HASHED mode anyway. We could skip filling in aggParam at all if not in AGG_HASHED mode I guess. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange result with LATERAL query
> "Pavel" == Pavel Stehulewrites: Pavel> The result should not depend on GUC - hashagg on/off changing Pavel> output - it is error. I don't think anyone's suggesting leaving it unfixed, just whether the fix should introduce unnecessary rescans of the aggregate input. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange result with LATERAL query
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: >>>>> "Tom" == Tom Lane <t...@sss.pgh.pa.us> writes: Tom> I'm not sure if it's worth trying to distinguish whether the Param Tom> is inside any aggregate calls or not. How about: -- Andrew (irc:RhodiumToad) diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index 1ec2515..4a9fefb 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -3426,10 +3426,11 @@ ExecReScanAgg(AggState *node) /* * If we do have the hash table and the subplan does not have any - * parameter changes, then we can just rescan the existing hash table; - * no need to build it again. + * parameter changes, we might still need to rebuild the hash if any of + * the parameter changes affect the input to aggregate functions. */ - if (outerPlan->chgParam == NULL) + if (outerPlan->chgParam == NULL + && !bms_overlap(node->ss.ps.chgParam, aggnode->aggParam)) { ResetTupleHashIterator(node->hashtable, >hashiter); return; diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index c7a0644..7b5eb86 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -871,6 +871,7 @@ _copyAgg(const Agg *from) COPY_SCALAR_FIELD(aggstrategy); COPY_SCALAR_FIELD(aggsplit); COPY_SCALAR_FIELD(numCols); + COPY_BITMAPSET_FIELD(aggParam); if (from->numCols > 0) { COPY_POINTER_FIELD(grpColIdx, from->numCols * sizeof(AttrNumber)); diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 1fab807..5e48edd 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -706,6 +706,7 @@ _outAgg(StringInfo str, const Agg *node) WRITE_ENUM_FIELD(aggstrategy, AggStrategy); WRITE_ENUM_FIELD(aggsplit, AggSplit); WRITE_INT_FIELD(numCols); + WRITE_BITMAPSET_FIELD(aggParam); appendStringInfoString(str, " :grpColIdx"); for (i = 0; i < node->numCols; i++) diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index c83063e..67dcf17 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -2004,6 +2004,7 @@ _readAgg(void) READ_ENUM_FIELD(aggstrategy, AggStrategy); READ_ENUM_FIELD(aggsplit, AggSplit); READ_INT_FIELD(numCols); + READ_BITMAPSET_FIELD(aggParam); READ_ATTRNUMBER_ARRAY(grpColIdx, local_node->numCols); READ_OID_ARRAY(grpOperators, local_node->numCols); READ_LONG_FIELD(numGroups); diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c index a46cc10..2e56484 100644 --- a/src/backend/optimizer/plan/subselect.c +++ b/src/backend/optimizer/plan/subselect.c @@ -82,6 +82,7 @@ static Bitmapset *finalize_plan(PlannerInfo *root, Bitmapset *valid_params, Bitmapset *scan_params); static bool finalize_primnode(Node *node, finalize_primnode_context *context); +static bool finalize_agg_primnode(Node *node, finalize_primnode_context *context); /* @@ -2659,8 +2660,30 @@ finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params, ); break; - case T_Hash: case T_Agg: + { +Agg *agg = (Agg *) plan; +finalize_primnode_context aggcontext; + +/* + * We need to know which params are referenced in inputs to + * aggregate calls, so that we know whether we need to rebuild + * the hashtable in the AGG_HASHED case. Rescan the tlist and + * qual for this purpose. + */ + +aggcontext = context; +aggcontext.paramids = NULL; + +finalize_agg_primnode((Node *) agg->plan.targetlist, ); +finalize_agg_primnode((Node *) agg->plan.qual, ); + +/* remember results for execution */ +agg->aggParam = aggcontext.paramids; + } + break; + + case T_Hash: case T_Material: case T_Sort: case T_Unique: @@ -2812,6 +2835,24 @@ finalize_primnode(Node *node, finalize_primnode_context *context) } /* + * finalize_agg_primnode: add IDs of all PARAM_EXEC params appearing inside + * Aggref nodes in the given expression tree to the result set. + */ +static bool +finalize_agg_primnode(Node *node, finalize_primnode_context *context) +{ + if (node == NULL) + return false; + if (IsA(node, Aggref)) + { + finalize_primnode(node, context); + return false; /* no more to do here */ + } + return expression_tree_walker(node, finalize_agg_primnode, + (void *) context); +} + +/* * SS_make_initplan_output_param - make a Param for an initPlan's output * * The plan is expected to return a scalar value of the given type/collation. diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h index 369179f..3d5e4df 100644 --- a/src/include/nodes/plannodes.h +++ b/src/include/nodes/plannodes.h @@ -712,6 +712,7 @@ typedef struct Agg AggStrategy aggstrategy; /* basic strategy, see nodes.h */ AggSplit
Re: [HACKERS] Strange result with LATERAL query
> "Tom" == Tom Lanewrites: Tom> I'm not sure if it's worth trying to distinguish whether the Param Tom> is inside any aggregate calls or not. The existing code gets the Tom> right answer for Tom> select array(select x+sum(y) from generate_series(1,3) y group by y) Tom> from generate_series(1,3) x; Tom> and we'd be losing some efficiency for cases like that if we fix Tom> it as above. But is it worth the trouble? The loss of efficiency could be significant, since it's forcing a rescan of what could be an expensive plan. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange result with LATERAL query
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: >>>>> "Jeevan" == Jeevan Chalke <jeevan.cha...@enterprisedb.com> writes: Jeevan> Hi, Jeevan> While playing with LATERAL along with some aggregates in Jeevan> sub-query, I have observed somewhat unusual behavior. Andrew> Simpler example not needing LATERAL: Andrew> select array(select sum(x+y) from generate_series(1,3) y group by y) Andrew> from generate_series(1,3) x; Oh, and another way to verify that it's a bug and not expected behavior is that it goes away with set enable_hashagg=false; -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange result with LATERAL query
> "Jeevan" == Jeevan Chalkewrites: Jeevan> Hi, Jeevan> While playing with LATERAL along with some aggregates in Jeevan> sub-query, I have observed somewhat unusual behavior. Simpler example not needing LATERAL: select array(select sum(x+y) from generate_series(1,3) y group by y) from generate_series(1,3) x; ?column? -- {2,4,3} {2,4,3} {2,4,3} (3 rows) This is broken all the way back, it's not a recent bug. Something is wrong with the way chgParam is being handled in Agg nodes. The code in ExecReScanAgg seems to assume that if the lefttree doesn't have any parameter changes then it suffices to re-project the data from the existing hashtable; but of course this is nonsense if the parameter is in an input to an aggregate function. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SP-GiST support for inet datatypes
> "Tom" == Tom Lanewrites: > Emre Hasegeli writes: >> Attached patches add SP-GiST support to the inet datatypes. Tom> I started to look at this patch. The reported speedup is pretty Tom> nice, but ... The builtin gist support for inet seems quite surprisingly slow; ip4r beats it into the ground without difficulty. (I'd be curious how well this sp-gist version stacks up against ip4r.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] C++ port of Postgres
> "Robert" == Robert Haaswrites: Robert> Hmm, so sizeof() has different semantics in C vs. C++? No. '1' has different semantics in C vs C++. (In C, '1' is an int, whereas in C++ it's a char. It so happens that (sizeof '1') is the only case which is valid in both C and C++ where this makes a difference.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EXLCUDE constraints and Hash indexes
> "Jeff" == Jeff Janeswrites: Jeff> From: https://www.postgresql.org/docs/9.4/static/sql-createtable.html Jeff> "The access method must support amgettuple (see Chapter 55); at Jeff> present this means GIN cannot be used. Although it's allowed, there is Jeff> little point in using B-tree or hash indexes with an exclusion Jeff> constraint, because this does nothing that an ordinary unique Jeff> constraint doesn't do better. So in practice the access method will Jeff> always be GiST or SP-GiST." I also recently found a case where using btree exclusion constraints was useful: a unique index on an expression can't be marked deferrable, but the equivalent exclusion constraint can be. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
Latest patch. Names and scopes are as per discussion. New files for code and regression test. Docs included. -- Andrew (irc:RhodiumToad) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 7830334..4552a74 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -16290,6 +16290,18 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); +pg_indexam_has_property + + + +pg_index_column_has_property + + + +pg_index_has_property + + + pg_options_to_table @@ -16477,6 +16489,21 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); number of columns, pretty-printing is implied + pg_indexam_has_property(am_oid, prop_name) + boolean + Test whether an index access method has a specified property + + + pg_index_column_has_property(index_oid, column_no, prop_name) + boolean + Test whether an index column has a specified property + + + pg_index_has_property(index_oid, prop_name) + boolean + Test whether the access method for the specified index has a specified property + + pg_options_to_table(reloptions) setof record get the set of storage option name/value pairs @@ -16620,6 +16647,141 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); + pg_indexam_has_property, + pg_index_has_property, and + pg_index_column_has_property return whether the + specified access method, index, or index column possesses the named + property. NULL is returned if the property name is not + known; true if the property is present, + false if it is not. Refer to +for column properties, +for index properties, and +for access method properties. + + + + Index Column Properties + + + NameDescription + + + + asc + Does the column sort in ascending order on a forward scan? + + + + desc + Does the column sort in descending order on a forward scan? + + + + nulls_first + Does the column sort with nulls first on a forward scan? + + + + nulls_last + Does the column sort with nulls last on a forward scan? + + + + orderable + Does the column possess any ordering properties such + as ASC or DESC + + + distance_orderable + Can the column be returned in order by a "distance" operator, + for example ORDER BY col <-> constant + + + returnable + Can the column value be returned in an index-only scan? + + + + search_array + Does the column support array queries with ANY + natively in the index AM? + + + search_nulls + Does the column support IS NULL or + IS NOT NULL conditions in the index? + + + + + + + + Index Properties + + + NameDescription + + + + clusterable + Can this index be used in a CLUSTER operation? + + + + backward_scan + Can this index be scanned in the reverse direction? + + + + index_scan + Does this index support plain (non-bitmap) scans? + + + + bitmap_scan + Does this index support bitmap scans? + + + + + + + + Index Access Method Properties + + + NameDescription + + + + can_order + Does this access method support ASC, + DESC and related keywords on columns in + CREATE INDEX? + + + + can_unique + Does this access method support + CREATE UNIQUE INDEX? + + + + can_multi_col + Does this access method support multiple columns? + + + + can_exclude + Does this access method support exclusion constraints? + + + + + + + pg_options_to_table returns the set of storage option name/value pairs (option_name/option_value) when passed diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c index e8034b9..fece954 100644 --- a/src/backend/access/gist/gist.c +++ b/src/backend/access/gist/gist.c @@ -16,11 +16,14 @@ #include "access/gist_private.h" #include "access/gistscan.h" +#include "access/htup_details.h" #include "catalog/pg_collation.h" +#include "catalog/pg_opclass.h" #include "miscadmin.h" #include "utils/index_selfuncs.h" #include "utils/memutils.h" #include "utils/rel.h" +#include "utils/syscache.h" /* non-export function prototypes */ @@ -87,6 +90,7 @@ gisthandler(PG_FUNCTION_ARGS) amroutine->amendscan = gistendscan; amroutine->ammarkpos = NULL; amroutine->amrestrpos = NULL; + amroutine->amproperty = gistproperty; PG_RETURN_POINTER(amroutine); } @@ -1490,6 +1494,98 @@ freeGISTstate(GISTSTATE *giststate) MemoryContextDelete(giststate->scanCxt); } + +/* + * gistproperty()
Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)
> "Greg" == Greg Starkwrites: >> No, because as the pages split, they fill more slowly (because there >> are now more pages). So on average in a large randomly filled index, >> pages spend more time nearer 50% full than 100% full. This is easy >> to demonstrate by creating a table with an indexed float8 column and >> adding batches of random() values to it, checking with pgstatindex >> at intervals - the average leaf density will rarely exceed 70%. >> >> However, worst case conditions can give lower leaf densities; >> obviously the worst case is if the data is loaded in an order and >> quantity that just happens to leave every leaf page recently split. Greg> btree pages don't split 50/50 either. They split biased to assume Greg> the greater side of the split will receive more inserts -- iirc Greg> 70/30. Hmm? The code in _bt_findsplitloc and _bt_checksplitloc doesn't seem to agree with this. (Inserting on the high leaf page is a special case, which is where the fillfactor logic kicks in; that's why sequentially filled indexes are (by default) 90% full rather than 100%. But other pages split into roughly equal halves.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Tom" == Tom Lanewrites: Tom> But we need to be clear in the documentation about what this Tom> property actually means. My objection to having it answer at the Tom> index or column level is basically that that encourages confusion Tom> as to what it means. OK. Here's new output with the scope changes, and some of the names changed in an attempt to make them clearer: cap | AM | Index | Column ++---+ asc|| | t desc || | f nulls_first|| | f nulls_last || | t orderable || | t distance_orderable || | f returnable || | t search_array || | t search_nulls || | t clusterable|| t | backward_scan || t | index_scan || t | bitmap_scan|| t | can_order | t | | can_unique | t | | can_multi_col | t | | can_exclude| t | | (17 rows) (The can_* names are reserved for the AM level where we're asking about the abstract capabilities of the AM.) Better? Worse? Any more improvements to the names? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Tom" == Tom Lanewrites: >> This table shows what properties are exposed at the AM-wide level, >> the per-index level and the per-column level. Tom> +1 mostly, but I'm a bit bemused by can_order and can_backward Tom> having different scopes --- how come? That's where they were in the previous list, a couple of messages up in the thread... Currently can_backward is always the same for all indexes in the same AM, but I guess there's no guarantee that won't change. In the previous message you suggested it might have to be per-column, even, but I think that makes no sense (either the whole index is scannable in both directions or it is not, it can't be different per column). Tom> Also, not sure about allowing things like can_multi_col as index Tom> or column properties. That seems a bit silly: whether or not the Tom> AM can do multi columns, a specific index is what it is. I'd be a Tom> bit inclined to have those return null except at the AM level. I thought it would be cleaner to be able to query all properties at the most specific level. Tom> In particular, I'm not sure what you intend to mean by applying Tom> can_unique at the index or column level. Is that supposed to mean Tom> that the index or column *is* unique? No. (We could add properties like is_unique, is_exclusion which clients currently query in pg_index; should we?) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Tom" == Tom Lanewrites: >> distance_orderable now returns true/false depending on the opclass, >> not just on the amcanorderbyop field. In order to do this, I've >> added an optional amproperty function to the AM api, which if it >> exists, gets first dibs on all property calls so it can override the >> result as it sees fit. Tom> Hmm, seems like for that case, it'd be easier to look into pg_amop Tom> and see if the opclass has any suitably-marked operators. I thought about that, but it seemed like it could get painful. The planner is working forwards from a known operator and matching it against the index column, whereas we'd have to work backwards from the opfamily, and there's no good existing index for this; in the presence of binary-compatible types, I don't think even amoplefttype can be assumed (e.g. a varchar column can be ordered by pg_trgm's <-> operator which is declared for text). So it'd have to be the equivalent of: get index column's opclass oid look it up in pg_opclass to get opfamily for r in select * from pg_amop where amopfamily=? and amoppurpose='o' if r.amoplefttype is binary-coercible from the index column's type then return true As opposed to what I have now, which is: get index column's opclass oid look it up in pg_opclass to get opfamily/opcintype result = SearchSysCacheExists4(AMPROCNUM, ...) (in theory this could produce a false positive if there's a distance function but no actual operators to reference it, but I think that's the opclass author's issue) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)
> "Jeff" == Jeff Janeswrites: Jeff> But shouldn't that still leave us with a 75% full index, rather Jeff> than slightly over 50% full? Average is usually about 67%-70%. (For capacity estimation I always assume 66% for a non-sequentially-filled btree.) Jeff> The leaf pages start at 50%, grow to 100%, then split back to Jeff> 50%, then grow back to 100%. So the average should be about 75%. No, because as the pages split, they fill more slowly (because there are now more pages). So on average in a large randomly filled index, pages spend more time nearer 50% full than 100% full. This is easy to demonstrate by creating a table with an indexed float8 column and adding batches of random() values to it, checking with pgstatindex at intervals - the average leaf density will rarely exceed 70%. However, worst case conditions can give lower leaf densities; obviously the worst case is if the data is loaded in an order and quantity that just happens to leave every leaf page recently split. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
So I'm tidying up and doing docs for the next version of this patch, but here for comment is the current functionality: select cap, pg_indexam_has_property(a.oid, cap) as "AM", pg_index_has_property('onek_hundred'::regclass, cap) as "Index", pg_index_column_has_property('onek_hundred'::regclass, 1, cap) as "Column" from pg_am a, unnest(array['asc', 'desc', 'nulls_first', 'nulls_last', 'orderable', 'distance_orderable', 'can_order', 'can_unique', 'can_multi_col', 'can_exclude', 'can_backward', 'can_cluster', 'index_scan', 'bitmap_scan', 'can_return', 'search_array', 'search_nulls']) with ordinality as u(cap,ord) where a.amname='btree' order by ord; cap | AM | Index | Column ++---+ asc|| | t desc || | f nulls_first|| | f nulls_last || | t orderable || | t distance_orderable || | f can_order | t | t | t can_unique | t | t | t can_multi_col | t | t | t can_exclude| t | t | t can_backward || t | t can_cluster|| t | t index_scan || t | t bitmap_scan|| t | t can_return || | t search_array || | t search_nulls || | t (17 rows) This table shows what properties are exposed at the AM-wide level, the per-index level and the per-column level. distance_orderable now returns true/false depending on the opclass, not just on the amcanorderbyop field. In order to do this, I've added an optional amproperty function to the AM api, which if it exists, gets first dibs on all property calls so it can override the result as it sees fit. can_return likewise reflects the result of index_can_return. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Why is box <-> point missing, and box <-> box not indexable?
point <-> point, circle <-> point and polygon <-> point all exist as orderable-by-operator operators (in fact they are the only ones by default). But there's no box <-> point operator at all, and no index support for box <-> box. Was this intentional, or just a strange oversight? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Tom" == Tom Lanewrites: >> - this still has everything in amapi.c rather than creating any new >> files. Also, the regression tests are in create_index.sql for lack >> of any obviously better place. Tom> This more than doubles the size of amapi.c, so it has a definite Tom> feel of tail-wags-dog for me, even if that seemed like an Tom> appropriate place otherwise which it doesn't really. I think a Tom> new .c file under adt/ is the way to go, with extern declarations Tom> in builtins.h. Yeah, I'm fine with that. adt/amutils.c unless I see some better suggestion. Tom> Maybe we need a new regression test case file too. I don't much Tom> like adding this to create_index because (a) it doesn't Tom> particularly seem to match that file's purpose of setting up Tom> indexes on the standard regression test tables, and (b) that file Tom> is a bottleneck in parallel regression runs because it can't run Tom> in parallel with much else. Good point. I looked around to see if anything was testing pg_get_indexdef, thinking that that would be a good place, but it seems that pg_get_indexdef is tested only in incidental ways (in collate and rules, the latter of which tests it only with invalid input). I'll do the next version with a new file, unless a better idea shows up. >> Comments? Tom> Why did you go with "capability" rather than "property" in the Tom> exposed function names? The latter seems much closer to le mot Tom> juste, especially for things like asc/desc. The first version (which dealt only with AMs) went with "capability" because it was dealing with what the AM _could_ do rather than what was defined on any specific index. The second version added pg_index_column_has_property because that was the name that had been used in discussion. Changing them all to "property" would be more consistent I suppose. Tom> I'd personally cut the list of pg_am replacement properties way Tom> back, as I believe much of what you've got there is not actually Tom> of use to applications, and some of it is outright Tom> counterproductive. An example is exposing amcanreturn as an Tom> index-AM boolean. For AM-wide properties, it may be that they have to be considered "lossy" when tested against the AM oid rather than on an individual index or column - at the AM level, "false" might mean "this won't work" while "true" would mean "this might work sometimes, not guaranteed to work on every index". The documentation should probably indicate this. So these properties (I've changed all the names here, suggestions welcome for better ones) I think should be testable on the AM, each with an example of why: can_order - if this is false, an admin tool shouldn't try and put ASC or DESC in a CREATE INDEX can_unique - if this is false, an admin tool might, for example, want to not offer the user the option of CREATE UNIQUE INDEX with this AM can_multi_col - if this is false, an admin tool might want to allow the user to select only one column can_exclude - a new property that indicates whether the AM can be used for exclusion constraints; at present this matches "amgettuple" but that implementation detail should of course be hidden (One possible refinement here could be to invert the sense of all of these, making them no_whatever, so that "false" and "null" could be treated the same by clients. Or would that be too confusing?) These could be limited to being testable only on a specified index, and not AM-wide: can_backward clusterable index_scan bitmap_scan optional_key (? maybe) predicate_locks (? maybe) And these for individual columns: can_return search_array (? maybe) search_nulls (? maybe) operator_orderable (or distance_orderable? what's a good name?) orderable asc desc nulls_first nulls_last A question: implementing can_return as a per-column property looks like it requires actually opening the index rel, rather than just consulting the syscache the way that most pg_get_* functions do. Should it always open it, or only for properties that need it? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
Updated patch. Changes: - returns NULL rather than "cache lookup failed" - added pg_index_column_has_property (incl. docs) - added regression tests Not changed / need consideration: - this still has everything in amapi.c rather than creating any new files. Also, the regression tests are in create_index.sql for lack of any obviously better place. The list of column properties is: ordered - (same as "amcanorder" AM capability) ordered_asc ordered_desc ordered_nulls_first ordered_nulls_last If "ordered" is true then exactly one of _asc/_desc and exactly one of _nulls_first/_last will be true; if "ordered" is false then all the others will be false too. The intended usage is something like CASE WHEN pg_index_column_has_property(idx, attno, 'ordered_asc') THEN 'ASC' WHEN pg_index_column_has_property(idx, attno, 'ordered_desc') THEN 'DESC' ELSE '' -- or NULL END Comments? -- Andrew (irc:RhodiumToad) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index ccb9b97..684f7b3 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -577,6 +577,89 @@ + +Capability information formerly stored in pg_am +is now available via the functions +pg_indexam_has_capability and +pg_index_has_capability +(see ). The following +boolean-valued capability names are currently supported: + + + + Capabilities + + + + + Name + Description + + + + + amcanorder + Does the access method support ordered scans sorted by the + indexed column's value? + + + amcanorderbyop + Does the access method support ordered scans sorted by the result + of an operator on the indexed column? + + + amcanbackward + Does the access method support backward scanning? + + + amcanunique + Does the access method support unique indexes? + + + amcanmulticol + Does the access method support multicolumn indexes? + + + amoptionalkey + Does the access method support a scan without any constraint + for the first index column? + + + amsearcharray + Does the access method support ScalarArrayOpExpr searches? + + + amsearchnulls + Does the access method support IS NULL/NOT NULL searches? + + + amstorage + Can index storage data type differ from column data type? + + + amclusterable + Can an index of this type be clustered on? + + + ampredlocks + Does an index of this type manage fine-grained predicate locks? + + + amgettuple + Does the access method provide an amgettuple function? + + + amgetbitmap + Does the access method provide an amgetbitmap function? + + + amcanreturn + Does the access method support index-only scans? + + + + + diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 7830334..d2fe506 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -16290,6 +16290,18 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); +pg_indexam_has_capability + + + +pg_index_column_has_property + + + +pg_index_has_capability + + + pg_options_to_table @@ -16477,6 +16489,21 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); number of columns, pretty-printing is implied + pg_indexam_has_capability(am_oid, cap_name) + boolean + Test whether an index access method has a specified capability + + + pg_index_column_has_property(index_oid, column_no, prop_name) + boolean + Test whether an index column has a specified property + + + pg_index_has_capability(index_oid, cap_name) + boolean + Test whether the access method for the specified index has a specified capability + + pg_options_to_table(reloptions) setof record get the set of storage option name/value pairs @@ -16620,6 +16647,73 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); + pg_indexam_has_capability and + pg_index_has_capability return whether the specified + access method, or the access method for the specified index, advertises the + named capability. NULL is returned if the capability + name is not known; true if the capability is advertised, + false if it is not. Refer + to for capability names and their meanings. + + + + pg_index_column_has_property returns whether the + specified index column possesses the named property. + NULL is returned if the property name is not + known; true if the property is present, + false if it is not. Index column property names and the + matching clauses of CREATE INDEX are given in + . + + + + Index Column
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Alvaro" == Alvaro Herrerawrites: >> One idea is utils/adt/misc.c. Or we could make a new file under >> utils/adt/ though I'm not very sure what to name it. amaccess.c? >> catutils.c? If there's only ever likely to be one or two functions >> of this ilk, maybe a new file is overkill and we should just use >> misc.c. Alvaro> I like the idea of a new file; I have a hunch that it will Alvaro> grow, given that we're expanding in this area, and perhaps we Alvaro> can find some existing stuff to relocate there in the future. Alvaro> I don't think a small file is a problem, anyway. Alvaro> How about amfuncs.c? Maybe it can live in catalog/ instead of Alvaro> utils/adt? Well, the existing patch used access/index/amapi.c for the AM capability functions. There may be some merit in keeping everything together - I asked because it didn't seem at first glance that the index column property function belonged there, but on second thought there's some overlap in that in future, if indoptions ever acquires any AM-specific flags, it may be necessary for pg_index_column_has_property to call into an AM-specific function. So, here are some options: 1. Put everything in access/index/amapi.c 2. Move either all of access/index/amapi.c, or just the SQL-callable part of it (amvalidate), to utils/adt/amfuncs.c and put new stuff in there 3. put pg_index[am]_has_capability in access/index/amapi.c and pg_index_column_has_property in utils/adt/misc.c -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Kevin" == Kevin Grittnerwrites: >>> Building on the has-property approach Andrew suggested, I wonder if >>> we need something like pg_index_column_has_property(indexoid, colno, >>> propertyname) with properties like "sortable", "desc", "nulls first". >> >> This seems simple enough, on the surface. Why not run with this design? Kevin> Andrew's patch, plus this, covers everything I can think of. Where'd be a good place to put that function? ruleutils? catalog/index.c ? (ruleutils is way too big already) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oddity with NOT IN
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: Andrew> The easy to catch case, I think, is when the targetlist of the Andrew> IN or NOT IN subquery contains vars of the outer query level Andrew> but no vars of the inner one and no volatile functions. This Andrew> can be checked for with a handful of lines in the parser or a Andrew> couple of dozen lines in a plugin module (though one would have Andrew> to invent an error code, none of the existing WARNING sqlstates Andrew> would do). Actually thinking about this, if you did it in a module, you'd probably want to make it an ERROR not a WARNING, because you'd want to actually stop queries like delete from t1 where x in (select x from table_with_no_column_x); rather than let them run. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oddity with NOT IN
> "Pavel" == Pavel Stehulewrites: >> Well now I feel dumb... >> >> It would be very useful if we had some way to warn users about stuff >> like this. Emitting a NOTICE comes to mind. Pavel> This can be valid query It can be, but it essentially never is. The cases where you genuinely want a correlated IN query are rare, but even then there would be something in the targetlist that referenced the inner query. The easy to catch case, I think, is when the targetlist of the IN or NOT IN subquery contains vars of the outer query level but no vars of the inner one and no volatile functions. This can be checked for with a handful of lines in the parser or a couple of dozen lines in a plugin module (though one would have to invent an error code, none of the existing WARNING sqlstates would do). Maybe David Fetter's suggested module for catching missing WHERE clauses could be expanded into a more general SQL-'Lint' module? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Bruce" == Bruce Momjianwrites: Bruce> Would it be helpful to output an array of strings representing Bruce> the index definition? >> Why would that help, if the point is to enable programmatic access >> to information? Bruce> I was thinking an array of strings would avoid problems in Bruce> having to re-scan the output for tokens. OK, but that still leaves the issue of how to interpret each string, yes? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Crash in GetOldestSnapshot()
> "Amit" == Amit Kapilawrites: Amit> Sure, that is the reason of crash, but even if we do that it will Amit> lead to an error "no known snapshots". Here, what is going on is Amit> that we initialized toast snapshot when there is no active Amit> snapshot in the backend, so GetOldestSnapshot() won't return any Amit> snapshot. Hmm. So this happens because RETURNING queries run to completion immediately and populate a tuplestore with the results, and the portal then fetches from the tuplestore to send to the destination. The assumption is that the tuplestore output can be processed without needing a snapshot, which obviously is not true now if it contains toasted data. In a similar case in the past involving holdable cursors, the solution was to detoast _before_ storing in the tuplestore (see PersistHoldablePortal). I guess the question now is, under what circumstances is it now allowable to detoast a datum with no active snapshot? (Wouldn't it be necessary in such a case to know what the oldest snapshot ever used in the transaction was?) Amit> I think for such situations, we need to initialize the lsn and Amit> whenTaken of ToastSnapshot as we do in GetSnapshotData() [1]. Would that not give a too-recent LSN, resulting in possibly failing to fetch the toast rows? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Crash in GetOldestSnapshot()
> "Andreas" == Andreas Seltenreichwrites: 418 if (OldestActiveSnapshot != NULL) 419 ActiveLSN = OldestActiveSnapshot->as_snap->lsn; 420 421 if (XLogRecPtrIsInvalid(RegisteredLSN) || RegisteredLSN > ActiveLSN) 422 return OldestActiveSnapshot->as_snap; This second conditional should clearly be inside the first one... -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Bruce" == Bruce Momjianwrites: >> As far as I understood Andrew's use case, he was specifically *not* >> interested in a complete representation of an index definition, but >> rather about whether it had certain properties that would be of >> interest to query-constructing applications. Well, I wouldn't limit it to query-constructing applications. I'll give another random example that I thought of. Suppose an administrative GUI (I have no idea if any of the existing GUIs do this) has an option to do CLUSTER on a table; how should it know which indexes to offer the user to cluster on, without access to amclusterable? Bruce> Would it be helpful to output an array of strings representing Bruce> the index definition? Why would that help, if the point is to enable programmatic access to information? Anyway, what I haven't seen in this thread is any implementable counter-proposal other than the "just hardcode the name 'btree'" response that was given in the JDBC thread, which I don't consider acceptable in any sense. Is 9.6 going to go out like this or is action going to be taken before rc1? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bogus ANALYZE results for an otherwise-unique column with many nulls
> "Tom" == Tom Lanewrites: Tom> Also, the way that the value is calculated in the Tom> samples-not-all-distinct case corresponds to the way I have it in Tom> the patch. Ahh, gotcha. You're referring to this: /* * If we estimated the number of distinct values at more than 10% of * the total row count (a very arbitrary limit), then assume that * stadistinct should scale with the row count rather than be a fixed * value. */ if (stats->stadistinct > 0.1 * totalrows) stats->stadistinct = -(stats->stadistinct / totalrows); where "totalrows" includes nulls obviously. So this expects negative stadistinct to be scaled by the total table size, and the all-distinct case should do the same. Objection withdrawn. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bogus ANALYZE results for an otherwise-unique column with many nulls
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: >>>>> "Tom" == Tom Lane <t...@sss.pgh.pa.us> writes: Tom> What I did in the patch is to scale the formerly fixed "-1.0" Tom> stadistinct estimate to discount the fraction of nulls we found. Andrew> This seems quite dubious to me. stadistinct representing only Andrew> the non-null values seems to me to be substantially more useful Andrew> and less confusing; it should be up to consumers to take Andrew> stanullfrac into account (in general they already do) since in Andrew> many cases we explicitly do _not_ want to count nulls. Hm. I am wrong about this, since it's the fact that consumers are taking stanullfrac into account that makes the value wrong in the first place. For example, if a million-row table has stanullfrac=0.9 and stadistinct=-1, then get_variable_numdistinct is returning 1 million, and (for example) var_eq_non_const divides 0.1 by that to give a selectivity of 1 in 10 million, which is obviously wrong. But I think the fix is still wrong, because it changes the meaning of ALTER TABLE ... ALTER col SET (n_distinct=...) in a non-useful way; it is no longer possible to nail down a useful negative n_distinct value if the null fraction of the column is variable. Would it not make more sense to do the adjustment in get_variable_numdistinct, instead? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bogus ANALYZE results for an otherwise-unique column with many nulls
> "Tom" == Tom Lanewrites: Tom> What I did in the patch is to scale the formerly fixed "-1.0" Tom> stadistinct estimate to discount the fraction of nulls we found. This seems quite dubious to me. stadistinct representing only the non-null values seems to me to be substantially more useful and less confusing; it should be up to consumers to take stanullfrac into account (in general they already do) since in many cases we explicitly do _not_ want to count nulls. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Wanting to learn about pgsql design decision
> "Tom" == Tom Lanewrites: >> - Why to read from a table, both a usage permission on the schema >> and a read access permission on the table is needed? Tom> Because the SQL standard says so. You'd think, but in fact it doesn't; the spec (at least 2008 and the 2011 drafts) has no concept of grantable permissions on schemas, and ties table ownership and schema ownership together. (See the definition of to see that there's nothing there for schemas, and the definition of for the fact that it's the schema owner who also owns the table and gets the initial grants on it, and and to confirm that only the schema owner can alter or drop the table. The access rules for only require permission on a table column, no mention of schemas.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] copyParamList
> "Robert" == Robert Haaswrites: Robert> So I think we instead ought to fix it as in the attached. Robert>if (retval->paramMask != NULL && Robert> - !bms_is_member(i, retval->paramMask)) Robert> + !bms_is_member(i, from->paramMask)) Need to change both references to retval->paramMask, not just one. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
And a doc patch to go with it: -- Andrew (irc:RhodiumToad) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 0689cc9..3e13e38 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -577,6 +577,89 @@ + +Capability information formerly stored in pg_am +is now available via the functions +pg_indexam_has_capability and +pg_index_has_capability +(see ). The following +boolean-valued capability names are currently supported: + + + + Capabilities + + + + + Name + Description + + + + + amcanorder + Does the access method support ordered scans sorted by the + indexed column's value? + + + amcanorderbyop + Does the access method support ordered scans sorted by the result + of an operator on the indexed column? + + + amcanbackward + Does the access method support backward scanning? + + + amcanunique + Does the access method support unique indexes? + + + amcanmulticol + Does the access method support multicolumn indexes? + + + amoptionalkey + Does the access method support a scan without any constraint + for the first index column? + + + amsearcharray + Does the access method support ScalarArrayOpExpr searches? + + + amsearchnulls + Does the access method support IS NULL/NOT NULL searches? + + + amstorage + Can index storage data type differ from column data type? + + + amclusterable + Can an index of this type be clustered on? + + + ampredlocks + Does an index of this type manage fine-grained predicate locks? + + + amgettuple + Does the access method provide an amgettuple function? + + + amgetbitmap + Does the access method provide an amgetbitmap function? + + + amcanreturn + Does the access method support index-only scans? + + + + + diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index baef80d..fd6b983 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -16193,6 +16193,14 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); +pg_indexam_has_capability + + + +pg_index_has_capability + + + pg_options_to_table @@ -16380,6 +16388,16 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); number of columns, pretty-printing is implied + pg_indexam_has_capability(am_oid, cap_name) + boolean + Test whether an index access method has a specified capability + + + pg_index_has_capability(index_oid, cap_name) + boolean + Test whether the access method for the specified index has a specified capability + + pg_options_to_table(reloptions) setof record get the set of storage option name/value pairs @@ -16523,6 +16541,16 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); + pg_indexam_has_capability and + pg_index_has_capability return whether the specified + access method, or the access method for the specified index, advertises the + named capability. NULL is returned if the capability + name is not known; true if the capability is advertised, + false if it is not. Refer + to for capability names and their meanings. + + + pg_options_to_table returns the set of storage option name/value pairs (option_name/option_value) when passed -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Tom" == Tom Lanewrites: Tom> Andrew still hasn't shown a concrete example of what he needs to Tom> do and why. The issue I ran into was the exact same one as in the JDBC thread I linked to earlier: correctly interpreting pg_index.indoption (to get the ASC / DESC and NULLS FIRST/LAST settings), which requires knowing whether amcanorder is true to determine whether to look at the bits at all. The guy I was helping was using an earlier pg version, so it didn't affect him (yet); I hit it when trying to test the query on 9.6. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
Here is my proposed code (first cut; obviously it needs docs too). Opinions? -- Andrew (irc:RhodiumToad) diff --git a/src/backend/access/index/amapi.c b/src/backend/access/index/amapi.c index d347ebc..3e7e084 100644 --- a/src/backend/access/index/amapi.c +++ b/src/backend/access/index/amapi.c @@ -16,7 +16,9 @@ #include "access/amapi.h" #include "access/htup_details.h" #include "catalog/pg_am.h" +#include "catalog/pg_class.h" #include "catalog/pg_opclass.h" +#include "utils/elog.h" #include "utils/syscache.h" @@ -119,3 +121,95 @@ amvalidate(PG_FUNCTION_ARGS) PG_RETURN_BOOL(result); } + + +/* + * Test capabilities of an index or index AM. + */ +static Datum +indexam_capability(FunctionCallInfo fcinfo, + Oid amoid, char *nameptr, int namelen) +{ + IndexAmRoutine *routine = NULL; + bool ret; + + if (namelen < 1 || namelen > 14) + PG_RETURN_NULL(); + + routine = GetIndexAmRoutineByAmId(amoid); + + if (namelen == 10 && memcmp(nameptr, "amcanorder", 10) == 0) + ret = (routine->amcanorder) ? true : false; + else if (namelen == 14 && memcmp(nameptr, "amcanorderbyop", 14) == 0) + ret = (routine->amcanorderbyop) ? true : false; + else if (namelen == 13 && memcmp(nameptr, "amcanbackward", 13) == 0) + ret = (routine->amcanbackward) ? true : false; + else if (namelen == 11 && memcmp(nameptr, "amcanunique", 11) == 0) + ret = (routine->amcanunique) ? true : false; + else if (namelen == 13 && memcmp(nameptr, "amcanmulticol", 13) == 0) + ret = (routine->amcanmulticol) ? true : false; + else if (namelen == 13 && memcmp(nameptr, "amoptionalkey", 13) == 0) + ret = (routine->amoptionalkey) ? true : false; + else if (namelen == 13 && memcmp(nameptr, "amsearcharray", 13) == 0) + ret = (routine->amsearcharray) ? true : false; + else if (namelen == 13 && memcmp(nameptr, "amsearchnulls", 13) == 0) + ret = (routine->amsearchnulls) ? true : false; + else if (namelen == 9 && memcmp(nameptr, "amstorage", 9) == 0) + ret = (routine->amstorage) ? true : false; + else if (namelen == 13 && memcmp(nameptr, "amclusterable", 13) == 0) + ret = (routine->amclusterable) ? true : false; + else if (namelen == 11 && memcmp(nameptr, "ampredlocks", 11) == 0) + ret = (routine->ampredlocks) ? true : false; + else if (namelen == 11 && memcmp(nameptr, "amcanreturn", 11) == 0) + ret = (routine->amcanreturn) ? true : false; + else if (namelen == 10 && memcmp(nameptr, "amgettuple", 10) == 0) + ret = (routine->amgettuple) ? true : false; + else if (namelen == 11 && memcmp(nameptr, "amgetbitmap", 11) == 0) + ret = (routine->amgetbitmap) ? true : false; + else + PG_RETURN_NULL(); + + PG_RETURN_BOOL(ret); +} + +/* + * Test capability of an AM specified by the AM Oid. + */ +Datum +pg_indexam_has_capability(PG_FUNCTION_ARGS) +{ + Oid amoid = PG_GETARG_OID(0); + text *name = PG_GETARG_TEXT_PP(1); + char *nameptr = VARDATA_ANY(name); + int namelen = VARSIZE_ANY_EXHDR(name); + + return indexam_capability(fcinfo, amoid, nameptr, namelen); +} + +/* + * Test capability of the AM for an index specified by relation Oid. + */ +Datum +pg_index_has_capability(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + text *name = PG_GETARG_TEXT_PP(1); + char *nameptr = VARDATA_ANY(name); + int namelen = VARSIZE_ANY_EXHDR(name); + Oid amoid; + HeapTuple tuple; + Form_pg_class rd_rel; + + tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid)); + if (!HeapTupleIsValid(tuple)) + elog(ERROR, "could not find tuple for relation %u", relid); + rd_rel = (Form_pg_class) GETSTRUCT(tuple); + if (rd_rel->relkind != RELKIND_INDEX) + ereport(ERROR, +(errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg("relation %u is not an index", relid))); + amoid = rd_rel->relam; + ReleaseSysCache(tuple); + + return indexam_capability(fcinfo, amoid, nameptr, namelen); +} diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h index 35f1061..0e2f9d7 100644 --- a/src/include/access/amapi.h +++ b/src/include/access/amapi.h @@ -171,4 +171,7 @@ extern IndexAmRoutine *GetIndexAmRoutineByAmId(Oid amoid); extern Datum amvalidate(PG_FUNCTION_ARGS); +extern Datum pg_indexam_has_capability(PG_FUNCTION_ARGS); +extern Datum pg_index_has_capability(PG_FUNCTION_ARGS); + #endif /* AMAPI_H */ diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 5d233e3..4e696fa 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -564,6 +564,11 @@ DESCR("spgist index access method handler"); DATA(insert OID = 335 ( brinhandler PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 325 "2281" _null_ _null_ _null_ _null_ _null_ brinhandler _null_ _null_ _null_ )); DESCR("brin index access method handler"); +DATA(insert OID = 336 ( pg_indexam_has_capability PGNSP PGUID 12 1 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ pg_indexam_has_capability _null_ _null_ _null_ )); +DESCR("test capability of an index access method"); +DATA(insert OID = 337 ( pg_index_has_capability PGNSP PGUID
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Tom" == Tom Lanewrites: >> It could easily be exposed as a function interface of the form >> index_has_capability(oid,name) or indexam_has_capability(oid,name) >> without any initdb worries. Tom> You missed the "compelling argument why it's needed" part. What Tom> is the need for this? I'm not going to be persuaded by "it was Tom> there before". How about "it was there before, and people did use it"? In fact I notice you participated in a discussion of this a couple of months back on the JDBC list, in which your solution was to suggest hardcoding the name 'btree' into the query: https://www.postgresql.org/message-id/24504.1463237368%40sss.pgh.pa.us Doesn't that strike you as an indication that something is wrong? Tom> We've gotten along fine without such inspection functions for FDWs Tom> and tablesample methods, which are new and not especially interesting to code doing introspection Tom> so I doubt that we really need them for index AMs. People write catalog queries for indexes a whole lot more than they do for FDWs or tablesample methods. This whole discussion started because I wrote a catalog query for someone on IRC, and found I couldn't do it on 9.6 because amcanorder was gone. Tom> Nobody's writing applications that make decisions about which AM Tom> to use based on what they see in pg_am. That's not the issue. The issue is finding information about _existing_ indexes that is not otherwise exposed. Tom> Moreover, I think you are missing the point about initdb. The Tom> issue there is that anytime in future that we make a change to the Tom> AM API, we'd need to have a debate about whether and how to expose Tom> such a change for SQL inspection. Defining the exposure mechanism Tom> as a new function rather than a new view column changes neither Tom> the need for a debate, nor the need for an initdb unless we decide Tom> that we don't need to expose anything. I'm not proposing a new function for each capability. I'm proposing ONE function (or two, one starting from the index rather than the AM, for convenience). Adding more capability names would not require an initdb. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Tom" == Tom Lanewrites: >> With the gutting of pg_am in 9.6, there seems to be no longer any >> way for a query of the system catalogs to discover any of the index >> capabilities that were formerly columns in pg_am (notably >> amcanorder, amcanorderbyop, amclusterable, amsearcharray, >> amsearchnulls). >> Am I missing something or is this a significant oversight? Tom> It's absolutely not an oversight. We asked when 65c5fcd35 went in Tom> whether there was still any need for that information to be Tom> available at the SQL level, and nobody appeared to care. Perhaps you were asking the wrong people? Tom> We could in theory expose a view to show the data --- but since a Tom> large part of the point of that change was to not need initdb for Tom> AM API changes, and to not be constrained to exactly Tom> SQL-compatible representations within that API, I'm disinclined to Tom> do so without a fairly compelling argument why it's needed. It could easily be exposed as a function interface of the form index_has_capability(oid,name) or indexam_has_capability(oid,name) without any initdb worries. That would surely be better than the present condition of being completely unable to get this information from SQL. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] No longer possible to query catalogs for index capabilities?
With the gutting of pg_am in 9.6, there seems to be no longer any way for a query of the system catalogs to discover any of the index capabilities that were formerly columns in pg_am (notably amcanorder, amcanorderbyop, amclusterable, amsearcharray, amsearchnulls). Am I missing something or is this a significant oversight? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
>>>>> "Andrew" == Andrew Gierth <and...@tao11.riddles.org.uk> writes: >>> Whole-row vars when constructed never contain the null value. David> ...but what does this mean in end-user terms? Andrew> It means for example that this query: Andrew> select y from x left join y on (x.id=y.id) where y is null; Andrew> would always return 0 rows. On second thoughts I'll take this one back. Specifying that whole-row vars don't contain the null value when constructed doesn't actually result in no rows, since the construction is logically below the join; and hence even with that rule in place, the query would return a row for each non-matched "x" row. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
> "David" == David G Johnstonwrites: >> Prohibiting IS NOT NULL is not on the cards; it's very widely used. David> Yet changing how it behaves, invisibly, is? Did you mean prohibiting it only for composite-type args? It's obviously widely used for non-composite args. I would expect that >95% of cases where someone has written (x IS NOT NULL) for x being a composite type, it's actually a bug and that NOT (x IS NULL) was intended. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
> "David" == David G Johnstonwrites: >> 1. x IS NULL is true if and only if x has the null value (isnull set). David> I don't have a problem conforming to "ROW(NULL, NULL) IS NULL" David> being true...if you somehow get a hold of something in that David> form, which your others points address. This seems harmless, but I think it's not worth the pain. I'm informed that for example on Oracle: select case when foo_type(null, null) is null then 'true' else 'false' end from dual; returns 'false'. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
> "David" == David G Johnstonwrites: >> 2. x IS NOT NULL if and only if NOT (x IS NULL) David> I would rather prohibit "IS NOT NULL" altogether. If one needs David> to test "NOT (x IS NULL)" they can write it that way. Prohibiting IS NOT NULL is not on the cards; it's very widely used. >> Whole-row vars when constructed never contain the null value. David> ...but what does this mean in end-user terms? It means for example that this query: select y from x left join y on (x.id=y.id) where y is null; would always return 0 rows. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Proposal: revert behavior of IS NULL on row types
In light of the fact that it is an endless cause of bugs both in pg and potentially to applications, I propose that we cease attempting to conform to the spec's definition of IS NULL in favour of the following rules: 1. x IS NULL is true if and only if x has the null value (isnull set). 2. x IS NOT NULL if and only if NOT (x IS NULL) 3. ROW() and other row constructors never return the null value. Whole-row vars when constructed never contain the null value. 4. Columns or variables of composite type can (if not declared NOT NULL) contain the null value (isnull set) which is distinct from an all-columns-null value. 5. COALESCE(x,y) continues to return y if and only if x is the null value. (We currently violate the spec here.) (X. Optionally, consider adding new predicates: x IS ALL NULL x IS NOT ALL NULL x IS ALL NOT NULL x IS NOT ALL NOT NULL which would examine the fields of x non-recursively.) Justification: https://www.postgresql.org/message-id/4f6a90a0-c6e8-22eb-3b7a-727f8a60f3b1%40BlueTreble.com https://www.postgresql.org/message-id/20160708024746.1410.57282%40wrigleys.postgresql.org Further rationale: https://www.postgresql.org/message-id/87zip9qti4.fsf%40news-spur.riddles.org.uk -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] \timing interval
> "Gavin" == Gavin Flowerwrites: >> How about >> >> Time: 1234567.666 ms (20m 34.6s) Gavin> I like that, but I think the human form should retain the 3 Gavin> decimal places. Scale it. Time: 12.345 ms (0.012345s) Time: 1234.567 ms (1.235s) Time: 98765.432 ms (98.8s) Time: 123456.789 ms (2m 3.5s) Time: 12345678.666 ms (3h 24m 46s) Gavin> In a few years, we may well have enormously multiprocessor Gavin> computers with massive very fast permanent 'RAM' where the Gavin> entire database is always in memory, so timing to the nearest Gavin> microsecond could be useful. But the original microsecond-resolution value is still right there, so I don't see the issue. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] \timing interval
> "Tom" == Tom Lanewrites: > Peter Eisentraut writes: >> I'm not quite sure what you mean by wanting to do arithmetic on the >> numbers. My phrasing of the problem is that after a long query, you >> might get output like this: >> Time: 1234567.666 ms >> which is pretty useless. Tom> What I mean by that is that not infrequently, I'll run the same Tom> query several times and then want to average the results. That's Tom> easy with awk or similar scripts as long as the numbers are in Tom> straight decimal. Tom> I don't mind if we provide a way to print in Babylonian-inspired Tom> notation(s) as well, but I'm going to be seriously annoyed if Tom> that's the only way to get the output. How about Time: 1234567.666 ms (20m 34.6s) ? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reserved role names
> "Joshua" == Joshua D Drakewrites: Joshua> Is it intentional that insert and delete are allowed and select Joshua> is not or is it an oversight? Just an artifact of SELECT being fully reserved (needed for (SELECT ...) syntax to work right) while INSERT and DELETE are unreserved. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New design for FK-based join selectivity estimation
> "Tom" == Tom Lanewrites: > Tomas Vondra writes: >> Attached is a reworked patch, mostly following the new design proposal >> from this thread. Tom> Comments and testing appreciated. This blows up (see bug 14219 for testcase) in match_foreign_keys_to_quals on the find_base_rel call(s) in the following case: If the query was produced by rule expansion then the code that populates fkinfo includes FK references to the OLD and NEW RTEs, but those might not appear in the jointree (the testcase for the bug is a DELETE rule where NEW clearly doesn't apply) and hence build_simple_rel was not called (causing find_base_rel to fail). Not sure what the right fix is. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MultiXactId error after upgrade to 9.3.4
> "Robert" == Robert Haaswrites: >> Why is the correct rule not "check for and ignore pre-upgrade mxids >> before even trying to fetch members"? Robert> I entirely believe that's the correct rule, but doesn't Robert> implementing it require a crystal balll? Why would it? Pre-9.3 mxids are identified by the combination of flag bits in the infomask, see Alvaro's patch. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MultiXactId error after upgrade to 9.3.4
> "Alvaro" == Alvaro Herrerawrites: >> (It can, AFAICT, be inside the currently valid range due to >> wraparound, i.e. without there being a valid pg_multixact entry for >> it, because AFAICT in 9.2, once the mxid is hinted dead it is never >> again either looked up or cleared, so it can sit in the tuple xmax >> forever, even through multiple wraparounds.) Alvaro> HeapTupleSatisfiesVacuum removes very old multixacts It does nothing of the kind; it only marks them HEAP_XMAX_INVALID. The actual mxid remains in the tuple xmax field. The failing mxids in the case I analyzed on -bugs are failing _in spite of_ being already hinted HEAP_XMAX_INVALID, because the code path in question doesn't check that. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MultiXactId error after upgrade to 9.3.4
> "Alvaro" == Alvaro Herrerawrites: Alvaro> I think that was a good choice in general so that Alvaro> possibly-data-eating bugs could be reported, but there's a Alvaro> problem in the specific case of tuples carried over by Alvaro> pg_upgrade whose Multixact is "further in the future" compared Alvaro> to the nextMultiXactId counter. I think it's pretty clear that Alvaro> we should let that error be downgraded to DEBUG too, like the Alvaro> other checks. But that leaves an obvious third issue: it's all very well to ignore the pre-upgrade (pre-9.3) mxid if it's older than the cutoff or it's in the future, but what if it's actually inside the currently valid range? Looking it up as though it were a valid mxid in that case seems to be completely wrong and could introduce more subtle errors. (It can, AFAICT, be inside the currently valid range due to wraparound, i.e. without there being a valid pg_multixact entry for it, because AFAICT in 9.2, once the mxid is hinted dead it is never again either looked up or cleared, so it can sit in the tuple xmax forever, even through multiple wraparounds.) Why is the correct rule not "check for and ignore pre-upgrade mxids before even trying to fetch members"? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] regexp_match() returning text
> "Emre" == Emre Hasegeliwrites: Emre> Attached patch adds regexp_match() function which is a simple Emre> variant of regexp_matches() that doesn't return a set. We already have a function that takes a string and a regexp and returns a single text result: substring(). Regexp flags other than 'g' can also be embedded in the regexp: postgres=# select substring('foo bar' from '(?i)BA+'); substring --- ba Returning an array containing the values of all capture groups might be more useful (substring returns the value of the first capture group if any, otherwise the matched string). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
> "Dean" == Dean Rasheedwrites: Dean> That may be so, but we already support FILTER for all windows Dean> functions as well as aggregates: Not so: "If FILTER is specified, then only the input rows for which the filter_clause evaluates to true are fed to the window function; other rows are discarded. Only window functions that are aggregates accept a FILTER clause." (Per spec, FILTER appears only in the production, which is just one of the several options for .) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] copyParamList
copyParamList does not respect from->paramMask, in what looks to me like an obvious oversight: retval->paramMask = NULL; [...] /* Ignore parameters we don't need, to save cycles and space. */ if (retval->paramMask != NULL && !bms_is_member(i, retval->paramMask)) retval->paramMask is never set to anything not NULL in this function, so surely that should either be initializing it to from->paramMask, or checking from->paramMask in the conditional? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow COPY to use parameters
> "Merlin" == Merlin Moncurewrites: Merlin> Note, the biggest pain point I have with COPY is not being able Merlin> to parameterize the filename argument. Second proof of concept attached. This goes so far as to allow statements like: do $$ declare t text := 'bar'; f text := '/tmp/copytest.dat'; begin copy (select t, now()) to (f) csv header; end; $$; Also "copy foo to $1" or "copy (select * from foo where x=$1) to $2" and so on should work from PQexecParams or in a plpgsql EXECUTE. (I haven't tried to parameterize anything other than the filename and query. Also, it does not accept arbitrary expressions - only $n, '...' or a columnref. $n and '...' can have parens or not, but the columnref must have them due to conflicts with unreserved keywords PROGRAM, STDIN, STDOUT. This could be hacked around in other ways, I guess, if the parens are too ugly.) -- Andrew (irc:RhodiumToad) diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 3201476..97debb7 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -37,6 +37,8 @@ #include "optimizer/clauses.h" #include "optimizer/planner.h" #include "nodes/makefuncs.h" +#include "nodes/nodeFuncs.h" +#include "parser/analyze.h" #include "rewrite/rewriteHandler.h" #include "storage/fd.h" #include "tcop/tcopprot.h" @@ -279,13 +281,13 @@ static const char BinarySignature[11] = "PGCOPY\n\377\r\n\0"; /* non-export function prototypes */ -static CopyState BeginCopy(bool is_from, Relation rel, Node *raw_query, +static CopyState BeginCopy(bool is_from, Relation rel, Node *raw_query, ParamListInfo params, const char *queryString, const Oid queryRelId, List *attnamelist, List *options); static void EndCopy(CopyState cstate); static void ClosePipeToProgram(CopyState cstate); -static CopyState BeginCopyTo(Relation rel, Node *query, const char *queryString, - const Oid queryRelId, const char *filename, bool is_program, +static CopyState BeginCopyTo(Relation rel, Node *query, ParamListInfo params, const char *queryString, + const Oid queryRelId, Node *filename_expr, bool is_program, List *attnamelist, List *options); static void EndCopyTo(CopyState cstate); static uint64 DoCopyTo(CopyState cstate); @@ -767,6 +769,43 @@ CopyLoadRawBuf(CopyState cstate) } +static char * +CopyEvalFilename(QueryDesc *qd, Node *expr, ParamListInfo params) +{ + char *filename = NULL; + + if (expr) + { + EState *estate = qd ? qd->estate : CreateExecutorState(); + MemoryContext oldcontext = MemoryContextSwitchTo(estate->es_query_cxt); + + Assert(exprType(expr) == CSTRINGOID); + + if (qd == NULL) + estate->es_param_list_info = params; + + { + ExprContext *ecxt = CreateExprContext(estate); + ExprState *exprstate = ExecPrepareExpr(copyObject(expr), estate); + bool isnull; + Datum result = ExecEvalExprSwitchContext(exprstate, ecxt, , NULL); + if (!isnull) +filename = MemoryContextStrdup(oldcontext, DatumGetCString(result)); + FreeExprContext(ecxt, true); + } + + MemoryContextSwitchTo(oldcontext); + + if (qd == NULL) + FreeExecutorState(estate); + + if (!filename) + elog(ERROR, "COPY filename expression must not return NULL"); + } + + return filename; +} + /* * DoCopy executes the SQL COPY statement * @@ -787,7 +826,7 @@ CopyLoadRawBuf(CopyState cstate) * the table or the specifically requested columns. */ Oid -DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed) +DoCopy(const CopyStmt *stmt, const char *queryString, ParamListInfo params, uint64 *processed) { CopyState cstate; bool is_from = stmt->is_from; @@ -906,7 +945,7 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed) select->targetList = list_make1(target); select->fromClause = list_make1(from); - query = (Node *) select; + query = (Node *) parse_analyze((Node *) select, queryString, NULL, 0); /* * Close the relation for now, but keep the lock on it to prevent @@ -929,6 +968,8 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed) if (is_from) { + char *filename; + Assert(rel); /* check read-only transaction and parallel mode */ @@ -936,15 +977,20 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed) PreventCommandIfReadOnly("COPY FROM"); PreventCommandIfParallelMode("COPY FROM"); - cstate = BeginCopyFrom(rel, stmt->filename, stmt->is_program, + filename = CopyEvalFilename(NULL, stmt->filename, params); + + cstate = BeginCopyFrom(rel, filename, stmt->is_program, stmt->attlist, stmt->options); cstate->range_table = range_table; *processed = CopyFrom(cstate); /* copy from file to database */ EndCopyFrom(cstate); + + if (filename) + pfree(filename); } else { - cstate = BeginCopyTo(rel, query, queryString, relid, + cstate = BeginCopyTo(rel, query, params, queryString, relid, stmt->filename, stmt->is_program,