Re: WIP: System Versioned Temporal Table
On Thu, Nov 19, 2020 at 11:04 AM Surafel Temesgen wrote:
>
> Attached is a rebased one.
> regards
> Surafel

Thank you for your work on this! The v7 patch fails on the current master branch. Error from make:

gram.y:16695:1: error: static declaration of ‘makeAndExpr’ follows non-static declaration
 makeAndExpr(Node *lexpr, Node *rexpr, int location)
 ^~~~~~~~~~~
In file included from gram.y:58:0:
../../../src/include/nodes/makefuncs.h:108:14: note: previous declaration of ‘makeAndExpr’ was here
 extern Node *makeAndExpr(Node *lexpr, Node *rexpr, int location);
              ^~~~~~~~~~~
gram.y:16695:1: warning: ‘makeAndExpr’ defined but not used [-Wunused-function]
 makeAndExpr(Node *lexpr, Node *rexpr, int location)
 ^~~~~~~~~~~

The docs have two instances of "EndtTime" that should be "EndTime".

Ryan Lambert
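For context on this class of compiler error, here is a minimal illustration (hypothetical names, not the actual gram.y code): a file-scope definition must not be declared static when a header has already declared the same function non-static.

```c
/* As in a header such as makefuncs.h: a non-static (external) declaration. */
extern int make_and_expr(int lexpr, int rexpr);

/* Writing "static int make_and_expr(...)" for the definition below would
 * trigger: error: static declaration of 'make_and_expr' follows non-static
 * declaration.  Keeping the definition non-static, matching the header,
 * compiles cleanly. */
int
make_and_expr(int lexpr, int rexpr)
{
    return lexpr && rexpr;
}
```

The fix in the patch is presumably just dropping the duplicate helper or the stray `static`, so the definition agrees with the declaration in makefuncs.h.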
Re: WIP: System Versioned Temporal Table
On Fri, Jan 8, 2021 at 5:34 AM Simon Riggs wrote:
> On Fri, Jan 8, 2021 at 7:34 AM Simon Riggs wrote:
> >
> > On Fri, Jan 8, 2021 at 7:13 AM Simon Riggs wrote:
> >
> > > I've minimally rebased the patch to current head so that it compiles
> > > and passes current make check.
> >
> > Full version attached
>
> New version attached with improved error messages, some additional
> docs and a review of tests.

The v10 patch fails to make on the current master branch (15b824da). Error:

make[2]: Entering directory '/var/lib/postgresql/git/postgresql/src/backend/parser'
'/usr/bin/perl' ./check_keywords.pl gram.y ../../../src/include/parser/kwlist.h
/usr/bin/bison -Wno-deprecated -d -o gram.c gram.y
gram.y:3685.55-56: error: $4 of ‘ColConstraintElem’ has no declared type
 n->contype = ($4)->contype;
 ^^
gram.y:3687.56-57: error: $4 of ‘ColConstraintElem’ has no declared type
 n->raw_expr = ($4)->raw_expr;
 ^^
gram.y:3734.41-42: error: $$ of ‘generated_type’ has no declared type
 $$ = n;
 ^^
gram.y:3741.41-42: error: $$ of ‘generated_type’ has no declared type
 $$ = n;
 ^^
gram.y:3748.41-42: error: $$ of ‘generated_type’ has no declared type
 $$ = n;
 ^^
../../../src/Makefile.global:750: recipe for target 'gram.c' failed
make[2]: *** [gram.c] Error 1
make[2]: Leaving directory '/var/lib/postgresql/git/postgresql/src/backend/parser'
Makefile:137: recipe for target 'parser/gram.h' failed
make[1]: *** [parser/gram.h] Error 2
make[1]: Leaving directory '/var/lib/postgresql/git/postgresql/src/backend'
src/Makefile.global:389: recipe for target 'submake-generated-headers' failed
make: *** [submake-generated-headers] Error 2

> * UPDATE doesn't set EndTime correctly, so err... the patch doesn't
>   work on this aspect.
>   Everything else does actually work, AFAICS, so we "just" need a way to
>   update the END_TIME column in place...
>   So input from other Hackers/Committers needed on this point to see
>   what is acceptable.
>
> * Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved
>
> * No discussion, comments or tests around freezing and whether that
>   causes issues here
>
> * What happens if you ask for a future time?
>   It will give an inconsistent result as it scans, so we should refuse a
>   query for time > current_timestamp.
>
> * ALTER TABLE needs some work, it's a bit klugey at the moment and
>   needs extra tests.
>   Should refuse DROP COLUMN on a system time column
>
> * Do StartTime and EndTime show in SELECT *? Currently, yes. Would
>   guess we wouldn't want them to, not sure what standard says.

I prefer to have them hidden by default. This was mentioned up-thread with no decision; it seems the standard is ambiguous. MS SQL appears to have flip-flopped on this decision [1].

> > SELECT * shows the timestamp columns, don't we need to hide the period
> > timestamp columns from this query?
> >
> The SQL standard doesn't dictate hiding the columns, but I agree hiding them
> by default is a good thing, because these columns are used by the system
> to classify the data and are not needed on the user side frequently. I can
> change to that if we have consensus.

> * The syntax changes in gram.y probably need some corralling
>
> Overall, it's a pretty good patch and worth working on more. I will
> consider a recommendation on what to do with this.
>
> --
> Simon Riggs
> http://www.EnterpriseDB.com/

I am increasingly interested in this feature and have heard others asking for this type of functionality. I'll do my best to continue reviewing and testing.

Thanks,
Ryan Lambert

[1] https://bornsql.ca/blog/temporal-tables-hidden-columns/
Re: WIP: System Versioned Temporal Table
On Fri, Jan 8, 2021 at 11:38 AM Simon Riggs wrote:
>
> Updated v11 with additional docs and some rewording of messages/tests
> to use "system versioning" correctly.
>
> No changes on the points previously raised.
>
> --
> Simon Riggs
> http://www.EnterpriseDB.com/

Thank you! The v11 applies and installs. I tried a simple test; unfortunately, it appears the versioning is not working. The initial value is not preserved through an update and a new row does not appear to be created.

CREATE TABLE t (
    id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    v BIGINT NOT NULL
) WITH SYSTEM VERSIONING
;

Verify start/end time columns created.

t=# \d t
                                      Table "public.t"
  Column   |           Type           | Collation | Nullable |             Default
-----------+--------------------------+-----------+----------+----------------------------------
 id        | bigint                   |           | not null | generated by default as identity
 v         | bigint                   |           | not null |
 StartTime | timestamp with time zone |           | not null | generated always as row start
 EndTime   | timestamp with time zone |           | not null | generated always as row end
Indexes:
    "t_pkey" PRIMARY KEY, btree (id, "EndTime")

Add a row and check the timestamps set as expected.

INSERT INTO t (v) VALUES (1);

SELECT * FROM t;
 id | v |           StartTime           | EndTime
----+---+-------------------------------+----------
  1 | 1 | 2021-01-08 20:56:20.848097+00 | infinity

Update the row.

UPDATE t SET v = -1;

The value for v updated but StartTime is the same.
SELECT * FROM t;
 id | v  |           StartTime           | EndTime
----+----+-------------------------------+----------
  1 | -1 | 2021-01-08 20:56:20.848097+00 | infinity

Querying the table for all versions only returns the single updated row (v = -1) with the original row's StartTime. The original value has disappeared entirely, it seems.

SELECT * FROM t FOR SYSTEM_TIME FROM '-infinity' TO 'infinity';

I also created a non-versioned table and later added the columns using ALTER TABLE, and encountered the same behavior.

Ryan Lambert
Re: Wired if-statement in gen_partprune_steps_internal
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature:      not tested
Spec compliant:          not tested
Documentation:           not tested

The original patch still applies and passes make installcheck-world. An updated patch was mentioned but has not been attached. Updating status to Waiting on Author.

Cheers,

--
Ryan Lambert

The new status of this patch is: Waiting on Author
Re: Make Append Cost aware of some run time partition prune case
On Mon, Nov 9, 2020 at 5:44 PM Andy Fan wrote:
>
> Currently the cost model of an append path sums the cost/rows for all the
> subpaths. It usually works well until we run into the run-time partition
> prune case. The first result is that generic plans will rarely be used in
> some cases. For instance, SELECT * FROM p WHERE pkey = $1; The custom plan
> will only count the cost of one partition, but the generic plan will count
> the cost of all the partitions, even though we are sure that only 1
> partition will survive. Another impact is that the planner may choose a
> wrong plan. For example, SELECT * FROM t1, p WHERE t1.a = p.pkey; The
> cost/rows of t1 nest loop p is estimated highly improperly. This patch
> wants to help this case to some extent.

Greetings,

I was referred to this patch by Amit as a possible improvement for an issue I noticed recently. I had a test setup where I expected run-time pruning to kick in but it did not. I am trying to test this patch to see if it helps for that scenario, but ran into an error running make install against the current master (commit 0a687c8f1).

costsize.c: In function ‘cost_append’:
costsize.c:2171:32: error: ‘AppendPath’ {aka ‘struct AppendPath’} has no member named ‘partitioned_rels’
 2171 |  List *partitioned_rels = apath->partitioned_rels;
      |                                ^~
make[4]: *** [costsize.o] Error 1
make[4]: Leaving directory '/var/lib/postgresql/git/postgresql/src/backend/optimizer/path'
make[3]: *** [../../../src/backend/common.mk:39: path-recursive] Error 2
make[3]: Leaving directory '/var/lib/postgresql/git/postgresql/src/backend/optimizer'
make[2]: *** [common.mk:39: optimizer-recursive] Error 2
make[2]: Leaving directory '/var/lib/postgresql/git/postgresql/src/backend'
make[1]: *** [Makefile:42: install-backend-recurse] Error 2
make[1]: Leaving directory '/var/lib/postgresql/git/postgresql/src'
make: *** [GNUmakefile:11: install-src-recurse] Error 2

Thanks,
Ryan Lambert
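For readers wanting to reproduce the costing issue Andy describes, a minimal sketch (hypothetical table names; `plan_cache_mode` is available since PostgreSQL 12):

```sql
-- Hypothetical setup to observe generic-plan costing with run-time pruning:
CREATE TABLE p (pkey int) PARTITION BY LIST (pkey);
CREATE TABLE p1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE p2 PARTITION OF p FOR VALUES IN (2);

PREPARE q(int) AS SELECT * FROM p WHERE pkey = $1;

-- Force the generic plan; its cost currently sums all subpaths even
-- though run-time pruning will leave only one surviving partition.
SET plan_cache_mode = force_generic_plan;
EXPLAIN EXECUTE q(1);
```

Comparing the generic plan's estimated cost against the custom plan (`SET plan_cache_mode = force_custom_plan;`) shows the inflated estimate the patch aims to fix.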
Re: Wired if-statement in gen_partprune_steps_internal
On Wed, Mar 3, 2021 at 11:03 PM Amit Langote wrote:
>
> Sorry, this seems to have totally slipped my mind.
>
> Attached is the patch I had promised.
>
> Also, I have updated the title of the CF entry to "Some cosmetic
> improvements of partition pruning code", which I think is more
> appropriate.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com

Thank you. The updated patch passes installcheck-world. I ran a handful of test queries with a small number of partitions and observed the same plans before and after the patch. I cannot speak to the quality of the code, though I am happy to test any additional use cases that should be verified.

Ryan Lambert
Re: WIP: System Versioned Temporal Table
On Mon, Jan 11, 2021 at 7:02 AM Simon Riggs wrote:
>
> I'm pleased to note that UPDATE-not-working was a glitch, possibly in
> an earlier patch merge. That now works as advertised.

It is working as expected now. Thank you!

> I've added fairly clear SGML docs to explain how the current patch
> works, which should assist wider review.

The default column names changed to start_timestamp and end_timestamp. A number of places in the docs still refer to StartTime and EndTime. I prefer the new names without MixedCase.

> Also moved test SQL around a bit, renamed some things in code for
> readability, but not done any structural changes.
>
> This is looking much better now... with the TODO/issues list now
> looking like this...
>
> * Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved.
>   Probably need to add a test that end_timestamp > start_timestamp or ERROR,
>   which effectively enforces serializability.
>
> * No discussion, comments or tests around freezing and whether that
>   causes issues here
>
> * What happens if you ask for a future time?
>   It will give an inconsistent result as it scans, so we should refuse a
>   query for time > current_timestamp.
>
> * ALTER TABLE needs some work, it's a bit klugey at the moment and
>   needs extra tests.
>   Should refuse DROP COLUMN on a system time column, but currently doesn't
>
> * UPDATE foo SET start_timestamp = DEFAULT should fail but currently doesn't
>
> * Do StartTime and EndTime show in SELECT *? Currently, yes. Would
>   guess we wouldn't want them to, not sure what standard says.
>
> From here, the plan would be to set this to "Ready For Committer" in
> about a week. That is not the same thing as me saying it is
> ready-for-commit, but we need some more eyes on this patch to decide
> if it is something we want and, if so, whether the code changes are cool.

Should I invest time now into further testing with more production-like scenarios on this patch? Or would it be better to wait on putting effort into that until it has had more review? I don't have much to offer for help on your current todo list.

Thanks,
Ryan Lambert
Re: WIP: System Versioned Temporal Table
On Thu, Jan 14, 2021 at 2:22 PM Simon Riggs wrote:
> On Thu, Jan 14, 2021 at 5:46 PM Surafel Temesgen wrote:
> > On Fri, Jan 8, 2021 at 7:50 PM Ryan Lambert wrote:
> > >
> > > I prefer to have them hidden by default. This was mentioned up-thread
> > > with no decision, it seems the standard is ambiguous. MS SQL appears to
> > > have flip-flopped on this decision [1].
>
> I think the default should be like this:
>
> SELECT * FROM foo FOR SYSTEM_TIME AS OF ...
> should NOT include the Start and End timestamp columns
> because this acts like a normal query just with a different snapshot
> timestamp
>
> SELECT * FROM foo FOR SYSTEM_TIME BETWEEN x AND y
> SHOULD include the Start and End timestamp columns
> since this form of query can include multiple row versions for the
> same row, so it makes sense to see the validity times

+1

Ryan Lambert
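Under Simon's proposal, the two query forms would behave roughly like this (a sketch of the proposed behavior with hypothetical data, not what the current patch does; column names per the later patch versions):

```sql
-- Acts like a normal query at an earlier snapshot: no period columns.
SELECT * FROM foo FOR SYSTEM_TIME AS OF '2021-01-01 00:00:00+00';
--  id | v

-- Can return several versions of the same row: period columns included,
-- so the reader can tell the versions apart.
SELECT * FROM foo FOR SYSTEM_TIME BETWEEN '2021-01-01' AND '2021-02-01';
--  id | v | start_timestamp | end_timestamp
```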
Re: Wired if-statement in gen_partprune_steps_internal
Should the status of this patch be updated to Ready for Committer to get in line for the PG 14 deadline?

Ryan Lambert

On Sun, Mar 7, 2021 at 11:38 PM Amit Langote wrote:
> On Fri, Mar 5, 2021 at 7:50 AM Ryan Lambert wrote:
> >
> > Thank you. The updated patch passes installcheck-world. I ran a
> > handful of test queries with a small number of partitions and observed
> > the same plans before and after the patch. I cannot speak to the quality
> > of the code, though am happy to test any additional use cases that
> > should be verified.
>
> Thanks Ryan.
>
> There's no need to test it extensively, because no functionality is
> changed with this patch.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com
Re: pg_dump new feature: exporting functions only. Bad or good idea ?
On Sat, Mar 27, 2021 at 6:23 AM Lætitia Avrot wrote:
> Hello,
>
> You'll find enclosed the first version of my patch.

I tested a couple of simple use cases. This is great, thank you!

> I did not include the possibility of using a file to list tables to be
> exported as Tom suggested because I genuinely think it is a totally
> different matter. It does not mean I'm not open to the possibility, it
> just felt weird.
>
> The patch allows using a `--functions-only` flag in `pg_dump` to export
> only functions and stored procedures. My code was built and passed tests
> on the latest master branch of the PostgreSQL project. I added regression
> tests. Documentation has been updated too, and generation of the
> documentation (HTML, man page, PDF in A4 and US letter format) has been
> tested successfully.
>
> I did not add a warning in the documentation that the file produced might
> end up being unrestorable, or might restore broken functions or
> procedures. Do you think I should?

The docs for both the --table and --schema options do warn about this. On the other hand, --data-only has no such warning. I'd lean towards matching --data-only for this.

> I don't know if this patch has any impact on performance. I guess that
> adding 4 if statements will slow down `pg_dump` a little bit.
>
> Have a nice day,
>
> Lætitia

Using --functions-only along with --table= does not error out and warn the user; instead it creates a dump containing only the SET commands. An error similar to using --functions-only along with --data-only seems like a good idea.

Cheers,
Ryan Lambert
RustProof Labs
Re: dropdb --force
Hi Pavel,

I took a quick look through the patch; I'll try to build and test it tomorrow.

--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3145,6 +3145,7 @@ typedef struct DropdbStmt
  NodeTag type;
  char *dbname; /* database to drop */
  bool missing_ok; /* skip error if db is missing? */
+ List *options; /* currently only FORCE is supported */
 } DropdbStmt;

Why put FORCE as the single item in an options list? A bool var seems like it would be more clear and consistent.

- * DROP DATABASE [ IF EXISTS ]
+ * DROP DATABASE [ ( FORCE ) ] [ IF EXISTS ]

Why is it `[ ( FORCE ) ]` instead of `[ FORCE ]`? There are also places in the code where there seem to be extra () around FORCE. Like here:

+ appendPQExpBuffer(&sql, "DROP DATABASE %s%s%s;",
+     (force ? " (FORCE) " : ""),
+     (if_exists ? "IF EXISTS " : ""), fmtId(dbname));

And here:

+$node->safe_psql('postgres', 'CREATE DATABASE foobar2');
+$node->issues_sql_like(
+ [ 'dropdb', '--force', 'foobar2' ],
+ qr/statement: DROP DATABASE (FORCE) foobar2/,
+ 'SQL DROP DATABASE (FORCE) run');

I'll try to build and test the patch tomorrow.

Thanks,
Ryan Lambert
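The bool alternative suggested above might look like this (an illustrative sketch only, with a stubbed NodeTag so it stands alone; not the patch's actual code):

```c
#include <stdbool.h>
#include <stddef.h>

typedef int NodeTag;            /* stub standing in for PostgreSQL's NodeTag */

typedef struct DropdbStmt
{
    NodeTag  type;
    char    *dbname;            /* database to drop */
    bool     missing_ok;        /* skip error if db is missing? */
    bool     force;             /* terminate sessions first? (instead of a
                                 * single-item List *options) */
} DropdbStmt;

/* Hypothetical constructor, just to show how the flag would be set. */
static DropdbStmt
make_dropdb_stmt(char *dbname, bool missing_ok, bool force)
{
    DropdbStmt stmt;

    stmt.type = 0;
    stmt.dbname = dbname;
    stmt.missing_ok = missing_ok;
    stmt.force = force;
    return stmt;
}
```

The trade-off is that a List keeps the grammar open for future options without another struct change, while the bool is simpler as long as FORCE is the only option.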
Re: FETCH FIRST clause PERCENT option
On Thu, Sep 19, 2019 at 6:52 AM Surafel Temesgen wrote:
> Hi Tom,
> In the attached patch I include the comments given
>
> regards
> Surafel

Patch v9 applies and passes make installcheck-world.

> From: Tom Lane
> Date: 2019-09-05 22:26:29
>
> * I didn't really study the changes in nodeLimit.c, but doing
> "tuplestore_rescan" in ExecReScanLimit is surely just wrong. You
> probably want to delete and recreate the tuplestore, instead, since
> whatever data you already collected is of no further use. Maybe, in
> the case where no rescan of the child node is needed, you could re-use
> the data already collected; but that would require a bunch of additional
> logic. I'm inclined to think that v1 of the patch shouldn't concern
> itself with that sort of optimization.

I don't think this was addressed.

Ryan Lambert
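For readers following along, the feature under review adds syntax along these lines (hypothetical table; behavior per the patch, not released PostgreSQL at the time of this thread):

```sql
-- Return the first 10 percent of the ordered result, per the proposed
-- PERCENT option to the standard FETCH FIRST clause:
SELECT *
FROM items
ORDER BY id
FETCH FIRST 10 PERCENT ROWS ONLY;
```

The tuplestore question above matters because, unlike a plain row count, a percentage cannot be satisfied until the total number of rows is known, so the node buffers input before returning anything.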
Re: FETCH FIRST clause PERCENT option
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature:      tested, passed
Spec compliant:          not tested
Documentation:           not tested

The latest patch applies cleanly and passes all tests. I believe all feedback has been addressed except for the one case that can be improved upon in later work. Updating status to Ready for Committer.

Ryan Lambert

The new status of this patch is: Ready for Committer
Fix XML handling with DOCTYPE
Hi all,

I'm investigating the issue I reported here:
https://www.postgresql.org/message-id/flat/153478795159.1302.9617586466368699403%40wrigleys.postgresql.org

As Tom Lane mentioned there, the docs (8.13) indicate xmloption = CONTENT should accept all valid XML. At this time, XML with a DOCTYPE declaration is not accepted with this setting even though it is considered valid XML. I'd like to work on a patch to address this issue and make it work as advertised.

I traced the source of the error to line ~1500 in /src/backend/utils/adt/xml.c:

    res_code = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0,
                                           utf8string + count, NULL);

It looks like it is xmlParseBalancedChunkMemory from libxml that doesn't work when there's a DOCTYPE in the XML data. My assumption is the DOCTYPE element makes the XML not well-balanced. From:
http://xmlsoft.org/html/libxml-parser.html#xmlParseBalancedChunkMemory

This function returns:

> 0 if the chunk is well balanced, -1 in case of args problem and the parser
> error code otherwise

I see xmlParseBalancedChunkMemoryRecover that might provide the functionality needed. That function returns:

> 0 if the chunk is well balanced, -1 in case of args problem and the parser
> error code otherwise. In case recover is set to 1, the nodelist will not be
> empty even if the parsed chunk is not well balanced, assuming the parsing
> succeeded to some extent.

I haven't tested yet to see if this parses the data w/ DOCTYPE successfully. If it does, I don't think it would be difficult to update the check on res_code to not fail. I'm making another assumption that there is a distinct code from libxml to differentiate this from other errors, but I couldn't find those codes quickly. The current check is this:

    if (res_code != 0 || xmlerrcxt->err_occurred)

Does this sound reasonable? Have I missed some major aspect? If this is on the right track I can work on creating a patch to move this forward.

Thanks,
Ryan Lambert
RustProof Labs
www.rustprooflabs.com
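A minimal reproducer of the reported behavior (the DOCTYPE content here is only illustrative):

```sql
SET xmloption = content;

-- Valid XML per the spec, but rejected before the fix because the
-- DOCTYPE makes the chunk "not well balanced" for libxml's chunk parser:
SELECT '<!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]><greeting>hi</greeting>'::xml;
```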
Re: Fix XML handling with DOCTYPE
Thank you both! I had glanced at that item in the commitfest but didn't notice it would fix this issue. I'll try to test/review this before the end of the month; much better than starting from scratch myself. A quick glance at the patch looks logical and it looks like it should work for my use case.

Thanks,
Ryan Lambert

On Sat, Mar 16, 2019 at 4:33 PM Chapman Flack wrote:
> On 03/16/19 17:21, Tom Lane wrote:
> > Chapman Flack writes:
> >> On 03/16/19 16:55, Tom Lane wrote:
> >>> What do you think of the idea I just posted about parsing off the DOCTYPE
> >>> thing for ourselves, and not letting libxml see it?
> >
> >> The principled way of doing that would be to pre-parse to find a DOCTYPE,
> >> and if there is one, leave it there and parse the input as we do for
> >> 'document'. Per XML, if there is a DOCTYPE, the document must satisfy
> >> the 'document' syntax requirements, and per SQL/XML:2006-and-later,
> >> 'content' is a proper superset of 'document', so if we were asked for
> >> 'content' and can successfully parse it as 'document', we're good,
> >> and if we see a DOCTYPE and yet it incurs a parse error as 'document',
> >> well, that's what needed to happen.
> >
> > Hm, so, maybe just
> >
> > (1) always try to parse as document. If successful, we're done.
> >
> > (2) otherwise, if allowed by xmloption, try to parse using our
> > current logic for the CONTENT case.
>
> What I don't like about that is that (a) the input could be
> arbitrarily long and complex to parse (not that you can't imagine
> a database populated with lots of short little XML snippets, but
> at the same time, a query could quite plausibly deal in yooge ones),
> and (b), step (1) could fail at the last byte of the input, followed
> by total reparsing as (2).
>
> I think the safer structure is clearly that of the current patch,
> modulo whether the "has a DOCTYPE" test is done by libxml itself
> (with the assumptions you don't like) or by a pre-scan.
>
> So the current structure is:
>
> restart:
>   asked for document?
>     parse as document, or fail
>   else asked for content:
>     parse as content
>     failed?
>       because DOCTYPE? restart as if document
>       else fail
>
> and a pre-scan structure could be very similar:
>
> restart:
>   asked for document?
>     parse as document, or fail
>   else asked for content:
>     pre-scan finds DOCTYPE?
>       restart as if document
>     else parse as content, or fail
>
> The pre-scan is a simple linear search and will ordinarily say yes or no
> within a couple dozen characters--you could *have* an input with 20k of
> leading whitespace and comments, but it's hardly the norm. Just trying to
> parse as 'document' first could easily parse a large fraction of the input
> before discovering it's followed by something that can't follow a document
> element.
>
> Regards,
> -Chap
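The pre-scan Chap describes could be sketched like this (a hypothetical illustration, not the patch's code: skip leading whitespace, comments, and an XML declaration or other PI, then report whether the next construct is a DOCTYPE):

```c
#include <string.h>
#include <ctype.h>

/* Return 1 if, after leading whitespace, comments, and PIs,
 * the input begins with a DOCTYPE declaration; 0 otherwise. */
static int
has_doctype(const char *s)
{
    for (;;)
    {
        while (*s && isspace((unsigned char) *s))
            s++;
        if (strncmp(s, "<!--", 4) == 0)
        {
            const char *end = strstr(s + 4, "-->");

            if (end == NULL)
                return 0;       /* unterminated comment: no DOCTYPE ahead */
            s = end + 3;
        }
        else if (strncmp(s, "<?", 2) == 0)
        {
            const char *end = strstr(s + 2, "?>");

            if (end == NULL)
                return 0;       /* unterminated PI */
            s = end + 2;
        }
        else
            break;              /* next thing is real markup (or end) */
    }
    return strncmp(s, "<!DOCTYPE", 9) == 0;
}
```

As noted above, this ordinarily answers within a couple dozen characters, rather than parsing the whole input as 'document' only to fail at the end.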
Re: Fix XML handling with DOCTYPE
I am unable to get the latest patches I found [1] to apply cleanly to current branches. It's possible I missed the latest patches, so if I'm using the wrong ones please let me know. I tried against master, 11.2 stable, and the 11.2 tag with similar results. It's quite possible it's user error on my end; I am new to this process but didn't have issues with the previous patches when I tested those a couple weeks ago.

[1] https://www.postgresql.org/message-id/5c8ecaa4.3090...@anastigmatix.net

Ryan Lambert

On Sat, Mar 23, 2019 at 5:07 PM Chapman Flack wrote:
> On 03/23/19 18:22, Tom Lane wrote:
> >> Out of curiosity, what further processing would you expect libxml to do?
> >
> > Hm, I'd have thought it'd try to parse the arguments to some extent,
> > but maybe not. Does everybody reimplement attribute parsing for
> > themselves when using PIs?
>
> Yeah, the content of a PI (whatever's after the target name) is left
> all to be defined by whatever XML-using application might care about
> that PI.
>
> It could have an attribute=value syntax inspired by XML elements, or
> some other form entirely, but there'd just better not be any ?> in it.
>
> Regards,
> -Chap
Re: Fix XML handling with DOCTYPE
Perfect, thank you! I do remember seeing that message now, but hadn't understood what it really meant. I will test later today.

Thanks,
Ryan

On Sun, Mar 24, 2019 at 9:19 PM Tom Lane wrote:
> Chapman Flack writes:
> > On 03/24/19 21:04, Ryan Lambert wrote:
> >> I am unable to get latest patches I found [1] to apply cleanly to current
> >> branches. It's possible I missed the latest patches so if I'm using the
> >> wrong ones please let me know. I tried against master, 11.2 stable and the
> >> 11.2 tag with similar results.
>
> > Tom pushed the content-with-DOCTYPE patch, it's now included in master,
> > REL_11_STABLE, REL_10_STABLE, REL9_6_STABLE, REL9_5_STABLE, and
> > REL9_4_STABLE.
>
> Right. If you want to test (and please do!) you could grab the relevant
> branch tip from our git repo, or download a nightly snapshot tarball from
>
> https://www.postgresql.org/ftp/snapshot/
>
> regards, tom lane
Re: Fix XML handling with DOCTYPE
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature:      tested, passed
Spec compliant:          not tested
Documentation:           not tested

I tested the master branch (commit 8edd0e7), REL_11_STABLE (commit 24df866), and REL9_6_STABLE (commit 5097368) and verified functionality. This patch fixes the bug I had reported [1] previously. With this in the stable branches, is it safe to assume this will be included with the next minor releases?

Thanks for your work on this!!

Ryan

[1] https://www.postgresql.org/message-id/flat/153478795159.1302.9617586466368699403%40wrigleys.postgresql.org

The new status of this patch is: Ready for Committer
Re: Fix XML handling with DOCTYPE
Ok, I'll give it a go.

> If you happened to feel moved to look over a documentation patch, that
> would be what this CF entry most needs in the waning days of the
> commitfest.

Is the xml-functions-type-docfix-4.patch [1] the one needing review? I'll test applying it and review the changes in better detail. Is there a section in the docs that shows how to verify the updated pages render properly? I would assume the pages are built when installing from source.

Ryan

[1] https://www.postgresql.org/message-id/attachment/100016/xml-functions-type-docfix-4.patch

On Mon, Mar 25, 2019 at 4:52 PM Chapman Flack wrote:
> On 03/25/19 18:03, Ryan Lambert wrote:
> > The following review has been posted through the commitfest application:
> > make installcheck-world: tested, passed
> > Implements feature: tested, passed
> > Spec compliant: not tested
> > Documentation: not tested
>
> Hi,
>
> Thanks for the review! Yes, that part of this commitfest entry has been
> committed already and will appear in the next minor releases of those
> branches.
>
> That leaves only one patch in this commitfest entry that is still in
> need of review, namely the update to the documentation.
>
> If you happened to feel moved to look over a documentation patch, that
> would be what this CF entry most needs in the waning days of the
> commitfest.
>
> There seem to be community members reluctant to review it because of not
> feeling sufficiently expert in XML to scrutinize every technical detail,
> but there are other valuable angles for documentation review. (And the
> reason there *is* a documentation patch is the plentiful room for
> improvement in the documentation that's already there, so as far as
> reviewing goes, the old yarn about the two guys, the running shoes, and
> the bear comes to mind.)
>
> I can supply pointers to specs, etc., for anyone who does see some technical
> details in the patch and has questions about them.
>
> Regards,
> -Chap
Re: Fix XML handling with DOCTYPE
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature:      tested, passed
Spec compliant:          not tested
Documentation:           tested, passed

Overall I think this patch [1] improves the docs and helps explain edge-case functionality that many of us, myself included, don't fully understand. I can't verify technical accuracy for many of the details (nuances between XPath 1.0, et al.), but overall my experience with the XML functionality lines up with what has been documented here. Adding the clear declaration of "XPath 1.0" instead of the generic "XPath" makes the functional differences clear and helps frame them for new users.

I have two recommendations for features.sgml. You state:

> relies on the libxml library

Should this be clarified as the libxml2 library? That's what I installed to build Postgres from source (Ubuntu 16/18). If it is the libxml library and the "2" is irrelevant, or it works with either, it might be nice to have a clarifying note to indicate that.

There are a few places where the parentheses around a block of text seem unnecessary. I don't think parens serve a purpose when a full sentence is contained within.

> (The libxml library does seem to always return nodesets to PostgreSQL with
> their members in the same relative order they had in the input document; it
> does not commit to this behavior, and an XPath 1.0 expression cannot control
> it.)

It seems you are standardizing from "node set" to "nodeset"; is that the preferred nomenclature?

Hopefully this helps!

Thanks,
Ryan Lambert

[1] https://www.postgresql.org/message-id/attachment/100016/xml-functions-type-docfix-4.patch

The new status of this patch is: Waiting on Author
Re: Fix XML handling with DOCTYPE
Thanks for putting up with a new reviewer!

> --with-libxml is the PostgreSQL configure option to make it use libxml2.
>
> The very web page http://xmlsoft.org/index.html says "The XML C parser
> and toolkit of Gnome: libxml" and is all about libxml2.
>
> So I think I was unsure what convention to follow, and threw up my hands
> and went with libxml. I could just as easily throw them up and go with
> libxml2. Which do you think would be better?

I think leaving it as libxml makes sense with all that. Good point that --with-libxml is used to build, so I think staying with that works and is consistent. I agree that having this point included does clarify the how and why of the limitations of this implementation.

I also over-parenthesize, so I'm used to looking for that in my own writing. The full sentences were the ones that seemed excessive to me; I think the others are ok and I won't nit-pick either way on those (unless you want me to!).

> If you agree, I should go through and fix my nodesets to be node-sets.

Yes, I like node-sets better, especially knowing it conforms to the spec's language.

Thanks,
Ryan Lambert

On Tue, Mar 26, 2019 at 11:05 PM Chapman Flack wrote:
> On 03/26/19 21:39, Ryan Lambert wrote:
> > The following review has been posted through the commitfest application:
> > make installcheck-world: tested, passed
> > Implements feature: tested, passed
> > Spec compliant: not tested
> > Documentation: tested, passed
>
> Thanks for the review!
>
> > I have two recommendations for features.sgml. You state:
> >
> >> relies on the libxml library
> >
> > Should this be clarified as the libxml2 library? That's what I installed
> > to build postgres from source (Ubuntu 16/18). If it is the libxml library
> > and the "2" is irrelevant
>
> That's a good catch. I'm not actually sure whether there is any "libxml"
> library that isn't libxml2. Maybe there was once and nobody admits to
> hanging out with it. Most Google hits on "libxml" seem to be modules
> that have libxml in their names and libxml2 as their actual dependency.
>
> Perl XML:LibXML: "This module is an interface to libxml2"
> Haskell libxml: "Binding to libxml2"
> libxml-ruby: "The Libxml-Ruby project provides Ruby language bindings
> for the GNOME Libxml2 ..."
>
> --with-libxml is the PostgreSQL configure option to make it use libxml2.
>
> The very web page http://xmlsoft.org/index.html says "The XML C parser
> and toolkit of Gnome: libxml" and is all about libxml2.
>
> So I think I was unsure what convention to follow, and threw up my hands
> and went with libxml. I could just as easily throw them up and go with
> libxml2. Which do you think would be better?
>
> On 03/26/19 23:52, Tom Lane wrote:
> > Do we need to mention that at all? If you're not building from source,
> > it doesn't seem very interesting ... but maybe I'm missing some reason
> > why end users would care.
>
> The three places I've mentioned it were the ones where I thought users
> might care:
>
> - why are we stuck at XPath 1.0? It's what we get from the library we use.
>
> - in what order do we get things out from a (hmm) node-set? Per XPath 1.0,
>   it's indeterminate (it's a set!), unlike XPath 2.0/XQuery where there's
>   a sequence type and you have order control. Observable behavior from
>   libxml2 (and you could certainly want to know this) is you get things out
>   in document order, whether that's what you wanted or not, even though
>   this is undocumented, and even counter-documented[1], libxml2 behavior.
>   So it's an example of something you would fundamentally like to know,
>   where the only available answer depends precariously on the library
>   we happen to be using.
>
> - which limits in our implementation are inherent to the library, and
>   which are just current limits in our embedding of it? (Maybe this is
>   right at the border of what a user would care to know, but I know it's
>   a question that crosses my mind when I bonk into a limit I wasn't
>   expecting.)
>
> > There are a few places where the parenthesis around a block of text
> > seem unnecessary.
>
> )blush( that's a long-standing wart in my writing ... seems I often think
> in parentheses, then look and say "those aren't needed" and take them out,
> only sometimes I don't.
>
> I skimmed just now and found a few instances of parenthesized whole
> sentence: the one you quoted, and some (if argument is null, the
Re: Fix XML handling with DOCTYPE
I applied and reviewed xml-functions-type-docfix-6.patch. Looks good to me. I like the standardization (e.g. libxml2, node-set) and I didn't catch any spots that used the other versions. I agree that the is appropriate for that block. It also looks like you incorporated Alvaro's feedback about sorting, or the lack thereof. Let me know if there's anything else I can do to help get this accepted. Thanks, Ryan On Thu, Mar 28, 2019 at 5:45 PM Chapman Flack wrote: > On 03/27/19 19:27, Chapman Flack wrote: > > A column marked FOR ORDINALITY will be populated with row numbers > > matching the order in which the output rows appeared in the original > > input XML document. > > > > I've been skimming right over it all this time, and that right there is > > a glaring built-in reliance on the observable-but-disclaimed iteration > > order of a libxml2 node-set. > > So, xml-functions-type-docfix-6.patch. > > I changed that language to say "populated with row numbers, starting > with 1, in the order of nodes retrieved from the row_expression's > result node-set." > > That's not such a terrible thing to have to say; in fact, it's the > *correct* description for the standard, XQuery-based, XMLTABLE (where > the language gives you control of the result sequence's order). > > I followed that with a short note saying since XPath 1.0 doesn't > specify that order, relying on it is implementation-dependent, and > linked to the existing Appendix D discussion. > > I would have like to link directly to the , but of course > doesn't know what to call that, so I linked to the > instead. > > Regards, > -Chap >
Re: dropdb --force
Hello, This is a feature I have wanted for a long time, thank you for your work on this. The latest patch [1] applied cleanly for me. In dbcommands.c the comment references a 5 second delay, I don't see where that happens, am I missing something? I tested both the dropdb program and the in database commands. Without FORCE I get the expected error message about active connections. postgres=# DROP DATABASE testdb; ERROR: source database "testdb" is being accessed by other users DETAIL: There is 1 other session using the database. With FORCE the database drops cleanly. postgres=# DROP DATABASE testdb FORCE; DROP DATABASE The active connections get terminated as expected. Thanks, [1] https://www.postgresql.org/message-id/attachment/99536/drop-database-force-20190310_01.patch *Ryan Lambert* RustProof Labs On Sun, Mar 10, 2019 at 12:54 PM Filip Rembiałkowski < filip.rembialkow...@gmail.com> wrote: > Thank you. Updated patch attached. > > On Sat, Mar 9, 2019 at 2:53 AM Thomas Munro > wrote: > > > > On Wed, Mar 6, 2019 at 1:39 PM Filip Rembiałkowski > > wrote: > > > Here is Pavel's patch rebased to master branch, added the dropdb > > > --force option, a test case & documentation. > > > > Hello, > > > > cfbot.cputube.org says this fails on Windows, due to a missing > semicolon here: > > > > #ifdef HAVE_SETSID > > kill(-(proc->pid), SIGTERM); > > #else > > kill(proc->pid, SIGTERM) > > #endif > > > > The test case failed on Linux, I didn't check why exactly: > > > > Test Summary Report > > --- > > t/050_dropdb.pl (Wstat: 65280 Tests: 13 Failed: 2) > > Failed tests: 12-13 > > Non-zero exit status: 255 > > Parse errors: Bad plan. You planned 11 tests but ran 13. > > > > +/* Time to sleep after isuing SIGTERM to backends */ > > +#define TERMINATE_SLEEP_TIME 1 > > > > s/isuing/issuing/ > > > > But, hmm, this macro doesn't actually seem to be used in the patch. > > Wait, is that because the retry loop forgot to actually include the > > sleep? 
> > > > +/* without "force" flag raise exception immediately, or after > > 5 minutes */ > > > > Normally we call it an "error", not an "exception". > > > > -- > > Thomas Munro > > https://enterprisedb.com >
Re: Built-in connection pooler
it's more a lack of understanding the **why** regarding how it will operate.

postgres=# select * from pg_pooler_state();
 pid  | n_clients | n_ssl_clients | n_pools | n_backends | n_dedicated_backends | tx_bytes  | rx_bytes  | n_transactions
------+-----------+---------------+---------+------------+----------------------+-----------+-----------+----------------
 1682 |        75 |             0 |       1 |         10 |                    0 | 366810458 | 353181393 |        5557109
 1683 |        75 |             0 |       1 |         10 |                    0 | 368464689 | 354778709 |        5582174
(2 rows)

I am not sure how I feel about this: "Non-tainted backends are not terminated even if there are no more connected sessions."

Would it be possible (eventually) to monitor connection rates and free up non-tainted backends after a time? The way I'd like to think of that working would be: if 50% of backends are unused for more than 1 hour, release 10% of established backends. The two percentages and time frame would ideally be configurable, but set up in a way that it doesn't let go of connections too quickly, causing unnecessary expense of re-establishing those connections. My thought is: if there's one big surge of connections followed by a long period of lower connections, does it make sense to keep those extra backends established?

I'll give the documentation another pass soon. Thanks for all your work on this, I like what I'm seeing so far!

[1] https://www.postgresql.org/message-id/attachment/102732/builtin_connection_proxy-10.patch
[2] http://richyen.com/postgres/2019/06/25/pools_arent_just_for_cars.html
[3] https://www.postgresql.org/message-id/ede4470a-055b-1389-0bbd-840f0594b758%40postgrespro.ru

Thanks,
Ryan Lambert

On Fri, Jul 19, 2019 at 3:10 PM Konstantin Knizhnik <k.knizh...@postgrespro.ru> wrote:
>
> On 19.07.2019 6:36, Ryan Lambert wrote:
> Here's what I found tonight in your latest patch (9). The output from git
> apply is better, fewer whitespace errors, but still getting the following.
> Ubuntu 18.04 if that helps.
>
> git apply -p1 < builtin_connection_proxy-9.patch
> :79: tab in indent.
> Each proxy launches its own subset of backends.
> :634: indent with spaces.
> union { > :635: indent with spaces. >struct sockaddr_in inaddr; > :636: indent with spaces. >struct sockaddr addr; > :637: indent with spaces. > } a; > warning: squelched 54 whitespace errors > warning: 59 lines add whitespace errors. > > > A few more minor edits. In config.sgml: > > "If the max_sessions limit is reached new connection > are not accepted." > Should be "connections". > > "The default value is 10, so up to 10 backends will server each database," > "sever" should be "serve" and the sentence should end with a period > instead of a comma. > > > In postmaster.c: > > /* The socket number we are listening for poolled connections on */ > "poolled" --> "pooled" > > > "(errmsg("could not create listen socket for locahost")));" > > "locahost" -> "localhost". > > > " * so to support order balancing we should do dome smart work here." > > "dome" should be "some"? > > I don't see any tests covering this new functionality. It seems that this > is significant enough functionality to warrant some sort of tests, but I > don't know exactly what those would/should be. > > > Thank you once again for this fixes. > Updated patch is attached. > > Concerning testing: I do not think that connection pooler needs some kind > of special tests. > The idea of built-in connection pooler is that it should be able to handle > all requests normal postgres can do. > I have added to regression tests extra path with enabled connection > proxies. > Unfortunately, pg_regress is altering some session variables, so it > backend becomes tainted and so > pooling is not actually used (but communication through proxy is tested). > > It is also possible to run pg_bench with different number of connections > though connection pooler. > > > Thanks, > Ryan > > > pg_conn_pool_notes_3.md Description: Binary data
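The backend-release policy suggested earlier in this message ("if 50% of backends are unused for more than 1 hour, release 10% of established backends") could be sketched as a simple policy function. All names and thresholds below are hypothetical illustrations of the idea, not anything in the patch:

```python
import math

def backends_to_release(idle_seconds, idle_threshold_s=3600,
                        idle_fraction=0.5, release_fraction=0.1):
    """Return how many pooled backends to close, given each backend's
    idle time in seconds.  Policy: if at least `idle_fraction` of the
    backends have been idle longer than `idle_threshold_s`, release
    `release_fraction` of the total pool (rounded down, at least 1)."""
    if not idle_seconds:
        return 0
    idle = sum(1 for t in idle_seconds if t > idle_threshold_s)
    if idle / len(idle_seconds) >= idle_fraction:
        return max(1, math.floor(len(idle_seconds) * release_fraction))
    return 0

# 6 of 10 backends idle over an hour -> 60% >= 50%, so release 10% of 10 = 1.
print(backends_to_release([4000] * 6 + [10] * 4))  # 1
```

The slow-release shape (small `release_fraction`, long `idle_threshold_s`) is what keeps a one-time connection surge from causing a costly teardown-and-reconnect cycle.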
Re: Built-in connection pooler
> I attached new version of the patch with fixed indentation problems and > Win32 specific fixes. Great, this latest patch applies cleanly to master. installcheck world still passes. > "connections_proxies" is used mostly to toggle connection pooling. > Using more than 1 proxy is be needed only for huge workloads (hundreds > connections). My testing showed using only one proxy performing very poorly vs not using the pooler, even at 300 connections, with -3% TPS. At lower numbers of connections it was much worse than other configurations I tried. I just shared my full pgbench results [1], the "No Pool" and "# Proxies 2" data is what I used to generate the charts I previously shared. The 1 proxy and 10 proxy data I had referred to but hadn't shared the results, sorry about that. > And "session_pool_size" is core parameter which determine efficiency of > pooling. > The main trouble with it now, is that it is per database/user > combination. Each such combination will have its own connection pool. > Choosing optimal value of pooler backends is non-trivial task. It > certainly depends on number of available CPU cores. > But if backends and mostly disk-bounded, then optimal number of pooler > worker can be large than number of cores. I will do more testing around this variable next. It seems that increasing session_pool_size for connection_proxies = 1 might help and leaving it at its default was my problem. > PgPRO EE version of connection pooler has "idle_pool_worker_timeout" > parameter which allows to terminate idle workers. +1 > It is possible to implement it also for vanilla version of pooler. But > primary intention of this patch was to minimize changes in Postgres core Understood. I attached a patch to apply after your latest patch [2] with my suggested changes to the docs. I tried to make things read smoother without altering your meaning. 
I don't think the connection pooler chapter fits in The SQL Language section, it seems more like Server Admin functionality so I moved it to follow the chapter on HA, load balancing and replication. That made more sense to me looking at the overall ToC of the docs. Thanks, [1] https://docs.google.com/spreadsheets/d/11XFoR26eiPQETUIlLGY5idG3fzJKEhuAjuKp6RVECOU [2] https://www.postgresql.org/message-id/attachment/102848/builtin_connection_proxy-11.patch *Ryan* > builtin_connection_proxy-docs-1.patch Description: Binary data
Re: FETCH FIRST clause PERCENT option
Surafel,

> The patch did not do it automatically. It's the query writer's obligation to
> do that currently

Ok. Your latest patch [1] passes make installcheck-world; I didn't test the actual functionality this round.

> plan = (Plan *) make_limit(plan,
>       subparse->limitOffset,
> -     subparse->limitCount);
> +     subparse->limitCount,
> +     subparse->limitOption);

I assume the limit percentage number goes into subparse->limitCount? If so, I don't see that documented. Or does this truly only store the count? The remainder of the code seems to make sense.

I attached a patch with a few minor changes in the comments. I need to go back to my notes and see if I covered all the testing I had thought of, I should get to that later this week.

[1] https://www.postgresql.org/message-id/attachment/103028/percent-incremental-v6.patch

*Ryan Lambert*

> percent-incremental-v6-comment-cleanup.patch Description: Binary data
Re: dropdb --force
I set the status to Waiting on Author since Tom's concerns [1] have not been addressed. [1] https://www.postgresql.org/message-id/15707.1564024305%40sss.pgh.pa.us Thanks, Ryan
Re: Built-in connection pooler
it starts having issues. While I don't have expectations of this working great (or even decent) on a tiny server, I don't expect it to crash in a case where the standard connections work. Also, the logs and the function both show that the total backends is less than the total available and the two don't seem to agree on the details. I think it would help to have details about the pg_pooler_state function added to the docs, maybe in this section [2]? I'll take some time later this week to examine pg_pooler_state further on a more appropriately sized server. Thanks, [1] https://www.postgresql.org/message-id/attachment/103046/builtin_connection_proxy-16.patch [2] https://www.postgresql.org/docs/current/functions-info.html Ryan Lambert >
Re: Built-in connection pooler
> First of all default value of this parameter is 1000, not 10. Oops, my bad! Sorry about that, I'm not sure how I got that in my head last night but I see how that would make it act strange now. I'll adjust my notes before re-testing. :) Thanks, *Ryan Lambert* On Wed, Aug 7, 2019 at 4:57 AM Konstantin Knizhnik < k.knizh...@postgrespro.ru> wrote: > Hi Ryan, > > On 07.08.2019 6:18, Ryan Lambert wrote: > > Hi Konstantin, > > > > I did some testing with the latest patch [1] on a small local VM with > > 1 CPU and 2GB RAM with the intention of exploring pg_pooler_state(). > > > > Configuration: > > > > idle_pool_worker_timeout = 0 (default) > > connection_proxies = 2 > > max_sessions = 10 (default) > > max_connections = 1000 > > > > Initialized pgbench w/ scale 10 for the small server. > > > > Running pgbench w/out connection pooler with 300 connections: > > > > pgbench -p 5432 -c 300 -j 1 -T 60 -P 15 -S bench_test > > starting vacuum...end. > > progress: 15.0 s, 1343.3 tps, lat 123.097 ms stddev 380.780 > > progress: 30.0 s, 1086.7 tps, lat 155.586 ms stddev 376.963 > > progress: 45.1 s, 1103.8 tps, lat 156.644 ms stddev 347.058 > > progress: 60.6 s, 652.6 tps, lat 271.060 ms stddev 575.295 > > transaction type: > > scaling factor: 10 > > query mode: simple > > number of clients: 300 > > number of threads: 1 > > duration: 60 s > > number of transactions actually processed: 63387 > > latency average = 171.079 ms > > latency stddev = 439.735 ms > > tps = 1000.918781 (including connections establishing) > > tps = 1000.993926 (excluding connections establishing) > > > > > > It crashes when I attempt to run with the connection pooler, 300 > > connections: > > > > pgbench -p 6543 -c 300 -j 1 -T 60 -P 15 -S bench_test > > starting vacuum...end. > > connection to database "bench_test" failed: > > server closed the connection unexpectedly > >This probably means the server terminated abnormally > >before or while processing the request. 
> > > > In the logs I get: > > > > WARNING: PROXY: Failed to add new client - too much sessions: 18 > > clients, 1 backends. Try to increase 'max_sessions' configuration > > parameter. > > > > The logs report 1 backend even though max_sessions is the default of > > 10. Why is there only 1 backend reported? Is that error message > > getting the right value? > > > > Minor grammar fix, the logs on this warning should say "too many > > sessions" instead of "too much sessions." > > > > Reducing pgbench to only 30 connections keeps it from completely > > crashing but it still does not run successfully. > > > > pgbench -p 6543 -c 30 -j 1 -T 60 -P 15 -S bench_test > > starting vacuum...end. > > client 9 aborted in command 1 (SQL) of script 0; perhaps the backend > > died while processing > > client 11 aborted in command 1 (SQL) of script 0; perhaps the backend > > died while processing > > client 13 aborted in command 1 (SQL) of script 0; perhaps the backend > > died while processing > > ... > > ... > > progress: 15.0 s, 5734.5 tps, lat 1.191 ms stddev 10.041 > > progress: 30.0 s, 7789.6 tps, lat 0.830 ms stddev 6.251 > > progress: 45.0 s, 8211.3 tps, lat 0.810 ms stddev 5.970 > > progress: 60.0 s, 8466.5 tps, lat 0.789 ms stddev 6.151 > > transaction type: > > scaling factor: 10 > > query mode: simple > > number of clients: 30 > > number of threads: 1 > > duration: 60 s > > number of transactions actually processed: 453042 > > latency average = 0.884 ms > > latency stddev = 7.182 ms > > tps = 7549.373416 (including connections establishing) > > tps = 7549.402629 (excluding connections establishing) > > Run was aborted; the above results are incomplete. > > > > Logs for that run show (truncated): > > > > > > 2019-08-07 00:19:37.707 UTC [22152] WARNING: PROXY: Failed to add new > > client - too much sessions: 18 clients, 1 backends. Try to increase > > 'max_sessions' configuration parameter. 
> > 2019-08-07 00:31:10.467 UTC [22151] WARNING: PROXY: Failed to add new > > client - too much sessions: 15 clients, 4 backends. Try to increase > > 'max_sessions' configuration parameter. > > 2019-08-07 00:31:10.468 UTC [22152] WARNING: PROXY: Failed to add new > > client - too much sessions: 15 clients, 4 backends. Try to increase > > 'max_sessions' configuration parameter.
Re: Built-in connection pooler
> I attached to this mail patch which is fixing both problems: correctly
> reports error to the client and calculates number of idle clients).

Yes, this works much better with max_sessions=1000. Now it's handling the 300 connections on the small server. n_idle_clients now looks accurate with the rest of the stats here.

postgres=# SELECT n_clients, n_backends, n_idle_backends, n_idle_clients FROM pg_pooler_state();
 n_clients | n_backends | n_idle_backends | n_idle_clients
-----------+------------+-----------------+----------------
       150 |         10 |               9 |            149
       150 |         10 |               6 |            146
(2 rows)

Ryan Lambert
Re: FETCH FIRST clause PERCENT option
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

The latest patch [1] and the cleanup patch [2] apply cleanly to the latest master (5f110933). I reviewed the conversation and don't see any outstanding questions or concerns. Updating status to ready for committer.

[1] https://www.postgresql.org/message-id/attachment/103028/percent-incremental-v6.patch
[2] https://www.postgresql.org/message-id/attachment/103157/percent-incremental-v6-comment-cleanup.patch

Ryan Lambert

The new status of this patch is: Ready for Committer
Re: FETCH FIRST clause PERCENT option
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

Hi everyone,

The v8 patch [1] applies cleanly and passes tests. I reviewed the conversation to date and from what I can see, all identified bugs have been fixed and concerns either fixed or addressed. Marking as ready for committer.

[1] https://www.postgresql.org/message-id/attachment/103486/percent-incremental-v8.patch

Ryan Lambert

The new status of this patch is: Ready for Committer
Re: A rather hackish POC for alternative implementation of WITH TIES
On Wed, Jan 22, 2020 at 3:06 PM Alvaro Herrera wrote: > My own inclination is that Andrew's implementation, being more general > in nature, would be the better one to have in the codebase; but we don't > have a complete patch yet. Can we reach some compromise such as if > Andrew's patch is not completed then we push Surafel's? +1 On Wed, Jan 22, 2020 at 4:35 PM Andrew Gierth wrote: > I was largely holding off on doing further work hoping for some > discussion of which way we should go. If you think my approach is worth > pursuing (I haven't seriously tested the performance, but I'd expect it > to be slower than Surafel's - the price you pay for flexibility) then I > can look at it further, but figuring out the planner stuff will take > some time. Flexibility with more generalized code is good, though if performance is significantly slower I would be concerned. I quickly reviewed the patch but haven't tested it yet. Is it realistic to add PERCENT into this patch or would that be a future enhancement? Thanks, *Ryan Lambert*
Re: pg_dump new feature: exporting functions only. Bad or good idea ?
> On Sat, Jul 10, 2021 at 5:06 AM Tomas Vondra < tomas.von...@enterprisedb.com> wrote: > On 7/10/21 1:43 AM, Tom Lane wrote: >> Tomas Vondra writes: >>> The main question I have is whether this should include procedures. >> >> I feel a bit uncomfortable about sticking this sort of limited-purpose >> selectivity mechanism into pg_dump. I'd rather see a general filter >> method that can select object(s) of any type. Pavel was doing some >> work towards that awhile ago, though I think he got frustrated about >> the lack of consensus on details. Which is a problem, but I don't >> think the solution is to accrue more and more independently-designed- >> and-implemented features that each solve some subset of the big problem. >> > > I'm not against introducing such general filter mechanism, but why > should it block this patch? I'd understand it the patch was adding a lot > of code, but that's not the case - it's tiny. And we already have > multiple filter options (to pick tables, schemas, extensions, ...). > And if there's no consensus on details of Pavel's patch after multiple > commitfests, how likely is it it'll start moving forward? It seems to me that pg_dump already has plenty of limited-purpose options for selectivity, adding this patch seems to fit in with the norm. I'm in favor of this patch because it works in the same way the community is already used to and meets the need. The other patch referenced upstream appears to be intended to solve a specific problem encountered with very long commands. It is also introducing a new way of working with pg_dump via a config file, and honestly I've never wanted that type of feature. Not saying that wouldn't be useful, but that has not been a pain point for me and seems overly complex. For the use cases I imagine this patch will help with, being required to go through a config file seems excessive vs pg_dump --functions-only. 
> On Fri, Jul 9, 2021 at 4:43 PM Tomas Vondra wrote: > > The main question I have is whether this should include procedures. I'd > probably argue procedures should be considered different from functions > (i.e. requiring a separate --procedures-only option), because it pretty > much is meant to be a separate object type. We don't allow calling DROP > FUNCTION on a procedure, etc. It'd be silly to introduce an unnecessary > ambiguity in pg_dump and have to deal with it sometime later. +1 *Ryan Lambert* RustProof Labs
Re: Installation instructions update (pg_ctl)
I used the updated instructions from pg_ctl.diff to install from source. They worked well for me; the new version is more consistent.
Re: FETCH FIRST clause PERCENT option
The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

The basic functionality works as I expect. In the following example I would have guessed it would return 4 rows instead of 5. I don't mind that it uses ceil here, but think that deserves a mention in the documentation.

CREATE TABLE r100 (id INT);
INSERT INTO r100 SELECT generate_series(1, 100);
SELECT * FROM r100 FETCH FIRST 4.01 PERCENT ROWS ONLY;
 id
----
  1
  2
  3
  4
  5
(5 rows)

There's a missing space between the period and the following sentence in src\backend\executor\nodeLimit.c:

"previous time we got a different result.In PERCENTAGE option there are"

There's a missing space, and the beginning "w" should be capitalized, in doc\src\sgml\ref\select.sgml:

with PERCENT count specifies the maximum number of rows to return in percentage.ROW

Another missing space after the period:

previous time we got a different result.In PERCENTAGE option there are"

Ryan Lambert

The new status of this patch is: Waiting on Author
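The rounding observed above (4.01 percent of 100 rows yielding 5 rows) matches taking the ceiling of the computed row count. A quick Python check of that arithmetic — an illustration of the observed behavior, not the patch's actual C code:

```python
import math

def percent_limit(total_rows, pct):
    # 4.01% of 100 rows -> ceil(4.01) -> 5, matching the example above
    return math.ceil(total_rows * pct / 100.0)

print(percent_limit(100, 4.01))  # 5
print(percent_limit(100, 4.0))   # 4
```

With floor instead of ceil the example would return 4 rows, which is the behavior the review says it expected — hence the request that the ceil choice be documented.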
Re: FETCH FIRST clause PERCENT option
0 loops=1)
 Planning Time: 0.132 ms
 Execution Time: 7214.302 ms
(5 rows)

Time: 7215.278 ms (00:07.215)

Cranking the percentage up to 95% took 59-75 seconds.

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
    SELECT id, v1, v2 FROM r10mwide
    FETCH FIRST 95 PERCENT ROWS ONLY
)
SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t;

QUERY PLAN
-
 Aggregate  (cost=651432.48..651432.49 rows=1 width=24) (actual time=58981.043..58981.044 rows=1 loops=1)
   ->  Limit  (cost=230715.67..461431.34 rows=9500057 width=20) (actual time=0.017..55799.389 rows=950 loops=1)
         ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=1060 width=20) (actual time=0.014..3847.146 rows=1000 loops=1)
 Planning Time: 0.117 ms
 Execution Time: 59079.680 ms
(5 rows)

Even taking it way down to .001% (100 rows!) didn't run fast by any measure, more than 6 seconds.

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
    SELECT id, v1, v2 FROM r10mwide
    FETCH FIRST .001 PERCENT ROWS ONLY
)
SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t;

QUERY PLAN
-
 Aggregate  (cost=6.86..6.87 rows=1 width=24) (actual time=6414.406..6414.406 rows=1 loops=1)
   ->  Limit  (cost=2.43..4.86 rows=100 width=20) (actual time=0.021..6413.981 rows=100 loops=1)
         ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=1060 width=20) (actual time=0.017..3086.504 rows=1000 loops=1)
 Planning Time: 0.168 ms
 Execution Time: 6495.686 ms

[1] https://www.postgresql.org/message-id/attachment/102333/percent-incremental-v5.patch

*Ryan Lambert*
RustProof Labs

On Mon, Jul 8, 2019 at 1:09 AM Surafel Temesgen wrote:
> Hi Thomas,
> Thank you for informing me
>
> Hi Surafel,
>> There's a call to adjust_limit_rows_costs() hiding under
>> contrib/postgres_fdw, so this fails check-world.
>
> Fixed. I also include the review given by Ryan in attached patch
>
> regards
> Surafel
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Hey everyone, Here is my input regarding nonces and randomness. > As I understand it, the NIST recommendation is a 96-bit *random* nonce, I could not find that exact requirement in the NIST documents, though given the volume of that library it would be possible to miss. The recommendation I repeatedly saw for the nonce was unique. There is also an important distinction that the nonce is not the Initialization Vector (IV), it can be used as part of the IV, more on that later. The most succinct definition for nonce I found was in SP-800-38A [1] page 4: "A value that is only used once." SP-800-90A [2] (page 6) expands on the definition: "A time-varying value that has at most a negligible chance of repeating, e.g., a random value that is generated anew for each use, a timestamp, a sequence number, or some combination of these." The second definition references randomness but does not require it. [1] (pg 19) reinforces the importance of uniqueness: "A procedure should be established to ensure the uniqueness of the message nonces" > That's certainly interesting, but such a brute-force with every key > would allow it, where, if you use a random nonce, then such an attack > would have to start working only after having access to the data, and > not be something that could be pre-computed > to talk about IV's not being secret An unpredictable IV can be generated using a non-random nonce including a counter, per [1] (pg 20): "The first method is to apply the forward cipher function, under the same key that is used for the encryption of the plaintext, to a nonce. The nonce must be a data block that is unique to each execution of the encryption operation. For example, the nonce may be a counter, as described in Appendix B, or a message number. The second method is to generate a random data block using a FIPS approved random number generator." 
A unique nonce gets passed through the cipher with the key; the uniqueness of the nonce is the strength of this method, and the key + cipher handle the randomness for the IV. The second method listed above does require a random number generator, and if chosen those must conform to [2].

> I'm not a fan of the idea of using something which is predictable as a
> nonce. Using the username as the salt for our md5 password mechanism
> was, all around, a bad idea. This seems like it's repeating that
> mistake.

Yeah, that MD5 stuff wasn't the greatest. With MD5 and the username as a salt, the salt is known and you only need to work out the password. In reality, you only need to find a collision with that password; the high collision rate with MD5 (2^64) [3] made things really bad. Collisions are not a significant problem today with AES, to the best of my knowledge. Further, knowing the nonce gets you nowhere; it isn't the salt until it is run through the forward cipher with the encryption key. Even with the nonce, the attacker doesn't know the salt unless they steal the key, and that's a bigger problem.

The strictest definition of nonce I found was in [2] (pg 19), defining nonces to use in the process of random generation:

"The nonce shall be either: a. A value with at least (security_strength/2) bits of entropy, or b. A value that is expected to repeat no more often than a (security_strength/2)-bit random string would be expected to repeat."

Even there it is randomness (a) or uniqueness (b).

[1] https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38a.pdf
[2] https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-90Ar1.pdf
[3] https://stackoverflow.com/questions/8852668/what-is-the-clash-rate-for-md5

Thanks,
Ryan Lambert
RustProof Labs
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Hi Thomas, > CBC mode does require > random nonces, other modes may be fine with even sequences as long as > the values are not reused. I disagree that CBC mode requires random nonces, at least based on what NIST has published. They only require that the IV (not the nonce) must be unpredictable per [1]: " For the CBC and CFB modes, the IVs must be unpredictable." The unpredictable IV can be generated from a non-random nonce including a counter: "There are two recommended methods for generating unpredictable IVs. The first method is to apply the forward cipher function, under the same key that is used for the encryption of the plaintext, to a nonce. The nonce must be a data block that is unique to each execution of the encryption operation. For example, the nonce may be a counter, as described in Appendix B, or a message number." [1] https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38a.pdf Thanks, Ryan Lambert >
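The construction quoted from SP 800-38A — apply the forward cipher function, under the data key, to a unique counter nonce to obtain an unpredictable IV — can be sketched as follows. Since Python's standard library has no block cipher, HMAC-SHA256 truncated to one block stands in for the AES forward cipher function here; this is a shape-of-the-idea sketch, not a compliant implementation:

```python
import hashlib
import hmac

BLOCK_SIZE = 16  # AES block size in bytes

def iv_from_counter_nonce(key: bytes, counter: int) -> bytes:
    """Derive an IV from a never-repeating counter nonce by running the
    nonce through a keyed pseudorandom function.  The NIST text uses the
    cipher's forward function under the encryption key; HMAC-SHA256 is a
    stand-in for it here.  The nonce is predictable, but the resulting
    IV is unpredictable to anyone without the key."""
    nonce = counter.to_bytes(BLOCK_SIZE, "big")  # unique per encryption
    return hmac.new(key, nonce, hashlib.sha256).digest()[:BLOCK_SIZE]

key = b"\x01" * 32
iv1 = iv_from_counter_nonce(key, 1)
iv2 = iv_from_counter_nonce(key, 2)
```

This is the crux of the thread's argument: CBC needs an *unpredictable IV*, not a *random nonce* — a plain counter suffices as the nonce provided it never repeats under a given key.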
Re: FETCH FIRST clause PERCENT option
Surafel,

> The cost of FETCH FIRST N PERCENT execution in current implementation is
> the cost of pulling the full table plus the cost of storing and fetching
> the tuple from tuplestore so it can not perform better than pulling the
> full table in any case. This is because we can't determine the number of
> rows to return without executing the plan until the end. We can find the
> estimation of rows that will be returned in planner estimation but that
> is not exact.

Ok, I can live with that for the normal use cases. This example from the end of my previous message using 95% still seems like a problem; I don't like syntax that unexpectedly kills performance like this one. If this can't be improved in the initial release of the feature I'd suggest we at least make a strong disclaimer in the docs, along the lines of:

"It is possible for FETCH FIRST N PERCENT to create poorly performing query plans when the N supplied exceeds 50 percent. In these cases query execution can take an order of magnitude longer to execute than simply returning the full table. If performance is critical, using an explicit row count for limiting is recommended."

I'm not certain that 50 percent is the true threshold of where things start to fall apart, I just used that as a likely guess for now. I can do some more testing this week to identify where things start falling apart performance wise.

Thanks,

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
    SELECT id, v1, v2
        FROM r10mwide
        FETCH FIRST 95 PERCENT ROWS ONLY
) SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t;

                               QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=651432.48..651432.49 rows=1 width=24) (actual time=58981.043..58981.044 rows=1 loops=1)
   ->  Limit  (cost=230715.67..461431.34 rows=9500057 width=20) (actual time=0.017..55799.389 rows=950 loops=1)
         ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=1060 width=20) (actual time=0.014..3847.146 rows=1000 loops=1)
 Planning Time: 0.117 ms
 Execution Time: 59079.680 ms
(5 rows)

Ryan Lambert
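To make Surafel's point concrete, here is a rough model (not the actual executor code) of why the cost can never drop below a full scan plus a full spool:

```python
def fetch_first_percent(rows, percent):
    # The row count behind "N PERCENT" is unknown until the underlying
    # plan has run to completion, so every row is spooled into a
    # tuplestore-like buffer before any row can be returned --
    # even when only a small percentage is requested.
    tuplestore = list(rows)                     # full scan, full spool
    n = int(len(tuplestore) * percent / 100)
    return tuplestore[:n]
```

Asking for 1% or 95% pays the same scan-and-spool cost up front; only the final slice differs.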
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
> What I think Tomas is getting at here is that we don't write a page only
> once.
>
> A nonce of tableoid+pagenum will only be unique the first time we write
> out that page. Seems unlikely that we're only going to be writing these
> pages once though - what we need is a nonce that's unique for *every
> write* of the 8k page, isn't it? As every write of the page is going to
> be encrypting something new.
>
> With sufficient randomness, we can at least be more likely to have a
> unique nonce for each 8K write. Including the LSN seems like it'd be a
> possible alternative.

Agreed. I know little of the inner details about the LSN, but what I read in [1] sounds encouraging in addition to tableoid + pagenum.

[1] https://www.postgresql.org/docs/current/datatype-pg-lsn.html

Ryan Lambert
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
If a random number were generated instead, its result would need to be stored somewhere too, correct?

> That might also allow things like backup software to work
> on these encrypted data files for page-level backups without needing
> access to the key and that'd be pretty neat.

+1

Ryan
Re: FETCH FIRST clause PERCENT option
I did some more testing. I initialized a database with 1 million rows with indexes and joins to test against and ran pgbench with a few different settings for % to return. I started with a base query not utilizing the new functionality. The queries used are similar to my prior examples, code at [1].

createdb bench_test
psql -d bench_test -f init/reporting.sql -v scale=10

The following provided 3.21 TPS and an average latency of 623 ms. The "per_change_" columns in the table below use those values.

pgbench -c 2 -j 2 -T 600 -P 60 -s 10 \
    -f tests/reporting1.sql \
    bench_test

The remainder of the tests use the following, only adjusting the fetch_percent value:

pgbench -c 2 -j 2 -T 600 -P 60 -s 10 \
    --define=fetch_percent=1 \
    -f tests/reporting_fetch_percent.sql \
    bench_test

Returning 1% it runs well. By 10% the TPS drops by 30% while the average latency increases by 43%. When returning 95% of the table, latency has increased by 548%.

 fetch_percent | tps  | latency_avg_ms | per_change_tps | per_change_latency
---------------+------+----------------+----------------+--------------------
             1 | 3.37 |            593 |           0.05 |              -0.05
             5 | 2.85 |            700 |          -0.11 |               0.12
            10 | 2.24 |            891 |          -0.30 |               0.43
            25 | 1.40 |           1423 |          -0.56 |               1.28
            45 | 0.93 |           2147 |          -0.71 |               2.45
            95 | 0.49 |           4035 |          -0.85 |               5.48

I manually tested the inner select queries without the outer aggregation, thinking it might be a different story with a simple select and no CTE. Unfortunately it showed the same overall characteristics: 1% returns in about 550 ms, 45% took 1950 ms, and 95% took 4050 ms.

[1] https://github.com/rustprooflabs/pgbench-tests

Ryan
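For reference, the per_change_* columns above are just fractional changes relative to the baseline run, rounded to two places:

```python
BASE_TPS = 3.21         # baseline run without FETCH FIRST ... PERCENT
BASE_LATENCY_MS = 623

def per_change(measured, baseline):
    # Fractional change versus the baseline, as reported in the
    # per_change_tps and per_change_latency columns of the table above.
    return round((measured - baseline) / baseline, 2)
```

For example, the 10% row: per_change(2.24, BASE_TPS) gives -0.30 and per_change(891, BASE_LATENCY_MS) gives 0.43, matching the table.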
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
> what is it that gets stored in the page for
> decryption use, the nonce or the IV derived from it?

I believe storing the IV is preferable and still secure per [1]: "The IV need not be secret."

Beyond needing the database oid, if every decrypt function has to regenerate the IV from the nonce, that will affect performance. I don't know how expensive the forward hash is but it won't be free.

[1] https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38a.pdf

Ryan Lambert
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
> I didn't either, except it was referenced above as "forward hash". I
> don't know why that was suggested, which is why I listed it as an
> option/suggestion.

My bad, sorry for the confusion! I meant to say "cipher" not "hash". I was (trying to) refer to the method of generating unpredictable IVs from nonces using the forward cipher function and the encryption key. Too many closely related words with very specific meanings.

Ryan
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
> As I understand, the reason we
> want to avoid using the same IV for too many pages is to dodge a
> cryptanalysis attack, which requires a large amount of data encrypted
> with the same key/IV in order to be effective. But if we have two
> copies of the same page encrypted with the same key/IV, yes it's twice
> as much data as just one copy of the page with that key/IV, but it still
> seems like a sufficiently low amount of data that cryptanalysis is
> unfeasible. Right? I mean, webservers send hundreds of kilobytes
> encrypted with the same key; they avoid sending megabytes of it with the
> same key/IV, but getting too worked up about 16 kB when we think 8 kB is
> fine seems over the top.
>
> So I guess the question is how much data is considered sufficient for a
> successful, practical cryptanalysis attack?

Yes, a cryptanalysis attack could hypothetically derive critical info about the key from two encrypted blocks with the same IV. A major (very important) difference with web servers is that they use asymmetric encryption with the client to negotiate and share a secure symmetric encryption key for that session. The vast majority of the encryption work is done with short-lived symmetric keys, not the TLS keys we all think of (because that's what we configure). Many DB encryption keys (symmetric) will live for a number of years, so the attack vectors and timelines are far different. By contrast, the longest CA TLS keys through paid vendors are typically 2 years, most are 1, and LetsEncrypt certs only live 3 months.

Are there any metrics on how long a page can live without being modified in a way that triggers it to re-encrypt with a new IV? Is it possible that a single page could live essentially forever without being modified? If it's the latter, then I would err on the side of paranoia due to the expected lifecycle of keys. If it's the former, it probably merits further discussion on the paranoia requirements.
Another consideration: if someone can get this data and there are a LOT of pages sharing IVs, the exposure increases significantly, probably not linearly. Another (probably bad) idea: if there were a REENCRYPT DATABASE command, the hyper-paranoid could force a full rewrite as often as they want. Large databases seem to be the ones most likely to have long-living pages, and the least likely to want to wait to re-encrypt the whole thing.

Ryan Lambert
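To make the key/IV reuse concern concrete, here is a stdlib-only sketch of the worst case, which applies to stream/counter-style modes (CBC leaks less dramatically: only the length of the matching plaintext prefix). The hash-based keystream is a stand-in for a real cipher, used only to show the failure mode:

```python
import hashlib

def keystream(key: bytes, iv: bytes, n: int) -> bytes:
    # Toy keystream built from SHA-256 in counter mode -- a stand-in
    # for a real cipher keystream, for illustration only.
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + iv + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"k" * 16
iv = b"same-iv-reused!!"           # the reuse we want to avoid
p1 = b"page version one"
p2 = b"page version two"
c1 = xor(p1, keystream(key, iv, len(p1)))
c2 = xor(p2, keystream(key, iv, len(p2)))

# With the key/IV pair reused, no key is needed to learn p1 XOR p2:
leak = xor(c1, c2)
```

The XOR of the two ciphertexts equals the XOR of the two plaintexts, so an attacker holding both page versions learns exactly where and how they differ without ever touching the key.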
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
> >> Uh, what if a transaction modifies page 0 and page 1 of the same table
> >> --- don't those pages have the same LSN.
> >
> > No, because WAL being a physical change log, each page gets its own
> > WAL record with its own LSN.
>
> What if you have wal_log_hints=off? AFAIK that won't change the page LSN.

> Alvaro suggested elsewhere that we require checksums for these, which
> would also force wal_log_hints to be on, and therefore the LSN would
> change.

Yes, it sounds like the agreement was that the LSN is unique when wal_log_hints is on. I don't know enough about the internals to know if pg_class.oid is also needed or not.

Ryan

On Wed, Jul 10, 2019 at 6:07 PM Bruce Momjian wrote:

> On Thu, Jul 11, 2019 at 12:18:47AM +0200, Tomas Vondra wrote:
> > On Wed, Jul 10, 2019 at 06:04:30PM -0400, Stephen Frost wrote:
> > > Greetings,
> > >
> > > * Tomas Vondra (tomas.von...@2ndquadrant.com) wrote:
> > > > On Wed, Jul 10, 2019 at 04:11:21PM -0400, Alvaro Herrera wrote:
> > > > > On 2019-Jul-10, Bruce Momjian wrote:
> > > > >
> > > > > > Uh, what if a transaction modifies page 0 and page 1 of the same table
> > > > > > --- don't those pages have the same LSN.
> > > > >
> > > > > No, because WAL being a physical change log, each page gets its own
> > > > > WAL record with its own LSN.
> > > >
> > > > What if you have wal_log_hints=off? AFAIK that won't change the page LSN.
> > >
> > > Alvaro suggested elsewhere that we require checksums for these, which
> > > would also force wal_log_hints to be on, and therefore the LSN would
> > > change.
> >
> > Oh, I see - yes, that would solve the hint bits issue. Not sure we want
> > to combine the features like this, though, as it increases the costs of
> > TDE. But maybe it's the best solution.
>
> Uh, why can't we just force log_hint_bits for encrypted tables? Why
> would we need to use checksums as well?
>
> Why is page-number not needed in the nonce? Because it is duplicative
> of the LSN? Can we use just LSN?
> Do we need pg_class.oid too?
>
> --
> Bruce Momjian    http://momjian.us
> EnterpriseDB     http://enterprisedb.com
>
> + As you are, so once was I. As I am, so you will be. +
> + Ancient Roman grave inscription +
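To picture the candidate nonce ingredients being debated (pg_class oid, page number, LSN), a hypothetical packing into a fixed-width nonce. The field widths and layout here are my assumption for illustration, not anything agreed in the thread:

```python
import struct

def page_nonce(reloid: int, pagenum: int, lsn: int) -> bytes:
    # Hypothetical 16-byte nonce: 4-byte pg_class oid + 4-byte page
    # number + 8-byte LSN. Which of these fields are actually required
    # is exactly what the thread is trying to settle.
    return struct.pack(">IIQ", reloid, pagenum, lsn)
```

With the LSN included, a rewrite of the same page after a WAL-logged change bumps the LSN and therefore yields a fresh nonce; with wal_log_hints=off, hint-bit-only writes would reuse it.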
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
>> I vote for AES 256 rather than 128.
>
> Why? This page seems to think 128 is sufficient:
>
> https://crypto.stackexchange.com/questions/20/what-are-the-practical-differences-between-256-bit-192-bit-and-128-bit-aes-enc
>
> For practical purposes, 128-bit keys are sufficient to ensure security.
> The larger key sizes exist mostly to satisfy some US military
> regulations which call for the existence of several distinct "security
> levels", regardless of whether breaking the lowest level is already far
> beyond existing technology.

After researching AES key sizes a bit more, my vote is (surprisingly?) for AES-128. My reasoning is about security; I did not consider performance impacts in my decision.

The purpose of longer keys over shorter keys is to ensure brute-force attacks are prohibitively expensive. Finding a flaw in the algorithm is the goal of cryptanalysis; brute-force attacks are only advanced by increasing computing power.

"The security of a symmetric cryptosystem is a function of two things: the strength of the algorithm and the length of the key. The former is more important..." [1] (pg 151)

"The algorithm must be so secure that there is no better way to break it than with a brute-force attack." [1] (pg 152)

Finally, a stated recommendation on key size: "Insist on at least 112-bit keys." [1] (pg 153)

Schneier also mentions in that section that breaking an 80-bit key (brute force) is likely not realistic for another 30 years; ETA 2045 based on date published. Brute-forcing a 128-bit key is much further in the future.

With key length covered, on to the strength of the algorithm. The abstract from [2] states:

"In the case of AES-128, there is no known attack which is faster than the 2^128 complexity of exhaustive search. However, AES-192 and AES-256 were recently shown to be breakable by attacks which require 2^176 and 2^119 time, respectively."
This shows that both AES-128 (2^128) and AES-192 (2^176) provide more protection against this attack than AES-256 does (2^119). This may be surprising, because not all AES encryption works the same way, even though it's "all AES." Again from [2]:

"The key schedules of AES-128 and AES-192 are slightly different, since they have to apply more mixing operations to the shorter key in order to produce the slightly smaller number of subkeys for the various rounds. This small difference in the key schedules plays a major role in making AES-256 more vulnerable to our attacks, in spite of its longer key and supposedly higher security."

It appears the extra key mixing required for shorter key lengths is actually a strength of the underlying algorithm, and simplifying the key mixing is bad. They go on to restate this in a more succinct and damning way: "... it clearly indicates that this part of the design of AES-256 is seriously flawed."

Schneier also mentions how small changes can have big impacts on security: "strong cryptosystems, with a couple of minor changes, can become weak." [1] (pg 152)

[1] Schneier, B. (2015). Applied Cryptography: Protocols, Algorithms and Source Code in C (20th Anniversary ed.). John Wiley & Sons.
[2] Biryukov, A., Dunkelman, O., Keller, N., Khovratovich, D., & Shamir, A. (2009). Key Recovery Attacks of Practical Complexity on AES-256 Variants with up to 10 Rounds. Retrieved from https://eprint.iacr.org/2009/374.pdf

Ryan Lambert
RustProof Labs
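Putting the published complexities side by side (simple arithmetic on the figures quoted above; note the 2^119 figure is a related-key attack, not a practical break):

```python
# Best known attack complexities quoted from [2] above.
aes128_attack = 2 ** 128   # exhaustive search; no better attack known
aes192_attack = 2 ** 176   # Biryukov et al. related-key attack
aes256_attack = 2 ** 119   # Biryukov et al. related-key attack

# The published AES-256 attack is cheaper than brute-forcing an
# AES-128 key by this factor:
factor = aes128_attack // aes256_attack   # 2**9
```

So against their best known attacks, AES-256 comes out a factor of 512 weaker than AES-128's exhaustive search, while AES-192 remains comfortably above both.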
Re: FETCH FIRST clause PERCENT option
Surafel,

On Wed, Jul 17, 2019 at 3:45 AM Surafel Temesgen wrote:

> Hi Ryan,
> On Tue, Jul 9, 2019 at 4:13 PM Ryan Lambert wrote:
>
>> "It is possible for FETCH FIRST N PERCENT to create poorly performing
>> query plans when the N supplied exceeds 50 percent. In these cases query
>> execution can take an order of magnitude longer to execute than simply
>> returning the full table. If performance is critical using an explicit row
>> count for limiting is recommended."
>
> I don't understand how fetch first n percent functionality can be replaced
> with explicit row count limiting. There may be a way to do it in a client
> side but we can not be sure of its performance advantage
>
> regards
>
> Surafel

I was suggesting a warning in the documentation so users aren't caught unaware of the performance characteristics. My first version was very rough; how about adding this in doc/src/sgml/ref/select.sgml?

"Using PERCENT is best suited to returning single-digit percentages of the query's total row count."

The following paragraphs in that same section give suggestions and warnings regarding LIMIT and OFFSET usage; I think this is more in line with the wording of those existing warnings.

> Other than that, we can rip the clause if it is 100%

You mean if PERCENT=100 it should short-circuit and run the query normally? I like that. That got me thinking: I didn't check what happens with PERCENT > 100, I'll try to test that soon.

Thanks,

Ryan
Re: Built-in connection pooler
Hi Konstantin,

Thanks for your work on this. I'll try to do more testing in the next few days; here's what I have so far.

make installcheck-world: passed

The v8 patch [1] applies, though I get indent and whitespace errors:

:79: tab in indent.
    "Each proxy launches its own subset of backends. So maximal number of non-tainted backends is "
:80: tab in indent.
    "session_pool_size*connection_proxies*databases*roles.
:519: indent with spaces.
    char buf[CMSG_SPACE(sizeof(sock))];
:520: indent with spaces.
    memset(buf, '\0', sizeof(buf));
:522: indent with spaces.
    /* On Mac OS X, the struct iovec is needed, even if it points to minimal data */
warning: squelched 82 whitespace errors
warning: 87 lines add whitespace errors.

In connpool.sgml:

"but it can be changed to standard Postgres 4321"

Should be 5432?

"As far as pooled backends are not terminated on client exist, it will not be possible to drop database to which them are connected."

Active discussion in [2] might change that; it is also in this July commitfest [3].

"Unlike pgbouncer and other external connection poolera"

Should be "poolers".

"So developers of client applications still have a choice either to avoid using session-specific operations either not to use pooling."

That sentence isn't smooth for me to read. Maybe something like: "Developers of client applications have the choice to either avoid using session-specific operations, or not use built-in pooling."
[1] https://www.postgresql.org/message-id/attachment/102610/builtin_connection_proxy-8.patch
[2] https://www.postgresql.org/message-id/flat/CAP_rwwmLJJbn70vLOZFpxGw3XD7nLB_7%2BNKz46H5EOO2k5H7OQ%40mail.gmail.com
[3] https://commitfest.postgresql.org/23/2055/

Ryan Lambert

On Tue, Jul 16, 2019 at 12:20 AM Konstantin Knizhnik <k.knizh...@postgrespro.ru> wrote:

> On 15.07.2019 17:04, Konstantin Knizhnik wrote:
> >
> > On 14.07.2019 8:03, Thomas Munro wrote:
> >>
> >> On my FreeBSD box (which doesn't have epoll(), so it's latch.c's old
> >> school poll() for now), I see the connection proxy process eating a
> >> lot of CPU and the temperature rising. I see with truss that it's
> >> doing this as fast as it can:
> >>
> >> poll({ 13/POLLIN 17/POLLIN|POLLOUT },2,1000) = 1 (0x1)
> >>
> >> Ouch. I admit that I had the idea to test on FreeBSD because I
> >> noticed the patch introduces EPOLLET and I figured this might have
> >> been tested only on Linux. FWIW the same happens on a Mac.
> >
> > I have committed patch which emulates epoll EPOLLET flag and so should
> > avoid busy loop with poll().
> > I will be pleased if you can check it at FreeBSD box.
>
> Sorry, attached patch was incomplete.
> Please try this version of the patch.
Re: dropdb --force
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

Hi,

The latest patch [1] applies cleanly and basic functionality retested. Looks great, thanks!

[1] https://www.postgresql.org/message-id/attachment/102334/drop-database-force-20190708.patch

Ryan Lambert

The new status of this patch is: Ready for Committer
Re: Built-in connection pooler
> I have fixed all reported issues except one related with "dropdb --force"
> discussion.
> As far as this patch is not yet committed, I can not rely on it yet.
> Certainly I can just remove this sentence from documentation, assuming
> that this patch will be committed soon.
> But then some extra efforts will be needed to terminate pooler backends
> of dropped database.

Great, thanks. Understood about the non-committed item. I did mark that item as Ready for Committer last night, so we will see. I should have time to put the actual functionality of your patch to the test later today or tomorrow.

Thanks,

Ryan Lambert
Re: Built-in connection pooler
Here's what I found tonight in your latest patch (9). The output from git apply is better, fewer whitespace errors, but still getting the following. Ubuntu 18.04 if that helps.

git apply -p1 < builtin_connection_proxy-9.patch

:79: tab in indent.
    Each proxy launches its own subset of backends.
:634: indent with spaces.
    union {
:635: indent with spaces.
        struct sockaddr_in inaddr;
:636: indent with spaces.
        struct sockaddr addr;
:637: indent with spaces.
    } a;
warning: squelched 54 whitespace errors
warning: 59 lines add whitespace errors.

A few more minor edits.

In config.sgml:

"If the max_sessions limit is reached new connection are not accepted."

Should be "connections".

"The default value is 10, so up to 10 backends will server each database,"

"server" should be "serve" and the sentence should end with a period instead of a comma.

In postmaster.c:

/* The socket number we are listening for poolled connections on */

"poolled" --> "pooled"

"(errmsg("could not create listen socket for locahost")));"

"locahost" -> "localhost".

"* so to support order balancing we should do dome smart work here."

"dome" should be "some"?

I don't see any tests covering this new functionality. It seems significant enough functionality to warrant some sort of tests, but I don't know exactly what those would/should be.

Thanks,

Ryan