Re: [HACKERS] Block level parallel vacuum
On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada wrote:
>
> On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar wrote:
> >
> > On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada wrote:
> > >
> > > For more detail of my idea: the first worker who enters
> > > vacuum_delay_point adds its local value to the shared value and resets
> > > the local value to 0. Then the worker sleeps if the shared value exceeds
> > > VacuumCostLimit, but before sleeping it subtracts VacuumCostLimit
> > > from the shared value. Since vacuum_delay_point is typically called
> > > once per page processed, I expect there will not be such a problem.
> > > Thoughts?
> >
> > Oh right, I assumed that you add the local balance to the shared value
> > only when the local balance exceeds VacuumCostLimit, but you are adding
> > it to the shared value every time in vacuum_delay_point. So I think your
> > idea is correct.
>
> I've attached the updated patch set.
>
> The first three patches add new variables and a callback to the index AM.
>
> The next two patches are the main part to support parallel vacuum. I've
> incorporated all review comments I got so far. The memory layout of the
> variable-length index statistics might be a bit complex. It's similar to
> the format of the heap tuple header, having a null bitmap, and both the
> size of the index statistics and the actual data for each index follow.
>
> The last patch is a PoC patch that implements the shared vacuum cost
> balance. For now it's separate, but after testing both approaches it will
> be merged into the 0004 patch. I'll test both next week.
>
> This patch set can be applied on top of the patch[1] that improves gist
> index bulk-deletion, so canparallelvacuum of the gist index is true.
>
> [1] https://www.postgresql.org/message-id/CAFiTN-uQY%2BB%2BCLb8W3YYdb7XmB9hyYFXkAy3C7RY%3D-YSWRV1DA%40mail.gmail.com

I haven't yet read the new set of the patches, but I have noticed one
thing: we are getting the size of the statistics using the AM routine, but
we are copying those statistics from local memory to shared memory directly
using memcpy. Wouldn't it be a good idea to have an AM-specific routine to
copy them from local memory to shared memory? I am not sure whether it is
worth it, but my thought behind this point is that it would allow the AM to
keep its local stats in any form (for example, it could store a pointer in
them) and serialize that while copying to the shared stats. Later, when the
shared stats are passed back to the AM, it could deserialize them into its
local form and use them.

+			 * Since all vacuum workers write the bulk-deletion result at
+			 * different slots we can write them without locking.
+			 */
+			if (!shared_indstats->updated && stats[idx] != NULL)
+			{
+				memcpy(bulkdelete_res, stats[idx], shared_indstats->size);
+				shared_indstats->updated = true;
+
+				/*
+				 * no longer need the locally allocated result and now
+				 * stats[idx] points to the DSM segment.
+				 */
+				pfree(stats[idx]);
+				stats[idx] = bulkdelete_res;
+			}

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
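To make the shared-balance accounting described above concrete, here is a
minimal standalone sketch. It is not the patch's code: the GUC values, the
names shared_balance and vacuum_delay_point_sketch, and the use of C11
atomics in place of PostgreSQL's pg_atomic API and pg_usleep() are all
illustrative assumptions.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Stand-ins for the vacuum cost GUCs; values are illustrative only. */
static const int vacuum_cost_limit = 200;
static const int vacuum_cost_delay_ms = 10;

/* Shared across workers (would live in DSM in the real patch). */
static _Atomic int shared_balance = 0;

/*
 * Called once per page, like vacuum_delay_point(): fold the local balance
 * into the shared one, and sleep once the shared balance exceeds the limit,
 * subtracting the limit before napping so the other workers keep running.
 */
static void
vacuum_delay_point_sketch(int *local_balance)
{
	int			total;

	total = atomic_fetch_add(&shared_balance, *local_balance) + *local_balance;
	*local_balance = 0;

	if (total >= vacuum_cost_limit)
	{
		atomic_fetch_sub(&shared_balance, vacuum_cost_limit);
		printf("worker naps for %d ms\n", vacuum_cost_delay_ms); /* pg_usleep() in reality */
	}
}

int
main(void)
{
	int			local_balance = 0;

	for (int page = 0; page < 100; page++)
	{
		local_balance += 10;	/* cost charged for this page */
		vacuum_delay_point_sketch(&local_balance);
	}
	return 0;
}
```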
RE: Proposal: Add more compile-time asserts to expose inconsistencies.
From: Andres Freund
Sent: Sunday, 27 October 2019 12:03 PM

>> Ideally, the implementation should end up calling _Static_assert()
>> somehow, so that we get the compiler's native error message.

OK. I can work on that.

>> We could do a configure check for whether _Static_assert() works at
>> file scope. I don't know what the support for that is, but it seems to
>> work in gcc and clang

> I think it should work everywhere that has static assert. So we should need a
> separate configure check.

Er, that's a typo right? I think you meant: "So we *shouldn't* need a
separate configure check"

Kind Regards
---
Peter Smith
Fujitsu Australia
Re: Proposal: Add more compile-time asserts to expose inconsistencies.
On 2019-10-27 11:44:54 +, Smith, Peter wrote:
> From: Andres Freund Sent: Sunday, 27 October 2019 12:03 PM
> > I think it should work everywhere that has static assert. So we should need
> > a separate configure check.
>
> Er, that's a typo right? I think you meant: "So we *shouldn't* need a
> separate configure check"

Yes.
Re: Binary support for pgoutput plugin
> On Mon, Jun 17, 2019 at 10:29:26AM -0400, Dave Cramer wrote:
> > Which is what I have done. Thanks
> >
> > I've attached both patches for comments.
> > I still have to add documentation.
>
> Additional patch for documentation.

Thank you for the patch! Unfortunately 0002 has some conflicts, could you
please send a rebased version? In the meantime I have a few questions:

-LogicalRepRelId
+void
 logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
 {
 	char		action;
-	LogicalRepRelId relid;
-
-	/* read the relation id */
-	relid = pq_getmsgint(in, 4);

 	action = pq_getmsgbyte(in);
 	if (action != 'N')
@@ -175,7 +173,6 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)

 	logicalrep_read_tuple(in, newtup);
-
-	return relid;
 }

-	relid = logicalrep_read_insert(s, &newtup);
+	/* read the relation id */
+	relid = pq_getmsgint(s, 4);
 	rel = logicalrep_rel_open(relid, RowExclusiveLock);
+
+	logicalrep_read_insert(s, &newtup);

Maybe I'm missing something, but what is the reason for moving pq_getmsgint
out of logicalrep_read_*? Just from the patch it seems that the code is
equivalent.

> There is one obvious hack where in binary mode I reset the input
> cursor to allow the binary input to be re-read. From what I can tell
> the alternative is to convert the data in logicalrep_read_tuple but
> that would require moving a lot of the logic currently in worker.c to
> proto.c. This seems minimally invasive.

Which logic has to be moved, for example? Actually it sounds more natural to
me if this functionality were in e.g. logicalrep_read_tuple rather than
slot_store_data, since it has something to do with reading data. And it
seems that in pglogical it's also located in pglogical_read_tuple.
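As background for the question of where the binary handling could live, a
rough sketch of what per-column text/binary reading inside a
logicalrep_read_tuple-style function might look like. This is not the
proposed patch: SketchTupleData is a made-up stand-in for
LogicalRepTupleData, and the kind bytes ('n', 'u', 't', plus a hypothetical
'b' for binary) are an assumption about the wire format.

```c
#include "postgres.h"

#include "access/htup_details.h"
#include "lib/stringinfo.h"
#include "libpq/pqformat.h"

/* Simplified stand-in for LogicalRepTupleData; not the real struct. */
typedef struct SketchTupleData
{
	StringInfoData colvalues[MaxTupleAttributeNumber]; /* raw column data */
	char		colstatus[MaxTupleAttributeNumber];		/* 'n', 'u', 't' or 'b' */
	int			natts;
} SketchTupleData;

/*
 * Read one tuple, remembering per column whether the sender used the text
 * ('t') or binary ('b') representation, so the apply side can later choose
 * between the type's input and receive functions without re-reading the
 * message.
 */
static void
sketch_read_tuple(StringInfo in, SketchTupleData *tup)
{
	int			natts = pq_getmsgint(in, 2);

	tup->natts = natts;
	for (int i = 0; i < natts; i++)
	{
		char		kind = pq_getmsgbyte(in);

		tup->colstatus[i] = kind;
		if (kind == 't' || kind == 'b')
		{
			int			len = pq_getmsgint(in, 4);

			initStringInfo(&tup->colvalues[i]);
			appendBinaryStringInfo(&tup->colvalues[i],
								   pq_getmsgbytes(in, len), len);
		}
		/* 'n' (null) and 'u' (unchanged TOAST) carry no payload */
	}
}
```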
Re: Proposition to use '==' as synonym for 'IS NOT DISTINCT FROM'
I wrote:
> To clarify, what I have in mind here doesn't have any effect whatever
> on the parse tree or the execution semantics, it's just about offering
> an alternative SQL textual representation.

Continuing this thread ... if we were just trying to fix the dump/restore
issue without regard for verbosity, I think I'd propose that we implement
syntaxes like

	x IS DISTINCT FROM y
	x IS DISTINCT (=) FROM y
	x IS DISTINCT (schema.=) FROM y
	x IS NOT DISTINCT FROM y
	x IS NOT DISTINCT (=) FROM y
	x IS NOT DISTINCT (schema.=) FROM y

with the understanding that the parenthesized operator name is what to use
for the underlying equality comparison, and that in the absence of any name,
the parser looks up "=" (which is what it does today). Thus the first two
alternatives are precisely equivalent, as are the fourth and fifth.

Also, to support row-wise comparisons, we could allow cases like

	ROW(a,b) IS NOT DISTINCT (schema1.=, schema2.=) FROM ROW(x,y)

ruleutils.c could proceed by looking up the operator(s) normally, and
skipping the verbose syntax when they would print as just "=", so that we
don't need to emit nonstandard SQL for common cases.

I haven't actually checked to ensure that Bison can handle this, but since
DISTINCT and FROM are both fully reserved words, it seems virtually certain
that this would work without syntactic ambiguity. It also seems relatively
understandable as to what it means.

But of course, this is the exact opposite of addressing Eugen's concern
about verbosity :-(. Can we pack the same functionality into fewer
characters?

			regards, tom lane
Re: [Proposal] Arbitrary queries in postgres_fdw
On Fri, Oct 25, 2019 at 05:17:18PM +0200, rto...@carto.com wrote:
> Dear all,
>
> We stumbled upon a few cases in which retrieving information from the
> foreign server may turn pretty useful before creating any foreign
> table, especially info related to the catalog. E.g: a list of schemas
> or tables the user has access to.
>
> I thought of using dblink for it, but that requires duplication of
> server and user mapping details and it adds its own management of
> connections.
>
> Then I thought a better approach may be a mix of both: a function to
> issue arbitrary queries to the foreign server reusing all the details
> encapsulated in the server and user mapping. It would use the same
> pool of connections.

There's a SQL/MED standard feature, CREATE ROUTINE MAPPING, that does
something similar to this. Might it be possible to incorporate it into the
previous patch that implemented that feature?

Best,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
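For illustration, a minimal sketch of the "reuse the server and user
mapping" idea using only public backend APIs. This is not the proposed
function: connect_via_server_sketch is a made-up name, error handling is
minimal, and it assumes the stored options are all libpq connection
keywords. A real implementation, like postgres_fdw's connection handling,
would also filter out non-libpq options and apply password checks for
non-superusers.

```c
#include "postgres.h"

#include "commands/defrem.h"
#include "foreign/foreign.h"
#include "libpq-fe.h"
#include "miscadmin.h"
#include "nodes/pg_list.h"

/*
 * Build a libpq connection from an existing foreign server plus the calling
 * user's user mapping, so that no connection details need to be duplicated.
 */
static PGconn *
connect_via_server_sketch(const char *servername)
{
	ForeignServer *server = GetForeignServerByName(servername, false);
	UserMapping *um = GetUserMapping(GetUserId(), server->serverid);
	List	   *options = list_concat(list_copy(server->options),
									  list_copy(um->options));
	const char **keywords;
	const char **values;
	int			n = 0;
	ListCell   *lc;
	PGconn	   *conn;

	keywords = palloc((list_length(options) + 1) * sizeof(char *));
	values = palloc((list_length(options) + 1) * sizeof(char *));

	foreach(lc, options)
	{
		DefElem    *def = (DefElem *) lfirst(lc);

		/* assumes option names are libpq keywords (host, port, dbname, ...) */
		keywords[n] = def->defname;
		values[n] = defGetString(def);
		n++;
	}
	keywords[n] = values[n] = NULL;

	conn = PQconnectdbParams(keywords, values, false);
	if (PQstatus(conn) != CONNECTION_OK)
	{
		char	   *msg = pstrdup(PQerrorMessage(conn));

		PQfinish(conn);
		ereport(ERROR,
				(errmsg("could not connect to foreign server \"%s\": %s",
						servername, msg)));
	}
	return conn;
}
```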
Selecting fields from a RowExpr
At pgconf.eu, someone whose name I've forgotten pointed out to me that this doesn't work: regression=# select (row(1, 2.0)).f1; ERROR: could not identify column "f1" in record data type LINE 1: select (row(1, 2.0)).f1; ^ The fields of an anonymous rowtype are certainly named f1, f2, etc, so it seems like this *should* work. A related case is regression=# select (row(1, 2.0)).*; ERROR: record type has not been registered Admittedly, these probably aren't terribly useful cases in practice, but it's unfortunate that they don't work as one would expect. So I propose the attached patch to make them work. The underlying reason for both of these failures is that RowExpr doesn't carry a typmod, so if it's of type RECORD then get_expr_result_type doesn't know how to find a tupdesc for it. The minimum-code solution is to teach get_expr_result_type to build a tupdesc directly from the RowExpr, and that seems to be necessary for complicated cases like select (r).f1 from (select row(1, 2.0) as r) ss; In an earlier version of the patch I chose to add in some fast-path logic in ParseComplexProjection and ExpandRowReference, so as to make the really simple cases shown above a bit less inefficient. But on second thought, these are such corner cases that it doesn't seem worth carrying extra code for them. The cases that are more likely to arise in practice are like that last example, and we can't optimize that in the parser. (The planner will optimize FieldSelect-from-RowExpr after flattening subqueries, which is probably as much as we really need to do here.) I don't feel a need to back-patch this, but I would like to push it into HEAD. Thoughts? regards, tom lane diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c index b7fac5d..4688fbc 100644 --- a/src/backend/utils/fmgr/funcapi.c +++ b/src/backend/utils/fmgr/funcapi.c @@ -227,6 +227,38 @@ get_expr_result_type(Node *expr, NULL, resultTypeId, resultTupleDesc); + else if (expr && IsA(expr, RowExpr) && + ((RowExpr *) expr)->row_typeid == RECORDOID) + { + /* We can resolve the record type by generating the tupdesc directly */ + RowExpr*rexpr = (RowExpr *) expr; + TupleDesc tupdesc; + AttrNumber i = 1; + ListCell *lcc, + *lcn; + + tupdesc = CreateTemplateTupleDesc(list_length(rexpr->args)); + Assert(list_length(rexpr->args) == list_length(rexpr->colnames)); + forboth(lcc, rexpr->args, lcn, rexpr->colnames) + { + Node *col = (Node *) lfirst(lcc); + char *colname = strVal(lfirst(lcn)); + + TupleDescInitEntry(tupdesc, i, + colname, + exprType(col), + exprTypmod(col), + 0); + TupleDescInitEntryCollation(tupdesc, i, + exprCollation(col)); + i++; + } + if (resultTypeId) + *resultTypeId = rexpr->row_typeid; + if (resultTupleDesc) + *resultTupleDesc = BlessTupleDesc(tupdesc); + return TYPEFUNC_COMPOSITE; + } else { /* handle as a generic expression; no chance to resolve RECORD */ diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out index a272305..2a273f8 100644 --- a/src/test/regress/expected/rowtypes.out +++ b/src/test/regress/expected/rowtypes.out @@ -436,6 +436,45 @@ where i8 in (row(123,456)::int8_tbl, '(4567890123456789,123)'); 4567890123456789 | 123 (2 rows) +-- Check ability to select columns from an anonymous rowtype +select (row(1, 2.0)).f1; + f1 + + 1 +(1 row) + +select (row(1, 2.0)).f2; + f2 +- + 2.0 +(1 row) + +select (row(1, 2.0)).nosuch; -- fail +ERROR: could not identify column "nosuch" in record data type +LINE 1: select (row(1, 2.0)).nosuch; +^ +select (row(1, 2.0)).*; + f1 | f2 ++- 
+ 1 | 2.0 +(1 row) + +select (r).f1 from (select row(1, 2.0) as r) ss; + f1 + + 1 +(1 row) + +select (r).f3 from (select row(1, 2.0) as r) ss; -- fail +ERROR: could not identify column "f3" in record data type +LINE 1: select (r).f3 from (select row(1, 2.0) as r) ss; +^ +select (r).* from (select row(1, 2.0) as r) ss; + f1 | f2 ++- + 1 | 2.0 +(1 row) + -- Check some corner cases involving empty rowtypes select ROW(); row diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql index 7e080c0..83cf4a1 100644 --- a/src/test/regress/sql/rowtypes.sql +++ b/src/test/regress/sql/rowtypes.sql @@ -171,6 +171,15 @@ where i8 in (row(123,456)::int8_tbl, '(4567890123456789,123)'); select * from int8_tbl i8 where i8 in (row(123,456)::int8_tbl, '(4567890123456789,123)'); +-- Check ability to select columns from an anonymous rowtype +select (row(1, 2.0)).f1; +select (row(1, 2.0)).f2; +select (row(1, 2.0)).nosuch; -- fail +select (row(1, 2.0)).*; +select (r).f1 from (select row(1, 2.0) as r) ss; +select (r).f3 from (select row(1, 2.0) as r) ss; -- fail +select (r).* from (select row(1, 2.0) as r) ss; + -- Check some corner cases
Re: Selecting fields from a RowExpr
Hi ne 27. 10. 2019 v 19:47 odesílatel Tom Lane napsal: > At pgconf.eu, someone whose name I've forgotten pointed out to me > that this doesn't work: > > regression=# select (row(1, 2.0)).f1; > ERROR: could not identify column "f1" in record data type > LINE 1: select (row(1, 2.0)).f1; > ^ > > The fields of an anonymous rowtype are certainly named f1, f2, etc, > so it seems like this *should* work. A related case is > > regression=# select (row(1, 2.0)).*; > ERROR: record type has not been registered > > Admittedly, these probably aren't terribly useful cases in practice, > but it's unfortunate that they don't work as one would expect. > So I propose the attached patch to make them work. > > The underlying reason for both of these failures is that RowExpr > doesn't carry a typmod, so if it's of type RECORD then > get_expr_result_type doesn't know how to find a tupdesc for it. > The minimum-code solution is to teach get_expr_result_type to build > a tupdesc directly from the RowExpr, and that seems to be necessary > for complicated cases like > > select (r).f1 from (select row(1, 2.0) as r) ss; > > In an earlier version of the patch I chose to add in some fast-path > logic in ParseComplexProjection and ExpandRowReference, so as to > make the really simple cases shown above a bit less inefficient. > But on second thought, these are such corner cases that it doesn't > seem worth carrying extra code for them. The cases that are more > likely to arise in practice are like that last example, and we > can't optimize that in the parser. (The planner will optimize > FieldSelect-from-RowExpr after flattening subqueries, which is > probably as much as we really need to do here.) > > I don't feel a need to back-patch this, but I would like to push > it into HEAD. > some times I hit this limit, an can be nice more consistent behave of composite types. It's new feature - and there is not a reason for back-patching Regards Pavel > > Thoughts? > > regards, tom lane > >
Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock(PG10.7)
Hi, Peter!

On Fri, Oct 4, 2019 at 12:05 AM Alexander Korotkov wrote:
> I'm planning to continue work on README, comments and commit messages.

It took me a lot of effort to revise the concurrency section in the README.
I can't judge the result, but I probably did my best. I'd like to commit
this patchset, including both the bug fixes and the README update, but I'd
like you to take a look at the README patch first.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

0001-Fix-deadlock-between-ginDeletePage-and-ginStepRigh-3.patch
Description: Binary data

0002-Fix-traversing-to-the-deleted-GIN-page-via-downlin-3.patch
Description: Binary data

0003-Revise-concurrency-section-in-GIN-README-3.patch
Description: Binary data
update ALTER TABLE with ATTACH PARTITION lock mode
commit #898e5e32 (Allow ATTACH PARTITION with only ShareUpdateExclusiveLock) updates ddl.sgml but not alter_table.sgml, which only says: https://www.postgresql.org/docs/12/release-12.html |An ACCESS EXCLUSIVE lock is held unless explicitly noted. Find attached patch, which also improve language in several related places. "Without such a constraint": SUCH could refer to either of the constraints.. "because it is no longer necessary.": In our use case, we prefer to keep the redundant constraint, to avoid having to add it back if we detach/reattach again in the future.. >From c820a81fba0a6c2388ec58fc0204ca833523e81e Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sun, 27 Oct 2019 18:54:24 -0500 Subject: [PATCH v1 1/2] Mention reduced locking strength of ATTACH PARTITION.. See commit 898e5e32 --- doc/src/sgml/ref/alter_table.sgml | 7 +++ 1 file changed, 7 insertions(+) diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index c8dfa19..a184bed 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -900,6 +900,13 @@ WITH ( MODULUS numeric_literal, REM the scan of the new partition, it is always skipped when the default partition is a foreign table. + + + Attaching a partition acquires a SHARE UPDATE EXCLUSIVE + lock on the partitioned table, in addition to an + ACCESS EXCLUSIVE lock on the partition. + + -- 2.7.4 >From b1c9c50228ebd3d2d511382ebd6cbae08788e376 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sun, 27 Oct 2019 18:38:23 -0500 Subject: [PATCH v1 2/2] Tweak language for ATTACH PARTITION docs --- doc/src/sgml/ddl.sgml | 8 doc/src/sgml/ref/alter_table.sgml | 12 ++-- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index b097b80..8b60d8b 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -3972,14 +3972,14 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02 Before running the ATTACH PARTITION command, it is recommended to create a CHECK constraint on the table to - be attached describing the desired partition constraint. That way, + be attached matching the desired partition constraint. That way, the system will be able to skip the scan which is otherwise needed to validate the implicit - partition constraint. Without such a constraint, the table will be + partition constraint. Without the CHECK constraint, the table will be scanned to validate the partition constraint while holding an ACCESS EXCLUSIVE lock on that partition and a SHARE UPDATE EXCLUSIVE lock on the parent table. - One may then drop the constraint after ATTACH PARTITION - is finished, because it is no longer necessary. + It may be desired to drop the redundant CHECK constraint + after ATTACH PARTITION is finished. diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index a184bed..91ec626 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -841,7 +841,7 @@ WITH ( MODULUS numeric_literal, REM or as a default partition by using DEFAULT. For each index in the target table, a corresponding one will be created in the attached table; or, if an equivalent - index already exists, will be attached to the target table's index, + index already exists, it will be attached to the target table's index, as if ALTER INDEX ATTACH PARTITION had been executed. 
Note that if the existing table is a foreign table, it is currently not allowed to attach the table as a partition of the target table if there @@ -864,17 +864,17 @@ WITH ( MODULUS numeric_literal, REM already exist. If any of the CHECK constraints of the table being attached are marked NO INHERIT, the command will fail; - such a constraint must be recreated without the NO INHERIT + such constraints must be recreated without the NO INHERIT clause. If the new partition is a regular table, a full table scan is performed - to check that no existing row in the table violates the partition + to check that existing rows in the table do not violate the partition constraint. It is possible to avoid this scan by adding a valid - CHECK constraint to the table that would allow only - the rows satisfying the desired partition constraint before running this - command. It will be determined using such a constraint that the table + CHECK constraint to the table that allows only + rows satisfying the desired partition constraint before running this + command. The CHECK constraint will be used to determine that the table need not be scanned to validate the partition constraint. This does not work, however, if any of the part
RE: Proposal: Add more compile-time asserts to expose inconsistencies.
From: Andres Freund
Sent: Sunday, 27 October 2019 12:03 PM

> My proposal for this really was just to use this as a fallback for when
> static assert isn't available. Which in turn I was just suggesting because
> Tom wanted a fallback.

The patch is updated to use the "extern" technique only when
"_Static_assert" is unavailable. PSA.

Kind Regards,
---
Peter Smith
Fujitsu Australia

ct_asserts_StaticAssertDecl_3.patch
Description: ct_asserts_StaticAssertDecl_3.patch
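For readers following along, a self-contained illustration of the two
techniques under discussion: the native _Static_assert when the compiler has
it, and the "extern array with negative size" fallback otherwise. The macro
and identifier names here are placeholders, not the ones used in the patch.

```c
/* Native form: usable at file scope, gives the compiler's own message. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define MY_STATIC_ASSERT_DECL(cond, msg) _Static_assert(cond, msg)
#else
/*
 * Fallback for compilers without _Static_assert: declare an extern array
 * whose size is negative when the condition is false, forcing a compile
 * error.  The declaration emits no code and no storage.
 */
#define MY_STATIC_ASSERT_DECL(cond, msg) \
	extern char my_static_assert_failure[(cond) ? 1 : -1]
#endif

enum Fruit {APPLE, BANANA, CHERRY, N_FRUITS};
static const char *fruit_names[] = {"apple", "banana", "cherry"};

/* File-scope check that the lookup table stays in sync with the enum. */
MY_STATIC_ASSERT_DECL(sizeof(fruit_names) / sizeof(fruit_names[0]) == N_FRUITS,
					  "fruit_names[] is out of sync with enum Fruit");

int
main(void)
{
	return 0;
}
```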
Re: PL/Python fails on new NetBSD/PPC 8.0 install
Awhile back I wrote: > I noticed that the old NetBSD 5.1.5 installation I had on my G4 Mac > was no longer passing our regression tests, because it has a strtof() > that is sloppy about underflow. Rather than fight with that I decided > to update it to something shinier (well, as shiny as you can get on > hardware that's old enough to apply for a driver's license). I stuck in > NetBSD/macppc 8.0, and things seem to work, except that PL/Python > crashes on launch. I see something like this in the postmaster log: > Traceback (most recent call last): > File "", line 1162, in > _install_external_importers > File "", line 980, in _find_and_load > File "", line 149, in __enter__ > File "", line 84, in acquire > RuntimeError: no current thread ident > Fatal Python error: initexternalimport: external importer setup failed > > Current thread 0x (most recent call first): > 2019-06-18 17:40:59.629 EDT [20764] LOG: server process (PID 23714) was > terminated by signal 6: Abort trap > 2019-06-18 17:40:59.629 EDT [20764] DETAIL: Failed process was running: > CREATE FUNCTION stupid() RETURNS text AS 'return "zarkon"' LANGUAGE > plpython3u; So ... I just got this identical failure on NetBSD 8.1 on a shiny new Intel NUC box. So that removes the excuse of old unsupported hardware, and leaves us with the conclusion that PL/Python is flat out broken on recent NetBSD. This is with today's HEAD of our code and the python37-3.7.4/amd64 package from NetBSD 8.1. BTW, the only somewhat-modern NetBSD machine in our buildfarm is coypu, which is running NetBSD/macppc 8.0 ... but what it is testing PL/Python against is python 2.7.15, so the fact that it doesn't fail can probably be explained as a python 2 vs python 3 thing. regards, tom lane
Re: PL/Python fails on new NetBSD/PPC 8.0 install
None of the output provides any clue to me but I do know that Python 3.7 has some issues with a lot of versions of openssl that is based on a disagreement between devs in both projects. This was a problem for me when trying to build python 3.7 on my Kubuntu 14.04 system. I've seen this issue reported across all targets for Python including Freebsd so I expect it's likely to also happen for NetBSD. Perhaps this might be related to the problem? On Mon, Oct 28, 2019, 8:12 AM Tom Lane wrote: > Awhile back I wrote: > > I noticed that the old NetBSD 5.1.5 installation I had on my G4 Mac > > was no longer passing our regression tests, because it has a strtof() > > that is sloppy about underflow. Rather than fight with that I decided > > to update it to something shinier (well, as shiny as you can get on > > hardware that's old enough to apply for a driver's license). I stuck in > > NetBSD/macppc 8.0, and things seem to work, except that PL/Python > > crashes on launch. I see something like this in the postmaster log: > > > Traceback (most recent call last): > > File "", line 1162, in > _install_external_importers > > File "", line 980, in _find_and_load > > File "", line 149, in __enter__ > > File "", line 84, in acquire > > RuntimeError: no current thread ident > > Fatal Python error: initexternalimport: external importer setup failed > > > > Current thread 0x (most recent call first): > > 2019-06-18 17:40:59.629 EDT [20764] LOG: server process (PID 23714) was > terminated by signal 6: Abort trap > > 2019-06-18 17:40:59.629 EDT [20764] DETAIL: Failed process was running: > CREATE FUNCTION stupid() RETURNS text AS 'return "zarkon"' LANGUAGE > plpython3u; > > So ... I just got this identical failure on NetBSD 8.1 on a shiny > new Intel NUC box. So that removes the excuse of old unsupported > hardware, and leaves us with the conclusion that PL/Python is > flat out broken on recent NetBSD. > > This is with today's HEAD of our code and the python37-3.7.4/amd64 > package from NetBSD 8.1. > > BTW, the only somewhat-modern NetBSD machine in our buildfarm is > coypu, which is running NetBSD/macppc 8.0 ... but what it is testing > PL/Python against is python 2.7.15, so the fact that it doesn't > fail can probably be explained as a python 2 vs python 3 thing. > > regards, tom lane > > >
Re: CREATE TEXT SEARCH DICTIONARY segfaulting on 9.6+
Hello Tomas,

On 2019/10/13 10:26, Tomas Vondra wrote:
> over in pgsql-bugs [1] we got a report about CREATE TEXT SEARCH DICTIONARY
> causing segfaults on 12.0. Simply running
>
>   CREATE TEXT SEARCH DICTIONARY hunspell_num (Template=ispell,
>     DictFile=hunspell_sample_num, AffFile=hunspell_sample_long);
>
> does trigger a crash, 100% of the time. The crash was reported on 12.0,
> but it's in fact present since 9.6. On 9.5 the example does not work,
> because that version does not (a) include the hunspell dictionaries used
> in the example, and (b) it does not support long flags. So even after
> copying the dictionaries and tweaking them a bit it still passes without
> a crash.

This crash is not because of long flags, but because of aliases (more
thoughts below).

> Looking at the commit history of spell.c, there seems to be a bunch of
> commits in 2016 (e.g. f4ceed6ceba3) touching exactly this part of the code
> (hunspell), and it also correlates quite nicely with the affected branches
> (9.6+). So my best guess is it's a bug in those changes.

Yeah, there were a lot of changes.

> So it simply grabs whatever it finds in the dict file, parses it and then
> (later) we use it as index to access the AffixData array, even if the
> value is way out of bounds.

Yes, we enter this code if an affix file defines aliases (the AF parameter).
The AffixData array is used to store those aliases. More about the hunspell
format can be found here:
https://linux.die.net/man/4/hunspell

In the example we have the following aliases:

AF 11
AF cZ #1
AF cL #2
...
AF sB #11

And in the dictionary file we should use their indexes (from 1 to 11).
These aliases define sets of flags, and in the dict file we can use only a
single index:

book/3
book/11

but not:

book/3,4
book/2,11

I added checking of this last case in the attached patch. PostgreSQL will
raise an error if it sees a non-numeric and non-whitespace character after
the index.

> Aliases can be used with all flag types: 'default' (i.e. FM_CHAR), 'long',
> and if I'm not mistaken 'num'. So I think we need some sort of cross-check
> here. We certainly need to make NISortDictionary() check the affix value
> is within AffixData bounds, and error out when the index is non-sensical
> (maybe negative and/or exceeding nAffixData).

I agree; I attached a patch which does this. I also added a couple of
asserts and tests, and fixed the condition in getAffixFlagSet():

-   if (curaffix > 0 && curaffix <= Conf->nAffixData)
+   if (curaffix > 0 && curaffix < Conf->nAffixData)

I think it could be a bug, because curaffix can't be equal to
Conf->nAffixData.

> Maybe there's a simple way to check if the affix/dict files match.

I'm not sure how to properly fix this either. The only thing we could check
is commas in affix flags in a dict file:

book/302,301,202,303

FM_CHAR and FM_LONG dictionaries can't have commas. They should have the
following affix flags:

book/sGsJpUsS # 4 affixes for FM_LONG
book/GJUS # 4 affixes for FM_CHAR

But I guess they could have numbers in flags (as the help says "Set flag
type. Default type is the extended ASCII (8-bit) character.") and other
non-alphanumeric characters (as some language dictionaries have):

book/s1s2s3s4 # 4 affixes for FM_LONG

--
Artur

diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index c715f06b8e..1b8766659c 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -458,6 +458,8 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
 	if (*affixflag == 0)
 		return true;

+	Assert(affix < Conf->nAffixData);
+
 	flagcur = Conf->AffixData[affix];

 	while (*flagcur)
@@ -1160,13 +1162,17 @@ getAffixFlagSet(IspellDict *Conf, char *s)
 					(errcode(ERRCODE_CONFIG_FILE_ERROR),
 					 errmsg("invalid affix alias \"%s\"", s)));

-		if (curaffix > 0 && curaffix <= Conf->nAffixData)
+		if (curaffix > 0 && curaffix < Conf->nAffixData)

 			/*
 			 * Do not subtract 1 from curaffix because empty string was added
 			 * in NIImportOOAffixes
 			 */
 			return Conf->AffixData[curaffix];
+		else if (curaffix > Conf->nAffixData)
+			ereport(ERROR,
+					(errcode(ERRCODE_CONFIG_FILE_ERROR),
+					 errmsg("invalid affix alias \"%s\"", s)));
 		else
 			return VoidString;
 	}
@@ -1561,6 +1567,8 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
 {
 	char	   **ptr;

+	Assert(a1 < Conf->nAffixData && a2 < Conf->nAffixData);
+
 	/* Do not merge affix flags if one of affix flags is empty */
 	if (*Conf->AffixData[a1] == '\0')
 		return a2;
@@ -1603,9 +1611,10 @@ MergeA
RE: Proposal: Add more compile-time asserts to expose inconsistencies.
From: Kyotaro Horiguchi
Sent: Monday, 28 October 2019 1:26 PM

> It is missing the __cplusplus case?

My use cases for the macro are only in C code, so that's all I was
interested in at this time. If somebody else wants to extend the patch for
C++ also (and test it) they can do.

Kind Regards,
---
Peter Smith
Fujitsu Australia
Allow cluster_name in log_line_prefix
Hi folks

I was recently surprised to notice that log_line_prefix doesn't support a
cluster_name placeholder. I suggest adding one. If I don't hear objections
I'll send a patch.

Before anyone asks "but why?!":

* A constant (short) string in log_line_prefix is immensely useful when
  working with logs from multi-node systems. Whether that's physical
  streaming replication, logical replication, Citus, whatever, it doesn't
  matter. It's worth paying the small storage price for sanity when looking
  at logs.

* Yes you can embed it directly into log_line_prefix. But then it gets
  copied by pg_basebackup or whatever you're using to clone standbys etc,
  so you can easily land up with multiple instances reporting the same
  name. This rather defeats the purpose.

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
Re: Restore replication settings when modifying a field type
Hello.

# The patch no longer applies on the current master. It needs a rebase.

At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang wrote in
> In fact, the replication property of the table has not been modified,
> and it is still 'i' (REPLICA_IDENTITY_INDEX). But the previously
> specified index property 'indisreplident' is set to false because of
> the rebuild.

I suppose that the behavior is intended. A change of column types on the
publisher side can break the agreement on replica identity with
subscribers, so the replica identity setting cannot be restored
unconditionally. For a (somewhat artificial :p) example:

P=# create table t (c1 integer, c2 text unique not null);
P=# alter table t replica identity using index t_c2_key;
P=# create publication p1 for table t;
P=# insert into t values (0, '00'), (1, '01');

S=# create table t (c1 integer, c2 text unique not null);
S=# alter table t replica identity using index t_c2_key;
S=# create subscription s1 connection '...' publication p1;

Your patch allows changing the type of c2 into integer:

P=# alter table t alter column c2 type integer using c2::integer;
P=# update t set c1 = c1 + 1 where c2 = '01';

This change doesn't take effect on the subscriber as perhaps expected:

S=# select * from t;
 c1 | c2
----+----
  0 | 00
  1 | 01
(2 rows)

> So I developed a patch. If the user modifies the field type. The
> associated index is REPLICA IDENTITY. Rebuild and restore replication
> settings.

Explicit setting of replica identity presumes that the user is sure that
the setting works correctly. Implicit rebuilding after a type change can
silently break it. At least we need to guarantee that the restored replica
identity setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers..

Anyway, I think it is a problem that the replica identity setting is
dropped silently. Perhaps a message something like "REPLICA IDENTITY
setting is lost, please redefine after confirmation of compatibility with
subscribers." is needed.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Remove one use of IDENT_USERNAME_MAX
Hello.

At Sat, 26 Oct 2019 08:55:03 +0200, Peter Eisentraut wrote in
> IDENT_USERNAME_MAX is the maximum length of the information returned
> by an ident server, per RFC 1413. Using it as the buffer size in peer
> authentication is inappropriate. It was done here because of the
> historical relationship between peer and ident authentication. But
> since it's also completely useless code-wise, remove it.

I think one of the reasons for the coding is the fact that *pw is
documented to be placed in a static area, which can be overwritten by
succeeding calls to the getpw*() functions. I think we can believe that
check_usermap() never calls them, but I suppose some comment is needed..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [DOC] Fix for the missing pg_stat_progress_cluster view phase column value
At Sat, 26 Oct 2019 05:13:49 +, "Shinoda, Noriyoshi (PN Japan A&PS Delivery)" wrote in
> I found a missing column value in the pg_stat_progress_cluster view document.
> I read the src/backend/catalog/system_views.sql file, there seems to be a
> possibility that 'writing new heap' is output in the 'phase' column.
> The attached patch adds a description of the 'writing new heap' value output
> in the 'phase' column.

Good catch! By the way, the table mentions the phases common to CLUSTER and
VACUUM FULL. I wonder why some of them are described as "CLUSTER is" and
others are "The command is"..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
RE: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY
Hi Dhruv, On Sun, June 30, 2019 at 7:30 AM, Goel, Dhruv wrote: > > On Jun 10, 2019, at 1:20 PM, Goel, Dhruv wrote: > >> On Jun 9, 2019, at 5:33 PM, Tom Lane wrote: > >> Andres Freund writes: > >>> On June 9, 2019 8:36:37 AM PDT, Tom Lane wrote: > I think you are mistaken that doing transactional updates in > pg_index is OK. If memory serves, we rely on xmin of the pg_index > row for purposes such as detecting whether a concurrently-created > index is safe to use yet. > > > > I took a deeper look regarding this use case but was unable to find more > > evidence. As part of this patch, we essentially > make concurrently-created index safe to use only if transaction started after > the xmin of Phase 3. Even today concurrent > indexes can not be used for transactions before this xmin because of the wait > (which I am trying to get rid of in this > patch), is there any other denial of service you are talking about? Both the > other states indislive, indisready can > be transactional updates as far as I understand. Is there anything more I am > missing here? > > > Hi, > > I did some more concurrency testing here through some python scripts which > compare the end state of the concurrently > created indexes. I also back-ported this patch to PG 9.6 and ran some custom > concurrency tests (Inserts, Deletes, and > Create Index Concurrently) which seem to succeed. The intermediate states > unfortunately are not easy to test in an > automated manner, but to be fair concurrent indexes could never be used for > older transactions. Do you have more > inputs/ideas on this patch? According to the commit 3c8404649 [1], transactional update in pg_index is not safe in non-MVCC catalog scans before PG9.4. But it seems to me that we can use transactional update in pg_index after the commit 813fb03155 [2] which got rid of SnapshotNow. If we apply this patch back to 9.3 or earlier, we might need to consider another way or take the Andres suggestion (which I don't understand the way fully though), but which version do you want/do we need to apply this patch? Also, if we apply this patch in this way, there are several comments to be fixed which state the method of CREATE INDEX CONCURRENTLY. ex. [index.c] /* * validate_index - support code for concurrent index builds ... * After completing validate_index(), we wait until all transactions that * were alive at the time of the reference snapshot are gone; this is * necessary to be sure there are none left with a transaction snapshot * older than the reference (and hence possibly able to see tuples we did * not index). Then we mark the index "indisvalid" and commit. Subsequent * transactions will be able to use it for queries. ... valiate_index() { } [1] https://github.com/postgres/postgres/commit/3c84046490bed3c22e0873dc6ba492e02b8b9051#diff-b279fc6d56760ed80ce4178de1401d2c [2] https://github.com/postgres/postgres/commit/813fb0315587d32e3b77af1051a0ef517d187763#diff-b279fc6d56760ed80ce4178de1401d2c -- Yoshikazu Imai
Re: [HACKERS] Block level parallel vacuum
On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar wrote:
>
> On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila wrote:
> >
> > On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar wrote:
> > >
> > > On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar wrote:
> > > >
> > > > On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila wrote:
> > > > >
> > > > > I am thinking if we can write the patch for both the approaches (a.
> > > > > compute shared costs and try to delay based on that, b. try to divide
> > > > > the I/O cost among workers as described in the email above[1]) and do
> > > > > some tests to see the behavior of throttling, that might help us in
> > > > > deciding what is the best strategy to solve this problem, if any.
> > > > > What do you think?
> > > >
> > > > I agree with this idea. I can come up with a POC patch for approach
> > > > (b). Meanwhile, if someone is interested to quickly hack with the
> > > > approach (a) then we can do some testing and compare. Sawada-san,
> > > > by any chance will you be interested to write POC with approach (a)?
> > > > Otherwise, I will try to write it after finishing the first one
> > > > (approach b).
> > >
> > > I have come up with the POC for approach (a).
> >
> > Can we compute the overall throttling (sleep time) in the operation
> > separately for heap and index, then divide the index's sleep_time with
> > a number of workers and add it to heap's sleep time? Then, it will be
> > a bit easier to compare the data between parallel and non-parallel
> > case.

I have come up with a patch to compute the total delay during the vacuum.
The idea of computing the total cost delay is:

	Total cost delay = Total delay of heap scan + Total delay of index / worker

A patch is attached for the same. I have prepared this patch on top of the
latest patch of the parallel vacuum [1].

I have also rebased the patch for approach (b), which divides the vacuum
cost limit, and done some testing for computing the I/O throttling. The
attached patches 0001-POC-compute-total-cost-delay and
0002-POC-divide-vacuum-cost-limit can be applied on top of
v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch. I haven't rebased
on top of v31-0006, because v31-0006 implements the I/O throttling with one
approach and 0002-POC-divide-vacuum-cost-limit does the same with another
approach. But 0001-POC-compute-total-cost-delay can be applied on top of
v31-0006 as well (just a 1-2 line conflict).

Testing: I have performed 2 tests, one with same-size indexes and a second
with different-size indexes, and measured the total I/O delay with the
attached patch.

Setup:
VacuumCostDelay = 10ms
VacuumCostLimit = 2000

Test1 (same-size indexes):
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
insert into test select i, repeat('a',30)||i, repeat('a',20)||i from generate_series(1,50) as i;
delete from test where a < 20;

              Vacuum (Head)   Parallel Vacuum   Vacuum Cost Divide Patch
Total Delay   1784 (ms)       1398 (ms)         1938 (ms)

Test2 (variable-size dead tuples in indexes):
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b) where a > 10;
create index idx3 on test(c) where a > 15;
insert into test select i, repeat('a',30)||i, repeat('a',20)||i from generate_series(1,50) as i;
delete from test where a < 20;

              Vacuum (Head)   Parallel Vacuum   Vacuum Cost Divide Patch
Total Delay   1438 (ms)       1029 (ms)         1529 (ms)

Conclusion:
1. The tests prove that the total I/O delay is significantly less with the
   parallel vacuum.
2. With the vacuum-cost-divide approach the problem is solved, but the
   delay is a bit more compared to the non-parallel version. The reason
   could be the problem discussed at [2], but it needs further
   investigation.

Next, I will test with the v31-0006 (shared vacuum cost) patch. I will also
try to test different types of indexes.

[1] https://www.postgresql.org/message-id/CAD21AoBMo9dr_QmhT%3DdKh7fmiq7tpx%2ByLHR8nw9i5NZ-SgtaVg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2BPeiFLdTuwrE6CvbNdx80E-O%3DZxCuWB2maREKFD-RaCA%40mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

0001-POC-compute-total-cost-delay.patch
Description: Binary data

0002-POC-divide-vacuum-cost-limit.patch
Description: Binary data
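For reference, the cost-limit division of approach (b) boils down to
something like the following standalone sketch. The names and values are
made up for illustration; this is not the POC patch itself.

```c
#include <stdio.h>

/* Illustrative stand-in for the vacuum_cost_limit GUC. */
static const int vacuum_cost_limit = 2000;

/*
 * Approach (b): rather than sharing one balance, give each of the nworkers
 * index-vacuuming workers an equal slice of the limit, so the combined I/O
 * rate stays close to a single-process vacuum.  Any remainder is left with
 * the first worker so the slices still add up to the full limit.
 */
static int
worker_cost_limit(int nworkers, int worker_id)
{
	int			share = vacuum_cost_limit / nworkers;

	if (worker_id == 0)
		share += vacuum_cost_limit % nworkers;
	return share;
}

int
main(void)
{
	for (int id = 0; id < 3; id++)
		printf("worker %d: cost limit %d\n", id, worker_cost_limit(3, id));
	return 0;
}
```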
Re: [DOC] Fix for the missing pg_stat_progress_cluster view phase column value
On Sat, Oct 26, 2019 at 05:13:49AM +, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote:
> The attached patch adds a description of the 'writing new heap'
> value output in the 'phase' column.

Indeed, fixed. Thanks for the patch.
--
Michael

signature.asc
Description: PGP signature
Re: define bool in pgtypeslib_extern.h
On Sat, Oct 26, 2019 at 10:49 PM Tom Lane wrote:
>
> I'm inclined to think that we need to make ecpglib.h's bool-related
> definitions exactly match c.h, which will mean that it has to pull in
> <stdbool.h> on most platforms, which will mean adding a control symbol
> for that to ecpg_config.h. I do not think we should export
> HAVE_STDBOOL_H and SIZEOF_BOOL there though; probably better to have
> configure make the choice and export something named like PG_USE_STDBOOL.
>

This sounds reasonable to me, but we also might want to do something for
probes.d. To be clear, I am not immediately planning to write a patch for
this.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
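A rough sketch of what that could look like in ecpglib.h, assuming configure
exports a PG_USE_STDBOOL symbol as Tom suggests. The exact spelling and the
fallback definitions here are assumptions, not the eventual change.

```c
/* Keep ecpglib's bool exactly in sync with the backend's c.h. */
#ifndef __cplusplus

#ifdef PG_USE_STDBOOL
/* configure determined that sizeof(bool) == 1 with <stdbool.h> */
#include <stdbool.h>
#else
/* fall back to the historical one-byte definitions */
#ifndef bool
typedef unsigned char bool;
#endif
#ifndef true
#define true	((bool) 1)
#endif
#ifndef false
#define false	((bool) 0)
#endif
#endif							/* PG_USE_STDBOOL */

#endif							/* !__cplusplus */
```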
Re: Duplicate entries in pg_depend after REINDEX CONCURRENTLY
On Fri, Oct 25, 2019 at 03:43:18PM +0900, Michael Paquier wrote:
> Attached is a patch to fix the issue. As we know that the old index
> will have a definition and dependencies that match with the old one, I
> think that we should just remove any dependency records on the new
> index before moving the new set of dependencies from the old to the
> new index. The patch includes regression tests that scan pg_depend to
> check that everything remains consistent after REINDEX CONCURRENTLY.
>
> Any thoughts?

I have done more tests for this one through the day, and committed the
patch. There is still one bug pending related to partitioned indexes where
REINDEX CONCURRENTLY is cancelled after phase 4 (swap) has committed. I am
still looking more into that.
--
Michael

signature.asc
Description: PGP signature
Re: [DOC] Fix for the missing pg_stat_progress_cluster view phase column value
On Mon, Oct 28, 2019 at 02:26:39PM +0900, Kyotaro Horiguchi wrote:
> By the way the table mentions the phases common to CLUSTER and
> VACUUM FULL. I wonder why some of them are described as "CLUSTER is"
> and others are "The command is"..

Because VACUUM FULL does not use the sort-and-scan mode, no?
--
Michael

signature.asc
Description: PGP signature
Re: WIP: expression evaluation improvements
Hi Andres,

Apologies, I realize my understanding of symbol resolution and the
referenced_functions mechanism wasn't correct. Thank you for your very
helpful explanations.

> There's also a related edge-case where are unable to figure out a symbol
> name in llvm_function_reference(), and then resort to creating a global
> variable pointing to the function.

Indeed.

> If indeed the only case this is being hit is language PL handlers, it
> might be better to instead work out the symbol name for that handler -
> we should be able to get that via pg_language.lanplcallfoid.

I took a stab at this (on top of your patch set):
v1-0001-Resolve-PL-handler-names-for-JITed-code-instead-o.patch

> Which cases are you talking about here? Because I don't think there's
> any others where would know a symbol name to add to referenced_functions
> in the first place?

I had misunderstood the intent of referenced_functions.

> I do want to benefit from getting accurate signatures for patch
> [PATCH v2 26/32] WIP: expression eval: relative pointer suppport
> I had a number of cases where I passed the wrong parameters, and llvm
> couldn't tell me...

I took a stab:
v1-0001-Rely-on-llvmjit_types-for-building-EvalFunc-calls.patch

On a separate note, I had submitted a patch earlier to optimize functions
earlier, in accordance with the code comment:

	/*
	 * Do function level optimization. This could be moved to the point where
	 * functions are emitted, to reduce memory usage a bit.
	 */
	LLVMInitializeFunctionPassManager(llvm_fpm);

Refer:
https://www.postgresql.org/message-id/flat/cae-ml+_oe4-shvn0aa_qakc5qkzvqvainxwb1ztuut67spm...@mail.gmail.com

I have rebased that patch on top of your patch set. Here it is:
v2-0001-Optimize-generated-functions-earlier-to-lower-mem.patch

--
Soumyadeep

v2-0001-Optimize-generated-functions-earlier-to-lower-mem.patch
Description: Binary data

v1-0001-Rely-on-llvmjit_types-for-building-EvalFunc-calls.patch
Description: Binary data

v1-0001-Resolve-PL-handler-names-for-JITed-code-instead-o.patch
Description: Binary data
Re: [HACKERS] Block level parallel vacuum
On Sun, Oct 27, 2019 at 12:52 PM Dilip Kumar wrote:
>
> On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada wrote:
> >
> I haven't yet read the new set of the patches, but I have noticed one
> thing: we are getting the size of the statistics using the AM routine, but
> we are copying those statistics from local memory to shared memory
> directly using memcpy. Wouldn't it be a good idea to have an AM-specific
> routine to copy them from local memory to shared memory? I am not sure
> whether it is worth it, but my thought behind this point is that it would
> allow the AM to keep its local stats in any form (for example, it could
> store a pointer in them) and serialize that while copying to the shared
> stats. Later, when the shared stats are passed back to the AM, it could
> deserialize them into its local form and use them.
>

You have a point, but after changing the gist index, we don't have any
current usage for indexes that need something like that. So, on one side
there is some value in having an API to copy the stats, but on the other
side, without a clear usage of such an API, it might not be good to expose
a new API for the same. I think we can expose such an API in the future if
there is a need for it. Do you or anyone else know of any external IndexAM
that has such a need?

A few minor comments while glancing through the latest patchset:

1. I think you can merge the 0001*, 0002*, 0003* patches into one patch, as
all three expose a new variable/function from IndexAmRoutine.

2.
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+	char	   *p = (char *) GetSharedIndStats(lvshared);
+	int			vac_work_mem = IsAutoVacuumWorkerProcess() &&
+		autovacuum_work_mem != -1 ?
+		autovacuum_work_mem : maintenance_work_mem;

I think this function won't be called from an autovacuum worker, at least
not as of now, so isn't it a better idea to have an Assert for it?

3.
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)

This function performs a parallel operation on the indexes, so why start
with heap? It is better to name it index_parallel_vacuum_main or simply
parallel_vacuum_main.

4.
 /* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +280,12 @@ typedef struct LVRelStats
 	BlockNumber pages_removed;
 	double		tuples_deleted;
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
-	/* List of TIDs of tuples we intend to delete */
-	/* NB: this list is ordered by TID address */
-	int			num_dead_tuples;	/* current # of entries */
-	int			max_dead_tuples;	/* # slots allocated in array */
-	ItemPointer dead_tuples;	/* array of ItemPointerData */
+	LVDeadTuples *dead_tuples;
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
 } LVRelStats;
-
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;

It seems like a spurious line removal.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com