Re: [HACKERS] Optional message to user when terminating/cancelling backend
2018-08-12 0:17 GMT+02:00 Daniel Gustafsson:

>> On 6 Aug 2018, at 09:47, Heikki Linnakangas wrote:
>>
>> Has there been any consideration to encodings?
>
> That's a good point, no =/
>
>> What happens if the message contains non-ASCII characters, and the
>> sending backend is connected to a database that uses a different encoding
>> than the backend being signaled?
>
> In the current state of the patch, instead of the message you get:
>
> FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8"
> has no equivalent in encoding "ISO_8859_5"

Where does this code fail? Isn't it somewhere higher up, where string
literals are translated? If so, this message is ok.

> That's clearly not good enough, but I'm not entirely sure what would be
> the best way forward. Restrict messages to only be in SQL_ASCII? Store
> the encoding of the message and check the encoding of the receiving
> backend before issuing it for a valid conversion, falling back to no
> message in case there is none? Neither seems terribly appealing, do you
> have any better suggestions?

Shouldn't the client<->server encoding translation do this work?

Regards

Pavel

> cheers ./daniel
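For reference, the conversion failure quoted above can be reproduced outside the server with POSIX iconv(3). The byte sequence 0xe3 0x82 0xbd is U+30BD (KATAKANA LETTER SO), which has no counterpart in the Cyrillic encoding ISO 8859-5; glibc's iconv reports an unconvertible character by failing with EILSEQ. This is a minimal illustrative sketch, not PostgreSQL code — the function name is hypothetical:

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Return 1 if all 'len' input bytes convert cleanly from encoding 'from'
 * to encoding 'to', 0 if some character has no equivalent (or the input
 * is malformed), and -1 if the conversion pair itself is unsupported.
 */
int can_convert(const char *from, const char *to, const char *bytes, size_t len)
{
    char        outbuf[256];
    char       *inp = (char *) bytes;
    char       *outp = outbuf;
    size_t      inleft = len;
    size_t      outleft = sizeof(outbuf);
    iconv_t     cd;
    size_t      rc;

    cd = iconv_open(to, from);   /* note: target encoding comes first */
    if (cd == (iconv_t) -1)
        return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    return (rc == (size_t) -1) ? 0 : 1;
}
```

Run against the bytes from the error message, this shows exactly the situation the patch would have to handle: the character converts to neither ISO 8859-5 nor any encoding lacking katakana, while plain ASCII always survives.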
Tid scan improvements
Hello,

To scratch an itch, I have been working on teaching TidScan how to do
range queries, i.e. those using >=, <, BETWEEN, etc.  This means we can
write, for instance,

    SELECT * FROM t WHERE ctid >= '(1000,0)' AND ctid < '(2000,0)';

instead of resorting to the old trick:

    SELECT * FROM t
    WHERE ctid = ANY (ARRAY(SELECT format('(%s,%s)', i, j)::tid
                            FROM generate_series(1000,1999) AS gs(i),
                                 generate_series(1,200) AS gs2(j)));

where "200" is some guess at how many tuples can fit on a page for that
table.  There's some previous discussion about this at
https://www.postgresql.org/message-id/flat/CAHyXU0zJhg_5RtxKnNbAK%3D4ZzQEFUFi%2B52RjpLrxtkRTD6CDFw%40mail.gmail.com#3ba2c3a6be217f40130655a3250d80a4 .

Since range scan execution is rather different from the existing TidScan
execution, I ended up making a new plan type, TidRangeScan.  There is
still only one TidPath, but it has an additional member that describes
which method to use.

As part of the work I also taught TidScan that its results are ordered by
ctid, i.e. to set a pathkey on a TidPath.  The benefit of this is that
queries such as

    SELECT MAX(ctid) FROM t;
    SELECT * FROM t WHERE ctid IN (...) ORDER BY ctid;

are now planned a bit more efficiently.  Execution was already returning
tuples in ascending ctid order; I just had to add support for descending
order.

Attached are a couple of patches:

 - 01_tid_scan_ordering.patch
 - 02_tid_range_scan.patch, to be applied on top of 01.

Can I add this to the next CommitFest?  Obviously the whole thing needs
thorough review, and I expect there to be numerous problems.  (I had to
make this prototype to demonstrate to myself that it wasn't completely
beyond me.  I know from experience how easy it is to enthusiastically
volunteer something for an open source project, discover that one does
not have the time or skill required, and be too embarrassed to show one's
face again!)
As well as actual correctness, some aspects that I am particularly unsure
about include:

 - Is it messy to use TidPath for both types of scan?
 - What is the planning cost for plans that don't end up being a TidScan
   or TidRangeScan?
 - Have I put the various helper functions in the right files?
 - Is there a less brittle way to create tables of a specific number of
   blocks/tuples in the regression tests?
 - Have I got the ScanDirection right during execution?
 - Are my changes to heapam ok?

Cheers,
Edmund

01_tid_scan_ordering.patch
Description: Binary data

02_tid_range_scan.patch
Description: Binary data
Re: Allowing printf("%m") only where it actually works
I wrote:
> At this point I'm inclined to push both of those patches so we can
> see what the buildfarm makes of them.

So I did that, and while not all of the buildfarm has reported in, enough
of it has that I think we can draw conclusions.

The only member that's showing any new warnings, AFAICT, is jacana
(gcc 4.9 on Windows).  It had no format-related warnings yesterday, but
now it has a boatload of 'em, and it appears that every single one traces
to not believing that printf and friends understand 'l' and 'z' length
modifiers.  The reason for this seems to be that we unconditionally
replace the printf function family with snprintf.c on Windows, and port.h
causes those functions to be marked with pg_attribute_printf, which this
patch caused to mean just "printf" not "gnu_printf".  So this gcc
evidently thinks the platform printf doesn't know 'l' and 'z' (which may
or may not be true in reality, but it's irrelevant) and it complains.

There are also interesting warnings showing up in elog.c, such as

Aug 11 14:26:32 c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:807:2: warning: function might be possible candidate for 'ms_printf' format attribute [-Wsuggest-attribute=format]

I think what is happening here is that gcc notices that those functions
call appendStringInfoVA, which is now annotated with the printf archetype
not gnu_printf, so it decides that maybe we marked the elog.c functions
with the wrong archetype.  I have no idea why it's suggesting "ms_printf"
though --- I can find nothing on the web that even admits that gcc knows
such an archetype.

So this is all pretty much of a mess.  If we annotate the elog functions
differently from printf's annotation then we risk getting these
complaints in elog.c, but if we don't do that then we can't really
describe their semantics correctly.
We could possibly mark the replacement snprintf functions with gnu_printf,
but that's a lie with respect to the very point at issue about %m.
Unless we were to teach snprintf.c about %m ... which probably wouldn't
be hard, but I'm not sure I want to go there.

That line of thought leads to deciding that we should treat "printf
doesn't know %m" as a reason to use snprintf.c over the native printf;
and I think we probably do not want to do that, if only because the
native printf is probably more efficient than snprintf.c.  (There are
other reasons to question that too: we probably can't tell without a
run-time test in configure, and even if we detect it correctly, gcc might
be misconfigured to believe the opposite thing about %m support and hence
warn, or fail to warn, anyway.  clang at least seems to get this wrong
frequently.)  But if we do not do such replacement then we still end up
wondering how to mark printf wrapper functions such as appendStringInfoVA.

At this point I'm inclined to give up and revert 3a60c8ff8.  It's not
clear that we can really make the warning situation better, as opposed to
just moving the warnings from one platform to another.

regards, tom lane
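For context, the feature the two gcc archetypes disagree about is glibc's %m extension, which formats the message for the current errno (what strerror(errno) would return); a printf implementation without the extension, such as snprintf.c at the time, produces something else entirely. The sketch below demonstrates the glibc behavior only — it assumes a glibc system and is not PostgreSQL code; the helper name is made up:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format glibc's "%m" for a given errno value into buf.  On glibc this
 * yields the same text as strerror(err); on a libc without the %m
 * extension the output is undefined (often just a literal "m").
 */
const char *format_m(int err, char *buf, size_t buflen)
{
    errno = err;                 /* %m reads errno at format time */
    snprintf(buf, buflen, "%m");
    return buf;
}
```

Note that a compiler checking this call against the plain "printf" archetype would warn about %m, while the "gnu_printf" archetype accepts it — which is exactly the annotation dilemma described above.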
Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept
Hi!

On Fri, Aug 10, 2018 at 5:07 PM Alexander Korotkov wrote:
> On Thu, Aug 9, 2018 at 11:26 PM Ivan Kartyshov wrote:
> > Alexander Korotkov wrote on 2018-06-20 20:54:
> > > Thinking about that more I found that adding a vacuum mark as an
> > > extra argument to LockAcquireExtended is also wrong.  It would still
> > > be hard to determine if we should log the lock in
> > > LogStandbySnapshot().  We'd have to add that flag to the shared
> > > memory table of locks.  And that looks like a kludge.
> > >
> > > Therefore, I'd like to propose another approach: introduce a new
> > > lock level.  So, AccessExclusiveLock will be split into two:
> > > AccessExclusiveLocalLock and AccessExclusiveLock.  Unlike
> > > AccessExclusiveLock, AccessExclusiveLocalLock will not be logged to
> > > the standby, and will be used for heap truncation.
> > >
> > > I expect some resistance to my proposal, because fixing this "small
> > > bug" doesn't deserve a new lock level.  But the current behavior of
> > > hot_standby_feedback is buggy.  From the user's perspective,
> > > hot_standby_feedback simply doesn't work: it causes the master to
> > > bloat without protecting standby queries from cancellation.  We need
> > > to fix that, but I don't see any other proper way to do it except
> > > adding a new lock level...
> >
> > Your proposal is very interesting; it made the patch smaller and more
> > beautiful.  So I updated the patch and made changes in accordance with
> > the proposed concept of a special AccessExclusiveLocalLock.
> >
> > I would like to hear your opinion on this implementation.
>
> In general this patch looks good to me.  It seems that the comments and
> documentation need minor enhancements.  I'll make them and post the
> revised patch.

Find the revised patch attached.  It contains some enhancements in
comments, formatting and documentation.  BTW, I decided that we should
enumerate the ACCESS EXCLUSIVE LOCAL lock before the ACCESS EXCLUSIVE
lock, because we enumerate locks in ascending order of strength.
So, since ACCESS EXCLUSIVE is WAL-logged, it's definitely "stronger".

I think that the introduction of a new lock level is a significant change
and can't be backpatched.  But I think it's worth backpatching a note to
the documentation which clarifies why hot_standby_feedback might have
unexpected behavior.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

hsfeedback_locallock-2.patch
Description: Binary data

hsfeedback_backpatch_note.patch
Description: Binary data
Re: NetBSD vs libxml2
On 08/11/2018 01:18 PM, Tom Lane wrote:
> In a moment of idle curiosity, I tried to build PG --with-libxml
> on NetBSD-current (well, mostly current, from May or so).
> The configure script blew up, complaining that it couldn't execute
> a test program.  Investigation showed that xml2-config reports this:
>
> $ xml2-config --libs
> -Wl,-R/usr/pkg/lib -L/usr/pkg/lib -lxml2 -L/usr/lib -lz -L/usr/lib -llzma
> -L/usr/lib -lm
>
> and we're only paying attention to the -L switches out of that.
> So we successfully link to /usr/pkg/lib/libxml2.so, but then
> execution fails for lack of rpath pointing at /usr/pkg/lib.
>
> We could fix this by teaching configure to absorb -Wl,-R... switches
> into LDFLAGS from xml2-config's output, and that seems to make things
> work, but I wonder whether we should or not.  This seems like a new
> height of unfriendliness to non-default packages on NetBSD's part, and
> it's a bit hard to believe the behavior will make it to a formal
> release.  I don't see any comparable behavior on FreeBSD for instance
> --- it puts packages' libraries into /usr/local/lib, but that seems to
> be searched automatically without additional switches beyond -L.  Don't
> have an easy way to check things on OpenBSD.

OpenBSD 6.3's "xml2-config --libs" doesn't contain any rpath settings;
it's pretty much like the one on my (fairly old) FreeBSD machine, as you
described above.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 6 Aug 2018, at 09:47, Heikki Linnakangas wrote:
>
> Has there been any consideration to encodings?

That's a good point, no =/

> What happens if the message contains non-ASCII characters, and the
> sending backend is connected to a database that uses a different
> encoding than the backend being signaled?

In the current state of the patch, instead of the message you get:

FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has
no equivalent in encoding "ISO_8859_5"

That's clearly not good enough, but I'm not entirely sure what would be
the best way forward.  Restrict messages to only be in SQL_ASCII?  Store
the encoding of the message and check the encoding of the receiving
backend before issuing it for a valid conversion, falling back to no
message in case there is none?  Neither seems terribly appealing, do you
have any better suggestions?

cheers ./daniel
Re: Facility for detecting insecure object naming
On Sat, Aug 11, 2018 at 03:32:23PM -0500, Nico Williams wrote:
> On Sat, Aug 11, 2018 at 12:47:05PM -0700, Noah Misch wrote:
> > -- (3) "SET search_path" with today's code.
> > --
> > -- Security and reliability considerations are the same as (2).  Today,
> > -- this reduces performance by suppressing optimizations like inlining.
>
> Out of curiosity, why does this suppress inlining?

The function call machinery, fmgr_security_definer(), is what actually
applies the setting.  To inline such functions, one would need to develop
a representation that attaches the setting change to the nodes resulting
from the inlining.  When evaluating said nodes, apply the attached
setting.

> Anyways, my preference would be to have syntax by which to say: resolve
> at declaration time using the then-in-effect search_path and store
> as-qualified.

Agreed; having that would be great.  (I mentioned it as option (7) of
https://postgr.es/m/20180710014308.ga805...@rfd.leadboat.com.)  It has
limitations, though:

- Does not help with inexact argument type matches.
- While the applicability to sql-language functions seems clear, other
  languages don't benefit as much.  You might extend it to a subset of
  PL/pgSQL functions, excluding e.g. ones that contain EXECUTE.  I see no
  chance to help PL/Perl or PL/Python.
- Unlikely to be a good back-patch candidate.

> Another possibility would be to have a way to set a search_path for all
> expressions in a given schema, something like:
>
>   SET SCHEMA my_schema DEFAULT search_path = ...;
>
> which would apply to all expressions in schema elements in schema
> "my_schema":
>
> - CHECK expressions
> - INDEX expressions
> - VIEWs and MATERIALIZED VIEWs
> - FUNCTION and STORED PROCEDURE bodies
> - ...
>
>   CREATE SCHEMA IF NOT EXISTS my_schema;
>   SET SCHEMA my_schema DEFAULT search_path = my_schema, my_other_schema;
>   CREATE OR REPLACE FUNCTION foo() ... AS $$ ... $$;

That's interesting.  I suspect we'd first want to make inlining possible.
Re: NetBSD vs libxml2
On Sat, Aug 11, 2018 at 01:12:09PM -0500, Nico Williams wrote:
> I'm willing to write a patch after lunch.  In ./configure.in this:
>
>   for pgac_option in `$XML2_CONFIG --libs`; do
>     case $pgac_option in
>       -L*) LDFLAGS="$LDFLAGS $pgac_option";;
>     esac
>   done
>
> should change to:
>
>   for pgac_option in `$XML2_CONFIG --libs`; do
>     case $pgac_option in
>       -l*) LDLIBS="$LDLIBS $pgac_option";;
>       *) LDFLAGS="$LDFLAGS $pgac_option";;
>     esac
>   done
>
> More changes are needed to stop hard-coding -lxml2.

Actually, no, no more changes are needed.  The -lxml2 comes from:

1193 if test "$with_libxml" = yes ; then
1194   AC_CHECK_LIB(xml2, xmlSaveToBuffer, [], [AC_MSG_ERROR([library 'xml2' (version >= 2.6.23) is required for XML support])])
1195 fi

in configure.in, and I think contrib/xml2/Makefile needs to hardcode it.
So the above quoted change is all that is needed, plus re-running
autoconf.  See attached.  (I'm not including unrelated changes made to
configure by autoconf.)

Nico
--

>From 995ee1dbbc0ee9af7bbb24728f09ec022696bba0 Mon Sep 17 00:00:00 2001
From: Nicolas Williams
Date: Sat, 11 Aug 2018 15:52:19 -0500
Subject: [PATCH] Keep rpath et al. from xml2-config --libs

Instead of looking for -L* words to put into LDFLAGS and everything else
into LIBS, look for -l* words to put into LIBS and everything else into
LDFLAGS.
---
 configure    | 3 ++-
 configure.in | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 2665213..210f739 100755
--- a/configure
+++ b/configure
@@ -8009,7 +8009,8 @@ fi
   done
   for pgac_option in `$XML2_CONFIG --libs`; do
     case $pgac_option in
--L*) LDFLAGS="$LDFLAGS $pgac_option";;
+-l*) LIBS="$LIBS $pgac_option";;
+*) LDFLAGS="$LDFLAGS $pgac_option";;
     esac
   done
 fi
diff --git a/configure.in b/configure.in
index 397f6bc..b3f5c6c 100644
--- a/configure.in
+++ b/configure.in
@@ -902,7 +902,8 @@ if test "$with_libxml" = yes ; then
   done
   for pgac_option in `$XML2_CONFIG --libs`; do
     case $pgac_option in
--L*) LDFLAGS="$LDFLAGS $pgac_option";;
+-l*) LIBS="$LIBS $pgac_option";;
+*) LDFLAGS="$LDFLAGS $pgac_option";;
     esac
   done
 fi
--
2.7.4
Re: Facility for detecting insecure object naming
On Wed, Aug 08, 2018 at 10:47:04AM -0400, Tom Lane wrote:
> Isaac Morland writes:
> > While I'm asking, does anybody know why this isn't the default,
> > especially for SECURITY DEFINER functions?
>
> It might fix some subset of security issues, but I think that changing
> the default behavior like that would break a bunch of other use-cases.
> It'd be especially surprising for such a thing to apply only to
> SECURITY DEFINER functions.

Some projects consider breaking backwards compatibility to fix security
problems (within reason, and with discussion) to be a fair thing to do.
Already people have to qualify their apps for every release of PG.

I think this problem very much deserves a good solution.

Nico
--
Re: Facility for detecting insecure object naming
On Sat, Aug 11, 2018 at 12:47:05PM -0700, Noah Misch wrote:
> -- (3) "SET search_path" with today's code.
> --
> -- Security and reliability considerations are the same as (2).  Today,
> -- this reduces performance by suppressing optimizations like inlining.

Out of curiosity, why does this suppress inlining?

Anyways, my preference would be to have syntax by which to say: resolve
at declaration time using the then-in-effect search_path and store
as-qualified.  This could just be SET search_path without an assignment:

  CREATE FUNCTION ... AS $$ ... $$ SET search_path;

Another possibility would be to have a way to set a search_path for all
expressions in a given schema, something like:

  SET SCHEMA my_schema DEFAULT search_path = ...;

which would apply to all expressions in schema elements in schema
"my_schema":

 - CHECK expressions
 - INDEX expressions
 - VIEWs and MATERIALIZED VIEWs
 - FUNCTION and STORED PROCEDURE bodies
 - ...

  CREATE SCHEMA IF NOT EXISTS my_schema;
  SET SCHEMA my_schema DEFAULT search_path = my_schema, my_other_schema;
  CREATE OR REPLACE FUNCTION foo() ... AS $$ ... $$;
  ...

Nico
--
Re: Facility for detecting insecure object naming
On Wed, Aug 08, 2018 at 09:58:38AM -0400, Tom Lane wrote:
> I'm not sold on #2 either.  That path leads to, for example,
> s/=/OPERATOR(pg_catalog.=)/g everywhere, which is utterly catastrophic
> to both readability and portability of your SQL code.  We *must* find
> a way to do better, not tell people that's what to do.
>
> When the security team was discussing this issue before, we speculated
> about ideas like inventing a function trust mechanism, so that attacks
> based on search path manipulations would fail even if they managed to
> capture an operator reference.  I'd rather go down that path than
> encourage people to do more schema qualification.

Interesting.  If we got a function trust mechanism, how much
qualification would you then like?  Here are the levels I know about,
along with their implications:

-- (1) Use qualified references and exact match for all objects.
--
-- Always secure, even if schema usage does not conform to
-- ddl-schemas-patterns and function trust is disabled.
--
-- Subject to denial of service from anyone able to CREATE in cube
-- schema or earthdistance schema.
CREATE FUNCTION latitude(earth)
RETURNS float8
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE
AS $$SELECT CASE
  WHEN @cube_schema@.cube_ll_coord($1::@cube_schema@.cube, 3)
       OPERATOR(pg_catalog./) @extschema@.earth()
       OPERATOR(pg_catalog.<) -1
    THEN -90::pg_catalog.float8
  WHEN @cube_schema@.cube_ll_coord($1::@cube_schema@.cube, 3)
       OPERATOR(pg_catalog./) @extschema@.earth()
       OPERATOR(pg_catalog.>) 1
    THEN 90::pg_catalog.float8
  ELSE pg_catalog.degrees(pg_catalog.asin(@cube_schema@.cube_ll_coord(
       $1::@cube_schema@.cube, 3)
       OPERATOR(pg_catalog./) @extschema@.earth()))
END$$;

-- (2) Use qualified references for objects outside pg_catalog.
--
-- With function trust disabled, this would be subject to privilege
-- escalation from anyone able to CREATE in cube schema.
--
-- Subject to denial of service from anyone able to CREATE in cube
-- schema or earthdistance schema.
CREATE FUNCTION latitude(earth)
RETURNS float8
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE
AS $$SELECT CASE
  WHEN @cube_schema@.cube_ll_coord($1, 3) / @extschema@.earth() < -1
    THEN -90::float8
  WHEN @cube_schema@.cube_ll_coord($1, 3) / @extschema@.earth() > 1
    THEN 90::float8
  ELSE degrees(asin(@cube_schema@.cube_ll_coord($1, 3) / @extschema@.earth()))
END$$;

-- (3) "SET search_path" with today's code.
--
-- Security and reliability considerations are the same as (2).  Today,
-- this reduces performance by suppressing optimizations like inlining.
CREATE FUNCTION latitude(earth)
RETURNS float8
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE
SET search_path FROM CURRENT
AS $$SELECT CASE
  WHEN cube_ll_coord($1, 3) / earth() < -1 THEN -90::float8
  WHEN cube_ll_coord($1, 3) / earth() > 1 THEN 90::float8
  ELSE degrees(asin(cube_ll_coord($1, 3) / earth()))
END$$;

-- (4) Today's code (reformatted).
--
-- Always secure if schema usage conforms to ddl-schemas-patterns, even
-- if function trust is disabled.  If cube schema or earthdistance schema
-- is not in search_path, the function doesn't work.
CREATE FUNCTION latitude(earth)
RETURNS float8
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE
AS $$SELECT CASE
  WHEN cube_ll_coord($1, 3) / earth() < -1 THEN -90::float8
  WHEN cube_ll_coord($1, 3) / earth() > 1 THEN 90::float8
  ELSE degrees(asin(cube_ll_coord($1, 3) / earth()))
END$$;
Re: [sqlsmith] ERROR: plan should not reference subplan's variable
Andreas Seltenreich writes:
> sqlsmith caused another internal error while testing REL_11_STABLE at
> 1b9d1b08fe.  The query below on the regression DB yields "plan should
> not reference subplan's variable" for me.

Thanks for the report.  Seems to be wrong order of operations in
inheritance_planner: it shouldn't be appending extra SUBQUERY_RTE copies
to a subroot that it's going to use as a parent for later iterations.
That results in too many copies of the subqueries and screwed-up RTE
numbering in the children.

regards, tom lane
Re: NetBSD vs libxml2
On Sat, Aug 11, 2018 at 01:18:26PM -0400, Tom Lane wrote:
> In a moment of idle curiosity, I tried to build PG --with-libxml
> on NetBSD-current (well, mostly current, from May or so).
> The configure script blew up, complaining that it couldn't execute
> a test program.  Investigation showed that xml2-config reports this:
>
> $ xml2-config --libs
> -Wl,-R/usr/pkg/lib -L/usr/pkg/lib -lxml2 -L/usr/lib -lz -L/usr/lib -llzma
> -L/usr/lib -lm
>
> and we're only paying attention to the -L switches out of that.
> So we successfully link to /usr/pkg/lib/libxml2.so, but then
> execution fails for lack of rpath pointing at /usr/pkg/lib.
>
> We could fix this by teaching configure to absorb -Wl,-R... switches
> into LDFLAGS from xml2-config's output, and that seems to make things
> work, but I wonder whether we should or not.  This seems like a new
> height of unfriendliness to non-default packages on NetBSD's part, and
> it's a bit hard to believe the behavior will make it to a formal
> release.  I don't see any comparable behavior on FreeBSD for instance
> --- it puts packages' libraries into /usr/local/lib, but that seems to
> be searched automatically without additional switches beyond -L.  Don't
> have an easy way to check things on OpenBSD.
>
> Thoughts?

-Wl,-R (and friends, like -Wl,-rpath) is part and parcel of dynamic
linking, and these flags are precious.  All software should honor them.

That non-default packages don't end up in /usr is a perfectly legitimate
thing for a distribution to do.  More importantly, a site build that uses
a non-system location for e.g. locally-patched open source packages is a
perfectly legitimate thing for _users_ to do.  It isn't nice to force
them to hack a project's ./configure file or work out precious envvar
settings to make that project support non-system locations for
dependencies.

I guess the problem here is that xml2-config is (like so many *-config
programs) broken by not having a way to get ld flags and libs
separately...
Which would be why ./configure is parsing the output of xml2-config
--libs.  The better thing to do would be to take all the words from
xml2-config --libs that match -l* and put them in LIBS while all others
go into LDFLAGS.  It's safer to assume that -l* means "link this in" than
that there won't be new linker options other than -L or -l.

I'll note too that -lxml2 is hardcoded in ./configure.in.  That's not
right either.

IMO this is just a minor bug in PG.  I'm willing to write a patch after
lunch.  In ./configure.in this:

  for pgac_option in `$XML2_CONFIG --libs`; do
    case $pgac_option in
      -L*) LDFLAGS="$LDFLAGS $pgac_option";;
    esac
  done

should change to:

  for pgac_option in `$XML2_CONFIG --libs`; do
    case $pgac_option in
      -l*) LDLIBS="$LDLIBS $pgac_option";;
      *) LDFLAGS="$LDFLAGS $pgac_option";;
    esac
  done

More changes are needed to stop hard-coding -lxml2.  I can write a patch
if you like.

Nico
--
NetBSD vs libxml2
In a moment of idle curiosity, I tried to build PG --with-libxml on
NetBSD-current (well, mostly current, from May or so).  The configure
script blew up, complaining that it couldn't execute a test program.
Investigation showed that xml2-config reports this:

$ xml2-config --libs
-Wl,-R/usr/pkg/lib -L/usr/pkg/lib -lxml2 -L/usr/lib -lz -L/usr/lib -llzma
-L/usr/lib -lm

and we're only paying attention to the -L switches out of that.  So we
successfully link to /usr/pkg/lib/libxml2.so, but then execution fails
for lack of rpath pointing at /usr/pkg/lib.

We could fix this by teaching configure to absorb -Wl,-R... switches into
LDFLAGS from xml2-config's output, and that seems to make things work,
but I wonder whether we should or not.  This seems like a new height of
unfriendliness to non-default packages on NetBSD's part, and it's a bit
hard to believe the behavior will make it to a formal release.  I don't
see any comparable behavior on FreeBSD for instance --- it puts packages'
libraries into /usr/local/lib, but that seems to be searched
automatically without additional switches beyond -L.  Don't have an easy
way to check things on OpenBSD.

Thoughts?

regards, tom lane
[sqlsmith] ERROR: plan should not reference subplan's variable
Hi,

sqlsmith caused another internal error while testing REL_11_STABLE at
1b9d1b08fe.  The query below on the regression DB yields "plan should not
reference subplan's variable" for me.

regards,
Andreas

delete from public.prt1_l
where EXISTS (
  select from public.xmltest as ref_10,
    lateral (select ref_10.data as c0
             from public.radix_text_tbl as ref_0,
               lateral (select ref_11.name as c0
                        from public.equipment_r as ref_11
                        limit 134) as subq_0
             limit 110) as subq_1
  where public.prt1_l.c is NULL)
returning 42;
Re: libpq connection timeout mismanagement
Hello Tom,

> The patch that taught libpq about allowing multiple target hosts
> modified connectDBComplete() with the intent of making the
> connect_timeout (if specified) apply per-host, not to the complete
> connection attempt.  It did not do a very good job though, because the
> timeout only gets reset when connectDBComplete() itself detects a
> timeout.  If PQconnectPoll advances to a new host due to some other
> cause, the previous host's timeout continues to run, possibly causing a
> premature timeout failure for the new one.
>
> Another thing that I find pretty strange is that it is coded so that,
> in event of a timeout detection by connectDBComplete, we give up on the
> current connhost entry and advance to the next host, ignoring any
> additional addresses we might have for the current hostname.  This
> seems at best poorly thought through.  There's no good reason for libpq
> to assume that all the addresses returned by DNS point at the same
> machine, or share the same network failure points in between.
>
> Attached is an attempt to improve this.  I'll park it in the Sept fest.

The patch does not "git apply", but is ok with "patch -p1".  It compiles.
A global "make check" is okay.  The feature works for me, i.e. each
target host seems to get at least its timeout.  The code looks
straightforward enough.

In passing: the code suggests that the timeout is always 2 or more

  if (timeout < 2)
    timeout = 2;

but the doc says "It is not recommended to use a timeout of less than 2
seconds", which is inconsistent.  It should read something like "the
timeout is always set to at least 2 seconds, whatever the user asks for".

connect_timeout=2.9 is accepted and interpreted as 2.
connect_timeout=-10 and connect_timeout=two are also accepted and mean
forever.  Probably thanks to "atoi".

--
Fabien.
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On 08/11/2018 04:15 PM, Tom Lane wrote:
> Tomas Vondra writes:
> > On 08/09/2018 07:47 PM, Alvaro Herrera wrote:
> > > Actually, it seems to me that ApplyLogicalMappingFile is just
> > > leaking the file descriptor for no good reason.
> >
> > I think the fix can be as simple as attached ... I'm mostly afk for
> > the weekend, so I'll commit & backpatch on Monday or so.
>
> LGTM.  While you're at it, would you fix the misspelling three lines
> below this?
>
> * Check whether the TransactionOId 'xid' is in the pre-sorted array 'xip'.
>                              ^

Sure.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
On 08/11/2018 04:08 PM, Andres Freund wrote:
> Hi,
>
> On 2018-08-11 15:40:19 +0200, Tomas Vondra wrote:
> > For the record, I can actually reproduce this on 9.6 (haven't tried
> > older releases, but I suspect it's there too).  Instead of using the
> > failing subscription, I've used another pgbench script doing this:
> >
> >   SET statement_timeout = 5;
> >   COPY t TO '/dev/null';
> >
> > and doing something like:
> >
> >   pgbench -n -c 20 -T 300 -f copy.sql test
>
> Just to confirm: That's with the vacuum full and insert running
> concurrently?  And then just restarting the above copy.sql (as pgbench
> errors out after the timeouts) until you get the error?

Yes, pretty much.

> I'm a bit confused what the copy + timeout is doing here?  It shouldn't
> trigger any invalidations itself, and the backtrace appears to be from
> the insert.sql you posted earlier?  Unclear why a copy to /dev/null
> should trigger anything like this?

My goal was to simulate the failing subscription sync, which does COPY
and fails because of duplicate data.  The statement_timeout seemed like a
good approximation of that.  It may be unnecessary, I don't know.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: logical decoding / rewrite map vs. maxAllocatedDescs
Tomas Vondra writes:
> > > On 08/09/2018 07:47 PM, Alvaro Herrera wrote:
> > > > Actually, it seems to me that ApplyLogicalMappingFile is just
> > > > leaking the file descriptor for no good reason.
>
> I think the fix can be as simple as attached ... I'm mostly afk for the
> weekend, so I'll commit & backpatch on Monday or so.

LGTM.  While you're at it, would you fix the misspelling three lines
below this?

* Check whether the TransactionOId 'xid' is in the pre-sorted array 'xip'.
                             ^

regards, tom lane
Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
Hi,

On 2018-08-11 15:40:19 +0200, Tomas Vondra wrote:
> For the record, I can actually reproduce this on 9.6 (haven't tried
> older releases, but I suspect it's there too).  Instead of using the
> failing subscription, I've used another pgbench script doing this:
>
>   SET statement_timeout = 5;
>   COPY t TO '/dev/null';
>
> and doing something like:
>
>   pgbench -n -c 20 -T 300 -f copy.sql test

Just to confirm: That's with the vacuum full and insert running
concurrently?  And then just restarting the above copy.sql (as pgbench
errors out after the timeouts) until you get the error?

I'm a bit confused what the copy + timeout is doing here?  It shouldn't
trigger any invalidations itself, and the backtrace appears to be from
the insert.sql you posted earlier?  Unclear why a copy to /dev/null
should trigger anything like this?

Greetings,

Andres Freund
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On 08/10/2018 11:13 PM, Andres Freund wrote:
> On 2018-08-10 22:57:57 +0200, Tomas Vondra wrote:
> > On 08/09/2018 07:47 PM, Alvaro Herrera wrote:
> > > On 2018-Aug-09, Tomas Vondra wrote:
> > > > I suppose there are reasons why it's done this way, and admittedly
> > > > the test that happens to trigger this is a bit extreme
> > > > (essentially running pgbench concurrently with 'vacuum full
> > > > pg_class' in a loop).  I'm not sure it's extreme enough to deem it
> > > > not an issue, because people using many temporary tables often
> > > > deal with bloat by doing frequent vacuum full on catalogs.
> > >
> > > Actually, it seems to me that ApplyLogicalMappingFile is just
> > > leaking the file descriptor for no good reason.  There's a different
> > > OpenTransientFile call in ReorderBufferRestoreChanges that is not
> > > intended to be closed immediately, but the other one seems a plain
> > > bug, easy enough to fix.
> >
> > Indeed.  Adding a CloseTransientFile to ApplyLogicalMappingFile solves
> > the issue with hitting maxAllocatedDescs.  Barring objections I'll
> > commit this shortly.
>
> Yea, that's clearly a bug.  I've not seen a patch, so I can't quite
> formally sign off, but it seems fairly obvious.

I think the fix can be as simple as attached ... I'm mostly afk for the
weekend, so I'll commit & backpatch on Monday or so.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 47e669578f..1a1a0fa469 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3327,6 +3327,8 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
 			new_ent->combocid = ent->combocid;
 		}
 	}
+
+	CloseTransientFile(fd);
 }
Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
On 08/11/2018 03:16 PM, Tomas Vondra wrote:
> On 08/11/2018 05:02 AM, Tom Lane wrote:
>> Peter Geoghegan writes:
>>> I'm concerned that this item has the potential to delay the release,
>>> since, as you said, we're back to the drawing board.
>>
>> Me too. I will absolutely not vote to release 11.0 before we've
>> solved this ...
>
> Not sure. I can easily reproduce that on REL_10_STABLE, so it's not a
> new issue specific to 11.0.

For the record, I can actually reproduce this on 9.6 (haven't tried older releases, but I suspect it's there too). Instead of using the failing subscription, I've used another pgbench script doing this:

  SET statement_timeout = 5;
  COPY t TO '/dev/null';

and doing something like:

  pgbench -n -c 20 -T 300 -f copy.sql test

I don't think the 20-client COPY concurrency is necessary; it just makes restarts after the statement_timeout faster. This produces about the same backtrace as reported on the other thread. Attached are both plain 'bt' and 'bt full'.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Core was generated by `postgres: user test [local] INSERT '. Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 51 } (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x778bc7f4b800 in __GI_abort () at abort.c:89 #2 0x00962df3 in errfinish (dummy=0) at elog.c:557 #3 0x008091b9 in mdread (reln=0x161a940, forknum=MAIN_FORKNUM, blocknum=1, buffer=0x778bb9a38080 "") at md.c:784 #4 0x0080b8bd in smgrread (reln=0x161a940, forknum=MAIN_FORKNUM, blocknum=1, buffer=0x778bb9a38080 "") at smgr.c:628 #5 0x007cf93f in ReadBuffer_common (smgr=0x161a940, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=1, mode=RBM_NORMAL, strategy=0x0, hit=0x7fff20f9131b "") at bufmgr.c:890 #6 0x007cf272 in ReadBufferExtended (reln=0x15f2318, forkNum=MAIN_FORKNUM, blockNum=1, mode=RBM_NORMAL, strategy=0x0) at bufmgr.c:664 #7 0x007cf14e in ReadBuffer (reln=0x15f2318, blockNum=1) at bufmgr.c:596 #8 0x004e0953 in _bt_getbuf (rel=0x15f2318, blkno=1, access=1) at nbtpage.c:576 #9 0x004df73a in _bt_getroot (rel=0x15f2318, access=1) at nbtpage.c:132 #10 0x004e6adb in _bt_search (rel=0x15f2318, keysz=1, scankey=0x7fff20f91dd0, nextkey=0 '\000', bufP=0x7fff20f927d4, access=1, snapshot=0xe45920 ) at nbtsearch.c:99 #11 0x004e80a9 in _bt_first (scan=0x16454b8, dir=ForwardScanDirection) at nbtsearch.c:983 #12 0x004e4eb5 in btgettuple (scan=0x16454b8, dir=ForwardScanDirection) at nbtree.c:326 #13 0x004d9321 in index_getnext_tid (scan=0x16454b8, direction=ForwardScanDirection) at indexam.c:415 #14 0x004d9744 in index_getnext (scan=0x16454b8, direction=ForwardScanDirection) at indexam.c:553 #15 0x004d81c3 in systable_getnext (sysscan=0x16455d0) at genam.c:416 #16 0x0094c114 in ScanPgRelation (targetRelId=16391, indexOK=1 '\001', force_non_historic=0 '\000') at relcache.c:347 #17 0x0094f6a3 in RelationReloadIndexInfo (relation=0x778bc8bdcfc8) at relcache.c:1936 #18 0x0094fd8e in RelationClearRelation (relation=0x778bc8bdcfc8, rebuild=1 '\001') at relcache.c:2199 #19 0x0095023d in RelationFlushRelation 
(relation=0x778bc8bdcfc8) at relcache.c:2403 #20 0x00950351 in RelationCacheInvalidateEntry (relationId=16391) at relcache.c:2455 #21 0x0094865e in LocalExecuteInvalidationMessage (msg=0x7fff20f92cc0) at inval.c:578 #22 0x007eaf7e in ReceiveSharedInvalidMessages (invalFunction=0x94856d , resetFunction=0x9487b3 ) at sinval.c:91 #23 0x0094887d in AcceptInvalidationMessages () at inval.c:672 #24 0x007eee5c in LockRelationOid (relid=2662, lockmode=1) at lmgr.c:125 #25 0x004bad33 in relation_open (relationId=2662, lockmode=1) at heapam.c:1124 #26 0x004d888f in index_open (relationId=2662, lockmode=1) at indexam.c:150 #27 0x004d7f8d in systable_beginscan (heapRelation=0x15ed518, indexId=2662, indexOK=1 '\001', snapshot=0xe45920 , nkeys=1, key=0x7fff20f92e20) at genam.c:334 #28 0x0094c104 in ScanPgRelation (targetRelId=16387, indexOK=1 '\001', force_non_historic=0 '\000') at relcache.c:342 #29 0x0094d569 in RelationBuildDesc (targetRelId=16387, insertIt=0 '\000') at relcache.c:949 #30 0x0094fe4c in RelationClearRelation (relation=0x778bc8bdb390, rebuild=1 '\001') at relcache.c:2281 #31 0x0095023d in RelationFlushRelation (relation=0x778bc8bdb390) at relcache.c:2403 #32 0x00950351 in RelationCacheInvalidateEntry (relationId=16387) at relcache.c:2455 #33 0x0094865e in LocalExecuteInvalidationMessage (msg=0x7fff20f93190) at inval.c:578 #34 0x007eb057 in ReceiveSharedInvalidMessages (invalFunction=0x94856d ,
Re: FailedAssertion on partprune
On 9 August 2018 at 15:33, Tom Lane wrote:
> Nonetheless, it's a damfool query plan, because we'll be going through
> parallel worker startup/shutdown overhead 4 times for no benefit.
> We really should put in some kind of logic to force those sibling
> Gathers to be aggregated into one, or else disallow them altogether.

I started debugging this to see where things go wrong. I discovered that add_paths_to_append_rel() is called yet again from apply_scanjoin_target_to_paths(), and this is where it all goes wrong. The problem is that the gather paths have already been tagged onto the partial paths by this time, and accumulate_append_subpath() has no code to look through those to fetch the Append/MergeAppend subpaths, so it just appends the entire path to the subpaths list.

This all worked before 96030f9a481. That commit moved where generate_gather_paths() is called.

I'm having a hard time understanding why we need to call add_paths_to_append_rel() from apply_scanjoin_target_to_paths(). The name of the function does not seem to imply that we'd do this, and the comment just explains the "what" rather than the "why", making it a bit useless:

  /* Build new paths for this relation by appending child paths. */
  if (live_children != NIL)
      add_paths_to_append_rel(root, rel, live_children);

I also think that accumulate_append_subpath() is a bit of a fragile way of flattening the append rel's paths, but I feel it's probably apply_scanjoin_target_to_paths() that's at fault here.

--
David Rowley                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
On 08/11/2018 05:02 AM, Tom Lane wrote:
> Peter Geoghegan writes:
>> I'm concerned that this item has the potential to delay the release,
>> since, as you said, we're back to the drawing board.
>
> Me too. I will absolutely not vote to release 11.0 before we've
> solved this ...

Not sure. I can easily reproduce that on REL_10_STABLE, so it's not a new issue specific to 11.0.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On 08/11/2018 09:46 AM, Andres Freund wrote:
> On 2018-08-11 00:34:15 -0700, Andres Freund wrote:
>> I've run this numerous times now, and I haven't triggered it yet :/
>
> Heh, I hit it literally seconds after hitting send:
>
> 2018-08-11 00:35:52.804 PDT [2196][9/14864] LOG: automatic analyze of table "postgres.pg_catalog.pg_depend" system usage: CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.15 s
> 2018-08-11 00:36:00.391 PDT [2196][9/14866] LOG: automatic analyze of table "postgres.public.t" system usage: CPU: user: 0.25 s, system: 0.02 s, elapsed: 7.56 s
> 2018-08-11 00:36:00.399 PDT [2171][4/200412] PANIC: could not read block 3 in file "base/12397/167946": read only 0 of 8192 bytes
> 2018-08-11 00:36:00.399 PDT [2171][4/200412] STATEMENT: insert into t (b,c) values (1,1);
> 2018-08-11 00:36:00.410 PDT [2196][9/14868] LOG: skipping vacuum of "pg_class" --- lock not available
> 2018-08-11 00:36:00.448 PDT [389][] LOG: server process (PID 2171) was terminated by signal 6: Aborted
> 2018-08-11 00:36:00.448 PDT [389][] DETAIL: Failed process was running: insert into t (b,c) values (1,1);
> 2018-08-11 00:36:00.448 PDT [389][] LOG: terminating any other active server processes
>
> Below you can find the bt full showing a bunch of nested invalidations.
> Looking.

Hmmm, it's difficult to compare "bt full" output, but my backtraces look somewhat different (and all the backtraces I'm seeing are 100% exactly the same). Attached for comparison.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Core was generated by `postgres: user test [local] INSERT '. Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 51 } (gdb) bt full #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 set = {__val = {0 , 140723895972192, 10983728, 0, 1024}} pid = tid = #1 0x74355af74800 in __GI_abort () at abort.c:89 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 , 24861680, 24859152, 140723895972608, 127772512011520, 127772511994880}}, sa_flags = 1526390557, sa_restorer = 0x0} sigs = {__val = {32, 0 }} #2 0x00a1d728 in errfinish (dummy=0) at elog.c:557 edata = 0xfc2e80 elevel = 22 oldcontext = 0x1873fc0 econtext = 0x0 __func__ = "errfinish" #3 0x008ae18c in mdread (reln=0x184bc30, forknum=MAIN_FORKNUM, blocknum=3, buffer=0x74354ce15c00 "") at md.c:786 seekpos = 24576 nbytes = 0 v = 0x17dbbe8 __func__ = "mdread" #4 0x008b0abb in smgrread (reln=0x184bc30, forknum=MAIN_FORKNUM, blocknum=3, buffer=0x74354ce15c00 "") at smgr.c:628 No locals. #5 0x00870ea6 in ReadBuffer_common (smgr=0x184bc30, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=3, mode=RBM_NORMAL, strategy=0x0, hit=0x7ffcd5d4b5eb) at bufmgr.c:890 io_start = {tv_sec = 140723895973204, tv_nsec = 127772266994688} io_time = {tv_sec = 7882407936, tv_nsec = 2305843014776551140} bufHdr = 0x74354b931d80 bufBlock = 0x74354ce15c00 found = false isExtend = false isLocalBuf = false __func__ = "ReadBuffer_common" #6 0x008707c9 in ReadBufferExtended (reln=0x74355be49850, forkNum=MAIN_FORKNUM, blockNum=3, mode=RBM_NORMAL, strategy=0x0) at bufmgr.c:664 hit = false buf = 32764 __func__ = "ReadBufferExtended" #7 0x008706a2 in ReadBuffer (reln=0x74355be49850, blockNum=3) at bufmgr.c:596 No locals. 
#8 0x00500682 in _bt_getbuf (rel=0x74355be49850, blkno=3, access=1) at nbtpage.c:736 buf = 29749 __func__ = "_bt_getbuf" #9 0x004ff413 in _bt_getroot (rel=0x74355be49850, access=1) at nbtpage.c:282 metabuf = 0 metapg = 0x50c004 <_bt_preprocess_keys+320> "\351\365\004" metaopaque = 0x7ffcd5d4b7a0 rootbuf = 0 rootpage = 0xd5d4bde0 rootopaque = 0x1874100 rootblkno = 3 rootlevel = 1 metad = 0x18534c8 __func__ = "_bt_getroot" #10 0x00506b0f in _bt_search (rel=0x74355be49850, keysz=1, scankey=0x7ffcd5d4c0d0, nextkey=false, bufP=0x7ffcd5d4cadc, access=1, snapshot=0xf79ba0 ) at nbtsearch.c:104 stack_in = 0x0 page_access = 1 #11 0x00508280 in _bt_first (scan=0x1874100, dir=ForwardScanDirection) at nbtsearch.c:1060 rel = 0x74355be49850 so = 0x187a030 buf = 0 stack = 0x7ffcd5d4cb20 offnum = 0 strat = 3 nextkey = false goback = false startKeys = {0x1874308, 0x10, 0x2, 0x0, 0x7c006b, 0x700077, 0x17e8c68, 0x17d6918, 0x17e85f0, 0xa5f2ceeaea997a00, 0x48, 0x74355b30bae0 , 0x20d8, 0x476c10 <_start>,
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On 08/11/2018 09:34 AM, Andres Freund wrote:
> Hi,
>
> On 2018-08-11 01:55:43 +0200, Tomas Vondra wrote:
>> On 08/10/2018 11:59 PM, Tomas Vondra wrote:
>>> ...
>>>
>>> I suspect there's some other ingredient, e.g. some manipulation with the
>>> subscription. Or maybe it's not needed at all and I'm just imagining things.
>>
>> Indeed, the manipulation with the subscription seems to be the key here.
>> I pretty reliably get the 'could not read block' error when doing this:
>>
>> 1) start the insert pgbench
>>
>>    pgbench -n -c 4 -T 300 -p 5433 -f insert.sql test
>>
>> 2) start the vacuum full pgbench
>>
>>    pgbench -n -f vacuum.sql -T 300 -p 5433 test
>
> Just to be clear: 5433 is the subscriber, right? I tried it with both,
> but it's not clear what you meant with either the previous or this
> email.

Sorry for the confusion, 5433 was the publisher; I reinitialized the cluster sometime after the initial report, and happened to swap the port numbers. That is, both pgbench commands run on the publisher, while the subscriber tries to sync the table (and fails due to duplicate values). I can reproduce it pretty reliably, although it may take a couple of sync tries from the subscriber.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: libpq should not look up all host addresses at once
* when the password is required, there is no way to know for which host/ip it is:

  sh> psql "host=127.0.0.17,local2.coelho.net,local.coelho.net"
  Password for user fabien:

In the same vein, on a wrong password:

  sh> psql "host=no-such-host,local2.coelho.net,local3.coelho.net"
  Password for user fabien:
  psql: could not translate host name "no-such-host" to address: Name or service not known
  could not connect to server: Connection refused
        Is the server running on host "local2.coelho.net" (127.0.0.2) and accepting TCP/IP connections on port 5432?
  could not connect to server: Connection refused
        Is the server running on host "local2.coelho.net" (127.0.0.3) and accepting TCP/IP connections on port 5432?
  FATAL: password authentication failed for user "fabien"

The FATAL error does not really say for which host/ip the password fails. Basically, with multiple hostnames and IPs per hostname, error messages need to be more precise.

--
Fabien.
Re: libpq should not look up all host addresses at once
Hello Tom,

[...]

> So I think what this code should do is (1) look up each hostname as it
> needs it, not all at once, and (2) proceed on to the next hostname if it
> gets a DNS lookup failure, not fail the whole connection attempt
> immediately. As attached.

A quick test, and a very quick glance at the code.

"git apply" gives "error: patch with only garbage at line 3". The patch applies with "patch -p1". It compiles, and a global "make check" passes, although I'm unsure whether the feature is actually tested somewhere. I think not :-( As you noted in another message, a small doc update would be needed.

The patch works as expected: I tried with failing DNS queries, and with multiple IPs of which some or all fail the connection attempt.

About the behavior from psql's point of view:

* if DNS works, error messages are only printed if all attempts failed:

  sh> ./psql "host=127.0.0.17,local2.coelho.net"
  psql: could not connect to server: Connection refused
        Is the server running on host "127.0.0.17" and accepting TCP/IP connections on port 5432?
  could not connect to server: Connection refused
        Is the server running on host "local2.coelho.net" (127.0.0.3) and accepting TCP/IP connections on port 5432?
  could not connect to server: Connection refused
        Is the server running on host "local2.coelho.net" (127.0.0.2) and accepting TCP/IP connections on port 5432?

But nothing shows if one attempt succeeds at some point. I understand that libpq is doing its job, but I'm wondering whether this is the best behavior. Maybe the user would like to know that attempts are being made and are failing? This would suggest some callback mechanism so that the client is informed of in-progress issues. Maybe this is for future work. Maybe psql should show these as warnings?

* when the password is required, there is no way to know for which host/ip it is:

  sh> psql "host=127.0.0.17,local2.coelho.net,local.coelho.net"
  Password for user fabien:

* once connected, \conninfo shows only partial information:

  psql> \conninfo
  You are connected to database "postgres" as user "fabien" on host "local3.coelho.net" at port "5432".

But local3 resolves to both 127.0.0.4 and 127.0.0.1; which one is it?

* in the code, atoi("5432+1") == 5432, so the port syntax check is loose, really.

I'd consider wrapping some of the logic: check the port first, then move the host resolution stuff into a function.

I would be fine with backpatching, as the current behavior is not somehow broken.

--
Fabien.
Re: BUG #15182: Canceling authentication due to timeout aka Denial of Service Attack
Hi Nathan,

On Fri, Jul 27, 2018 at 02:40:42PM +, Bossart, Nathan wrote:
> I think I'm essentially suggesting what you have in 0002 but without
> the new RangeVarGetRelidExtended() callback. I've attached a modified
> version of 0002 that seems to fix the originally reported issue. (I
> haven't looked into any extra handling needed for ANALYZE or
> partitioned tables.) Running the same checks for all VACUUMs would
> keep things simple and provide a more uniform user experience.

I have been looking at this patch for VACUUM (perhaps we ought to spawn a new thread, as the cases of REINDEX and TRUNCATE have been addressed), and I can see a problem with this approach. If multiple transactions are used with a list of relations to vacuum, then simply moving the ACL checks around would cause a new problem: if a relation being vacuumed disappears when processing it with vacuum_rel(), after the list of relations has been built, then we would get an ERROR instead of a skip, because pg_class_ownercheck() is not fail-safe.

Wouldn't it be enough to check just the ACLs in expand_vacuum_rel() and get_all_vacuum_rels()? The first is rather similar to what happens for TRUNCATE, and the second is similar to what has been fixed for VACUUM. We should also move the relkind checks out of vacuum_skip_rel(), as those checks are not completely consistent across all those code paths...

--
Michael
pgbench - doCustom cleanup
Hello pgdev,

This patch reworks & clarifies pgbench's internal state machine. It was indirectly triggered by Heikki's review of the pgbench TAP tests (https://commitfest.postgresql.org/19/1306/), and by Marina's patch about transaction retry on some errors (https://commitfest.postgresql.org/19/1645/), which I am reviewing.

- It adds more comments to the enum state definitions and to doCustom.

- There is some code cleanup/simplification:
  . a few useless intermediate variables are removed,
  . a macro is added to avoid a repeated pattern for setting the current time,
  . performance data are always collected, instead of trying to be clever and not collecting some data in some cases.

- More fundamentally, all state changes are now performed within doCustom; prior to that, one was performed by threadRun, which made understanding the end of a run harder. Now threadRun only looks at the current state to make decisions about a client.

- doCustom is made to always return at the end of a script, to avoid an infinite loop on a backslash-command-only script, instead of a hack with a variable to detect loops, which made it return only every two script runs in such cases.

- There is a small behavioral change: prior to the patch, a script would always run to its end once started, with the exception of \sleep commands, which could be interrupted by threadRun. Marina's patch should somehow enforce one script = one transaction so that a retry makes sense, so this exception is removed to avoid aborting a transaction implicitly.

--
Fabien.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c089..369e321196 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -267,14 +267,22 @@ typedef enum
 	 * transaction, and advance to CSTATE_THROTTLE. CSTATE_THROTTLE state
 	 * sleeps until that moment. (If throttling is not enabled, doCustom()
 	 * falls directly through from CSTATE_START_THROTTLE to CSTATE_START_TX.)
+	 *
+	 * It may also detect that the next transaction would start beyond the end
+	 * of run, and switch to CSTATE_FINISHED.
 	 */
 	CSTATE_START_THROTTLE,
 	CSTATE_THROTTLE,

 	/*
 	 * CSTATE_START_TX performs start-of-transaction processing. Establishes
-	 * a new connection for the transaction, in --connect mode, and records
-	 * the transaction start time.
+	 * a new connection for the transaction in --connect mode, records
+	 * the transaction start time, and proceed to the first command.
+	 *
+	 * Note: once a script is started, it will either error or run till
+	 * its end, where it may be interrupted. It is not interrupted while
+	 * running, so pgbench --time is to be understood as tx are allowed to
+	 * start in that time, and will finish when their work is completed.
 	 */
 	CSTATE_START_TX,

@@ -287,9 +295,6 @@ typedef enum
 	 * and we enter the CSTATE_SLEEP state to wait for it to expire. Other
 	 * meta-commands are executed immediately.
 	 *
-	 * CSTATE_SKIP_COMMAND for conditional branches which are not executed,
-	 * quickly skip commands that do not need any evaluation.
-	 *
 	 * CSTATE_WAIT_RESULT waits until we get a result set back from the server
 	 * for the current command.
 	 *
@@ -297,19 +302,25 @@ typedef enum
 	 *
 	 * CSTATE_END_COMMAND records the end-of-command timestamp, increments the
 	 * command counter, and loops back to CSTATE_START_COMMAND state.
+	 *
+	 * CSTATE_SKIP_COMMAND is used by conditional branches which are not
+	 * executed. It quickly skip commands that do not need any evaluation.
+	 * This state can move forward several commands, till there is something
+	 * to do or the end of the script.
 	 */
 	CSTATE_START_COMMAND,
-	CSTATE_SKIP_COMMAND,
 	CSTATE_WAIT_RESULT,
 	CSTATE_SLEEP,
 	CSTATE_END_COMMAND,
+	CSTATE_SKIP_COMMAND,

 	/*
-	 * CSTATE_END_TX performs end-of-transaction processing. Calculates
-	 * latency, and logs the transaction. In --connect mode, closes the
-	 * current connection. Chooses the next script to execute and starts over
-	 * in CSTATE_START_THROTTLE state, or enters CSTATE_FINISHED if we have no
-	 * more work to do.
+	 * CSTATE_END_TX performs end-of-transaction processing. It calculates
+	 * latency, and logs the transaction. In --connect mode, it closes the
+	 * current connection.
+	 *
+	 * Then either starts over in CSTATE_CHOOSE_SCRIPT, or enters CSTATE_FINISHED
+	 * if we have no more work to do.
 	 */
 	CSTATE_END_TX,

@@ -2679,16 +2690,13 @@ evaluateSleep(CState *st, int argc, char **argv, int *usecs)

 /*
  * Advance the state machine of a connection, if possible.
+ *
+ * All state changes are performed within this function
Re: logical decoding / rewrite map vs. maxAllocatedDescs
Hi,

On 2018-08-11 00:46:25 -0700, Andres Freund wrote:
> Below you can find the bt full showing a bunch of nested invalidations.
> Looking.

Hm. I wonder if the attached fixes the issue for you. If it's this, I don't understand why this doesn't occur on older branches... I've not yet been able to reliably reproduce the issue without the patch, so I'm not sure it means much that it didn't reoccur afterwards.

- Andres

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 1d43a165ad0..3b6d6047608 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3278,6 +3278,8 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
 			new_ent->combocid = ent->combocid;
 		}
 	}
+
+	CloseTransientFile(fd);
 }

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index f4374d077be..ae33302ecf1 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -767,7 +767,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
 	if (nbytes != BLCKSZ)
 	{
 		if (nbytes < 0)
-			ereport(ERROR,
+			ereport(PANIC,
 					(errcode_for_file_access(),
 					 errmsg("could not read block %u in file \"%s\": %m",
 							blocknum, FilePathName(v->mdfd_vfd))));
@@ -783,7 +783,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
 		if (zero_damaged_pages || InRecovery)
 			MemSet(buffer, 0, BLCKSZ);
 		else
-			ereport(ERROR,
+			ereport(PANIC,
 					(errcode(ERRCODE_DATA_CORRUPTED),
 					 errmsg("could not read block %u in file \"%s\": read only %d of %d bytes",
 							blocknum, FilePathName(v->mdfd_vfd),

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6125421d39a..8737f77a24c 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2206,7 +2206,8 @@ RelationReloadNailed(Relation relation)
 		relation->rd_isvalid = true;

 		pg_class_tuple = ScanPgRelation(RelationGetRelid(relation),
-										true, false);
+										RelationGetRelid(relation) != ClassOidIndexId,
+										false);
 		relp = (Form_pg_class) GETSTRUCT(pg_class_tuple);
 		memcpy(relation->rd_rel, relp, CLASS_TUPLE_SIZE);
 		heap_freetuple(pg_class_tuple);
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On 2018-08-11 00:34:15 -0700, Andres Freund wrote:
> I've run this numerous times now, and I haven't triggered it yet :/

Heh, I hit it literally seconds after hitting send:

2018-08-11 00:35:52.804 PDT [2196][9/14864] LOG: automatic analyze of table "postgres.pg_catalog.pg_depend" system usage: CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.15 s
2018-08-11 00:36:00.391 PDT [2196][9/14866] LOG: automatic analyze of table "postgres.public.t" system usage: CPU: user: 0.25 s, system: 0.02 s, elapsed: 7.56 s
2018-08-11 00:36:00.399 PDT [2171][4/200412] PANIC: could not read block 3 in file "base/12397/167946": read only 0 of 8192 bytes
2018-08-11 00:36:00.399 PDT [2171][4/200412] STATEMENT: insert into t (b,c) values (1,1);
2018-08-11 00:36:00.410 PDT [2196][9/14868] LOG: skipping vacuum of "pg_class" --- lock not available
2018-08-11 00:36:00.448 PDT [389][] LOG: server process (PID 2171) was terminated by signal 6: Aborted
2018-08-11 00:36:00.448 PDT [389][] DETAIL: Failed process was running: insert into t (b,c) values (1,1);
2018-08-11 00:36:00.448 PDT [389][] LOG: terminating any other active server processes

Below you can find the bt full showing a bunch of nested invalidations. Looking.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 set = {__val = {0, 14583431671641719254, 0, 0, 0, 0, 0, 0, 0, 0, 1296236544, 94518929497176, 1024, 0, 0, 0}} pid = tid = ret = #1 0x7f0de57f52f1 in __GI_abort () at abort.c:79 save_stage = 1 act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0, 0, 0, 17399330032, 94518929499360, 94518929499336, 94518929496896, 140720527947024, 94518900188034, 69, 94518929499360, 4870, 94518929496896, 94518929499360, 139697663207040, 139697663189664}}, sa_flags = -14227, sa_restorer = 0x0} sigs = {__val = {32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}} #2 0x55f6e5748e0b in errfinish (dummy=0) at /home/andres/src/postgresql/src/backend/utils/error/elog.c:557 edata = 0x55f6e5af6dc0 elevel = 22 oldcontext = 0x55f6e743c570 econtext = 0x0 __func__ = "errfinish" #3 0x55f6e55d2c0c in mdread (reln=0x55f6e7415de0, forknum=MAIN_FORKNUM, blocknum=3, buffer=0x7f0de4eb5700 "\001") at /home/andres/src/postgresql/src/backend/storage/smgr/md.c:786 seekpos = 24576 nbytes = 0 v = 0x55f6e73a1448 __func__ = "mdread" #4 0x55f6e55d5554 in smgrread (reln=0x55f6e7415de0, forknum=MAIN_FORKNUM, blocknum=3, buffer=0x7f0de4eb5700 "\001") at /home/andres/src/postgresql/src/backend/storage/smgr/smgr.c:628 No locals. 
#5 0x55f6e5593bea in ReadBuffer_common (smgr=0x55f6e7415de0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=3, mode=RBM_NORMAL, strategy=0x0, hit=0x7ffc0d14b7fb) at /home/andres/src/postgresql/src/backend/storage/buffer/bufmgr.c:890 io_start = {tv_sec = 140720527947552, tv_nsec = 94518898352892} io_time = {tv_sec = 72057594257389344, tv_nsec = 72197288386101248} bufHdr = 0x7f0de3eb76c0 bufBlock = 0x7f0de4eb5700 found = false isExtend = false isLocalBuf = false __func__ = "ReadBuffer_common" #6 0x55f6e55934fb in ReadBufferExtended (reln=0x7f0de627a850, forkNum=MAIN_FORKNUM, blockNum=3, mode=RBM_NORMAL, strategy=0x0) at /home/andres/src/postgresql/src/backend/storage/buffer/bufmgr.c:664 hit = false buf = 1980 __func__ = "ReadBufferExtended" #7 0x55f6e55933cf in ReadBuffer (reln=0x7f0de627a850, blockNum=3) at /home/andres/src/postgresql/src/backend/storage/buffer/bufmgr.c:596 No locals. #8 0x55f6e520a530 in _bt_getbuf (rel=0x7f0de627a850, blkno=3, access=1) at /home/andres/src/postgresql/src/backend/access/nbtree/nbtpage.c:736 buf = 22006 __func__ = "_bt_getbuf" #9 0x55f6e52091cf in _bt_getroot (rel=0x7f0de627a850, access=1) at /home/andres/src/postgresql/src/backend/access/nbtree/nbtpage.c:282 metabuf = 0 metapg = 0x55f6e5216574 <_bt_preprocess_keys+326> "\351\371\004" metaopaque = 0x7ffc0d14b9b0 rootbuf = 22006 rootpage = 0x7ffc0d14b954 "" rootopaque = 0x55f6e743c6b0 rootblkno = 3 rootlevel = 1 metad = 0x55f6e741da88 __func__ = "_bt_getroot" #10 0x55f6e5210ea0 in _bt_search (rel=0x7f0de627a850, keysz=1, scankey=0x7ffc0d14c2e0, nextkey=false, bufP=0x7ffc0d14cce4, access=1, snapshot=0x55f6e5aade80 ) at /home/andres/src/postgresql/src/backend/access/nbtree/nbtsearch.c:104 stack_in = 0x0 page_access = 1 #11 0x55f6e521263d in _bt_first (scan=0x55f6e743c6b0, dir=ForwardScanDirection) at /home/andres/src/postgresql/src/backend/access/nbtree/nbtsearch.c:1060 rel = 0x7f0de627a850 so = 0x55f6e74405d0 buf = 0 stack = 0x7ffc0d14cd20 offnum = 0 strat = 3 nextkey = false
Re: logical decoding / rewrite map vs. maxAllocatedDescs
Hi,

On 2018-08-11 01:55:43 +0200, Tomas Vondra wrote:
> On 08/10/2018 11:59 PM, Tomas Vondra wrote:
>> ...
>>
>> I suspect there's some other ingredient, e.g. some manipulation with the
>> subscription. Or maybe it's not needed at all and I'm just imagining things.
>
> Indeed, the manipulation with the subscription seems to be the key here.
> I pretty reliably get the 'could not read block' error when doing this:
>
> 1) start the insert pgbench
>
>    pgbench -n -c 4 -T 300 -p 5433 -f insert.sql test
>
> 2) start the vacuum full pgbench
>
>    pgbench -n -f vacuum.sql -T 300 -p 5433 test

Just to be clear: 5433 is the subscriber, right? I tried it with both, but it's not clear what you meant with either the previous or this email.

> 3) try to create a subscription, but with a small amount of conflicting
> data so that the sync fails like this:
>
>    LOG: logical replication table synchronization worker for subscription "s", table "t" has started
>    ERROR: duplicate key value violates unique constraint "t_pkey"
>    DETAIL: Key (a)=(5997542) already exists.
>    CONTEXT: COPY t, line 1
>    LOG: worker process: logical replication worker for subscription 16458 sync 16397 (PID 31983) exited with exit code 1
>
> 4) At this point the insert pgbench (at least some clients) should have
> failed with the error. If not, rinse and repeat.

I've run this numerous times now, and I haven't triggered it yet :/

Greetings,

Andres Freund
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On Sat, Aug 11, 2018 at 12:06 AM, Andres Freund wrote:
> To the point that I wonder if we shouldn't just change the ERROR into a
> PANIC on master (but not REL_11_STABLE), so the buildfarm gives us
> feedback. I don't think the problem can fundamentally be related to
> subscriptions, given the error occurs before any subscriptions are
> created in the schedule.

+1. I like that idea.

--
Peter Geoghegan
Re: logical decoding / rewrite map vs. maxAllocatedDescs
Hi,

On 2018-08-11 01:55:43 +0200, Tomas Vondra wrote:
> On 08/10/2018 11:59 PM, Tomas Vondra wrote:
>> ...
>>
>> I suspect there's some other ingredient, e.g. some manipulation with the
>> subscription. Or maybe it's not needed at all and I'm just imagining things.
>
> Indeed, the manipulation with the subscription seems to be the key here.
> I pretty reliably get the 'could not read block' error when doing this:
>
> 1) start the insert pgbench
>
>    pgbench -n -c 4 -T 300 -p 5433 -f insert.sql test
>
> 2) start the vacuum full pgbench
>
>    pgbench -n -f vacuum.sql -T 300 -p 5433 test
>
> 3) try to create a subscription, but with a small amount of conflicting
> data so that the sync fails like this:
>
>    LOG: logical replication table synchronization worker for subscription "s", table "t" has started
>    ERROR: duplicate key value violates unique constraint "t_pkey"
>    DETAIL: Key (a)=(5997542) already exists.
>    CONTEXT: COPY t, line 1
>    LOG: worker process: logical replication worker for subscription 16458 sync 16397 (PID 31983) exited with exit code 1
>
> 4) At this point the insert pgbench (at least some clients) should have
> failed with the error. If not, rinse and repeat.
>
> This kinda explains why I've been seeing the error only occasionally,
> because it only happened when I forgot to clean the table on the
> subscriber while recreating the subscription.

I'll try to reproduce this. If you're also looking, I suspect a good first hint would be to just change the ERROR into a PANIC and look at the backtrace from the generated core file.

To the point that I wonder if we shouldn't just change the ERROR into a PANIC on master (but not REL_11_STABLE), so the buildfarm gives us feedback. I don't think the problem can fundamentally be related to subscriptions, given the error occurs before any subscriptions are created in the schedule.

Greetings,

Andres Freund
Re: csv format for psql
2018-08-10 12:25 GMT+02:00 Daniel Verite:
> Pavel Stehule wrote:
>>> On the whole I'm inclined to resubmit the patch with
>>> fieldsep_csv and some minor changes based on the rest
>>> of the discussion.
>>
>> +1
>
> PFA an updated version.
>
> Usage from the command line:
>
>   $ psql --csv                      # or -P format=csv
>   $ psql --csv -P fieldsep_csv=";"  # for a non-comma csv separator
>
> From inside psql:
>
>   \pset format csv
>   \pset fieldsep_csv '\t'

A quick check: +1, I have no objections.

Regards

Pavel

> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite