Re: Proposal: SLRU to Buffer Cache
Hi Shawn,

On Wed, Aug 15, 2018 at 9:35 AM, Shawn Debnath wrote:
> At the Unconference in Ottawa this year, I pitched the idea of moving
> components off of SLRU and on to the buffer cache. The motivation
> behind the idea was threefold:
>
> * Improve performance by eliminating fixed-size caches and simplistic
>   scan and eviction algorithms.
> * Ensure durability and consistency by tracking LSNs and checksums
>   per block.
> * Consolidate caching strategies in the engine to simplify the
>   codebase and benefit from future buffer cache optimizations.

Thanks for working on this. These are good goals, and I've wondered about
doing exactly this myself for exactly those reasons. I'm sure we're not the
only ones, and I heard only positive reactions to your unconference pitch.
As you know, my undo log storage design interacts with the buffer manager
in the same way, so I'm interested in this subject and will be keen to
review and test what you come up with. That said, I'm fairly new here
myself, and there are people on this list with a decade or two more
experience hacking on the buffer manager and transam machinery.

> As the changes are quite invasive, I wanted to vet the approach with the
> community before digging in to implementation. The changes are strictly
> on the storage side and do not change the runtime behavior or protocols.
> Here's the current approach I am considering:
>
> 1. Implement a generic block storage manager that parameterizes
>    several options like segment sizes, fork and segment naming and
>    path schemes, concepts entrenched in md.c that are strongly tied to
>    relations. To mitigate risk, I am planning on not modifying md.c
>    for the time being.

+1 for doing it separately at first. I've also vacillated between
extending md.c and doing my own undo_file.c thing. It seems plausible that
between SLRU and undo we could at least share a common smgr
implementation, and eventually maybe md.c.
There are a few differences though, and the question is whether we'd want
to do yet another abstraction layer with callbacks/vtable/configuration
points to handle that parameterisation, or just use the existing
indirection in smgr and call it good. I'm keen to see what you come up
with. After we have a patch to refactor and generalise the fsync stuff
from md.c (about which more below), let's see what is left and whether we
can usefully combine some code.

> 2. Introduce a new smgr_truncate_extended() API to allow truncation of
>    a range of blocks starting at a specific offset, and option to
>    delete the file instead of simply truncating.

Hmm. In my undo proposal I'm currently implementing only the minimum smgr
interface required to make bufmgr.c happy (basically read and write
blocks), but I'm managing segment files (creating, deleting, recycling)
directly via a separate interface, UndoLogAllocate()/UndoLogDiscard(),
defined in undolog.c. That seemed necessary for me because that's where I
had machinery to track the meta-data (mostly head and tail pointers) for
each undo log explicitly, but I suppose I could use a wider smgr interface
as you are proposing to move the filesystem operations over there. Perhaps
I should reconsider that split. I look forward to seeing your code.

> 3. I will continue to use the RelFileNode/SMgrRelation constructs
>    through the SMgr API. I will reserve OIDs within the engine that we
>    can use as DB ID in RelFileNode to determine which storage manager
>    to associate for a specific SMgrRelation. To increase the
>    visibility of the OID mappings to the user, I would expose a new
>    catalog where the OIDs can be reserved and mapped to existing
>    components for template db generation. Internally, SMgr wouldn't
>    rely on catalogs, but instead will have them defined in code to not
>    block bootstrap. This scheme should be compatible with the undo log
>    storage work by Thomas Munro, et al. [0].

+1 for the pseudo-DB OID scheme, for now.
I think we can reconsider how we want to structure buffer tags in the
longer term as part of future projects that overhaul buffer mapping. We
shouldn't get hung up on that now.

I was wondering what the point of exposing the OIDs to users in a catalog
would be, though. It's not necessary to do that to reserve them (and even
if it were, pg_database would be the place): the OIDs we choose for undo,
clog, ... just have to be in the system reserved range to be safe from
collisions. I suppose one benefit would be the ability to join e.g.
pg_buffercache against it to get a human-readable name like "clog", but
that'd be slightly odd because the DB OID field would refer to entries in
pg_database or pg_storage_manager depending on the number range.

> 4. For each component that will be transitioned over to the generic
>    block storage, I will introduce a page header at the beginning of
>    the block and re-work the associated offset ca
Add a semicolon to query related to search_path
Hi,

I found some possible improvements in the client applications in
src/bin/scripts while resuming development of the progress monitor for
the CLUSTER command. The attached patch adds a trailing semicolon to the
following queries for readability:

  s/SELECT pg_catalog.set_config('search_path', '', false)/
    SELECT pg_catalog.set_config('search_path', '', false);/
  s/RESET search_path/RESET search_path;/

For example, the client application vacuumdb's results using the patch
are as follows:

# Not patched #
$ vacuumdb -e -Zt 'pg_am(amname)'
SELECT pg_catalog.set_config('search_path', '', false)
vacuumdb: vacuuming database "postgres"
RESET search_path
SELECT c.relname, ns.nspname FROM pg_catalog.pg_class c,
pg_catalog.pg_namespace ns WHERE c.relnamespace OPERATOR(pg_catalog.=)
ns.oid AND c.oid OPERATOR(pg_catalog.=) 'pg_am'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false)
ANALYZE pg_catalog.pg_am (amname);

# Patched #
$ vacuumdb -e -Zt 'pg_am(amname)'
SELECT pg_catalog.set_config('search_path', '', false);
vacuumdb: vacuuming database "postgres"
RESET search_path;
SELECT c.relname, ns.nspname FROM pg_catalog.pg_class c,
pg_catalog.pg_namespace ns WHERE c.relnamespace OPERATOR(pg_catalog.=)
ns.oid AND c.oid OPERATOR(pg_catalog.=) 'pg_am'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
ANALYZE pg_catalog.pg_am (amname);

I tested "make check-world" and "make installcheck-world" on 777e6ddf1 and
both pass.
Regards,
Tatsuro Yamada
NTT Open Source Software Center

diff --git a/src/bin/scripts/common.c b/src/bin/scripts/common.c
index db2b9f0d68..a80089ccde 100644
--- a/src/bin/scripts/common.c
+++ b/src/bin/scripts/common.c
@@ -335,7 +335,7 @@ appendQualifiedRelation(PQExpBuffer buf, const char *spec,
 	appendStringLiteralConn(&sql, table, conn);
 	appendPQExpBufferStr(&sql, "::pg_catalog.regclass;");
 
-	executeCommand(conn, "RESET search_path", progname, echo);
+	executeCommand(conn, "RESET search_path;", progname, echo);
 
 	/*
 	 * One row is a typical result, as is a nonexistent relation ERROR.
diff --git a/src/include/fe_utils/connect.h b/src/include/fe_utils/connect.h
index fa293d2458..d62f5a3724 100644
--- a/src/include/fe_utils/connect.h
+++ b/src/include/fe_utils/connect.h
@@ -23,6 +23,6 @@
  * might work with the old server, skip this.
  */
 #define ALWAYS_SECURE_SEARCH_PATH_SQL \
-	"SELECT pg_catalog.set_config('search_path', '', false)"
+	"SELECT pg_catalog.set_config('search_path', '', false);"
 
 #endif							/* CONNECT_H */
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
On 2018/08/15 12:25, Etsuro Fujita wrote:
> (2018/08/15 0:51), Robert Haas wrote:
>> On Mon, Aug 13, 2018 at 12:32 PM, Etsuro Fujita wrote:
>>> One thing I noticed that might be an improvement is to skip
>>> build_joinrel_partition_info if the given joinrel will have
>>> consider_partitionwise_join=false; in the previous patch, that
>>> function created the joinrel's partition info such as part_scheme and
>>> part_rels if the joinrel is considered as partitioned, independently
>>> of the flag consider_partitionwise_join for it, but if that flag is
>>> false, we don't generate PWJ paths for the joinrel, so we would not
>>> need to create that partition info at all. This would not only avoid
>>> unnecessary processing in that function, but also make unnecessary
>>> the changes I made to try_partitionwise_join,
>>> generate_partitionwise_join_paths, apply_scanjoin_target_to_paths,
>>> and create_ordinary_grouping_paths. So I updated the patch that way.
>>> Please find attached an updated version of the patch.
>>
>> I guess the question is whether there are (or might be in the future)
>> other dependencies on part_scheme. For example, it looks like
>> partition pruning uses it. I'm not sure whether partition pruning
>> supports a plan like:
>>
>> Append
>>   -> Nested Loop
>>     -> Seq Scan on p1
>>     -> Index Scan on q1
>
> I'm not sure about that either, but if a join relation doesn't have
> part_scheme set, it means that that relation is considered as
> non-partitioned, as in the case when enable_partitionwise_join is off,
> so there would be no PWJ paths generated for it to begin with. So in
> that case, ISTM that we don't need to worry about that, at least for
> partition pruning.

FWIW, partition pruning works only for base rels, which applies to both
planning-time pruning (pruning is performed only during base rel size
estimation) and run-time pruning (we'll add pruning info to the Append
plan only if the source AppendPath's parent rel is a base rel).

Thanks,
Amit
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
(2018/08/15 0:51), Robert Haas wrote:
> On Mon, Aug 13, 2018 at 12:32 PM, Etsuro Fujita wrote:
>> One thing I noticed that might be an improvement is to skip
>> build_joinrel_partition_info if the given joinrel will have
>> consider_partitionwise_join=false; in the previous patch, that function
>> created the joinrel's partition info such as part_scheme and part_rels
>> if the joinrel is considered as partitioned, independently of the flag
>> consider_partitionwise_join for it, but if that flag is false, we don't
>> generate PWJ paths for the joinrel, so we would not need to create that
>> partition info at all. This would not only avoid unnecessary processing
>> in that function, but also make unnecessary the changes I made to
>> try_partitionwise_join, generate_partitionwise_join_paths,
>> apply_scanjoin_target_to_paths, and create_ordinary_grouping_paths. So
>> I updated the patch that way. Please find attached an updated version
>> of the patch.
>
> I guess the question is whether there are (or might be in the future)
> other dependencies on part_scheme. For example, it looks like partition
> pruning uses it. I'm not sure whether partition pruning supports a plan
> like:
>
> Append
>   -> Nested Loop
>     -> Seq Scan on p1
>     -> Index Scan on q1

I'm not sure about that either, but if a join relation doesn't have
part_scheme set, it means that that relation is considered as
non-partitioned, as in the case when enable_partitionwise_join is off, so
there would be no PWJ paths generated for it to begin with. So in that
case, ISTM that we don't need to worry about that, at least for partition
pruning.

Best regards,
Etsuro Fujita
Re: Facility for detecting insecure object naming
On Tue, Aug 14, 2018 at 04:42:43PM -0400, Bruce Momjian wrote:
> On Tue, Aug 14, 2018 at 04:23:33PM -0400, Robert Haas wrote:
>> On Tue, Aug 14, 2018 at 3:31 PM, Bruce Momjian wrote:
>>> I am unclear how lexical scoping helps with this, or with Postgres.
>>
>> Well, if you say sqrt(x) in C, and you don't have sqrt or x defined in
>> the current function, you know that you're going to call a global
>> function called sqrt and pass to it a global variable called x.
>> You're not going to latch onto a local variable called x in the
>> function that called your function, and if the calling function has a
>> local variable called sqrt, that doesn't matter either. The linker
>> can point every call to sqrt() in the program at a different place
>> than expected, but you can't monkey with the behavior of an individual
>> call by having the calling function declare identifiers that mask the
>> ones the function intended to reference.
>
> What we are doing is more like C++ virtual functions, where the class
> calling it can replace function calls in subfunctions:
>
> https://www.geeksforgeeks.org/virtual-function-cpp/
>
>> On the other hand, when you call an SQL-language function, or even to
>> some extent a plpgsql-language function, from PostgreSQL, you can in
>> fact change the meanings of every identifier that appears in that
>> function body unless those references are all schema-qualified or the
>> function sets the search path. If an SQL-language function calls
>
> I think we decided that search path alone for functions is insufficient
> because of data type matching, unless the schema is secure.

Right. For what it's worth, the example I permuted upthread might look
like this in a lexical search path world:

-- Always secure, even if schema usage does not conform to ddl-schemas-patterns
-- and function trust is disabled or unavailable.
--
-- At CREATE EXTENSION time only, subject to denial of service from anyone able
-- to CREATE in cube schema or earthdistance schema.
--
-- Objects in @cube_schema@ are qualified so objects existing in @extschema@ at
-- CREATE EXTENSION time cannot mask them.
CREATE FUNCTION latitude(earth)
RETURNS float8
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE
AS $$SELECT CASE
  WHEN @cube_schema@.cube_ll_coord($1::@cube_schema@.cube, 3) / earth() < -1
    THEN -90::float8
  WHEN @cube_schema@.cube_ll_coord($1::@cube_schema@.cube, 3) / earth() > 1
    THEN 90::float8
  ELSE degrees(asin(@cube_schema@.cube_ll_coord($1::@cube_schema@.cube, 3) / earth()))
END$$;
Re: "WIP: Data at rest encryption" patch and, PostgreSQL 11-beta3
Hi.

Even if you apply "data-at-rest-encryption-wip-2018.07.25.patch" to the
master branch, the same patch errors occur. (The patch error in pg_rewind
was eliminated.)

patching file src/backend/replication/logical/reorderbuffer.c
Hunk #9 FAILED at 2464.

patching file src/bin/pg_upgrade/controldata.c
Hunk #1 FAILED at 58.

(See the attached file for the entire patch log.)

Antonin Houska wrote:
> Toshi Harada wrote:
>> Hi.
>>
>> If "data-at-rest-encryption-wip-2018.07.25.patch" is applied to
>> PostgreSQL 11-beta3 released last week, patch execution will fail as
>> follows.
>>
>> patching file src/backend/replication/logical/reorderbuffer.c
>> Hunk #9 FAILED at 2464.
>>
>> 1 out of 7 hunks FAILED -- saving rejects to file
>> src/bin/pg_rewind/pg_rewind.c.rej
>>
>> 1 out of 6 hunks FAILED -- saving rejects to file
>> src/bin/pg_upgrade/controldata.c.rej
>
> The patch should be applicable to the master branch.
>
> --
> Antonin Houska
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26, A-2700 Wiener Neustadt
> Web: https://www.cybertec-postgresql.com
Re: Get funcid when create function
Yes, I had read this document, but that describes calling it from
PL/pgSQL. I want to call it from a C function.

2018-08-13 18:57 GMT+08:00 Robert Haas:
> On Fri, Aug 10, 2018 at 5:50 AM, 王翔宇 wrote:
>> I'm developing an extension for PG. Now I have created an event trigger
>> on ddl_command_end, and this function will be called after I enter a
>> CREATE FUNCTION statement. In this function I can only get the parse
>> tree. In the PG source code, I found a function named
>> "pg_event_trigger_ddl_commands" that seems to provide cmds which
>> include the funcid. But I didn't find any example of calling this
>> function. Who can help?
>
> Maybe it would help to read the documentation on that function:
>
> https://www.postgresql.org/docs/current/static/functions-event-triggers.html
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
Re: Postgres, fsync, and OSs (specifically linux)
On Wed, Aug 15, 2018 at 11:08 AM, Asim R P wrote:
> I was looking at the commitfest entry for this feature
> (https://commitfest.postgresql.org/19/1639/) for the most recent list
> of patches to try out. The list doesn't look correct/complete. Can
> someone please check?

Hi Asim,

This thread is a bit tangled up. There are two related patchsets in it:

1. Craig Ringer's PANIC-on-EIO patch set, to cope with the fact that
   Linux throws away buffers and errors after reporting an error, so the
   checkpointer shouldn't retry as it does today. The latest is here:

   https://www.postgresql.org/message-id/CAMsr%2BYFPeKVaQ57PwHqmRNjPCPABsdbV%3DL85he2dVBcr6yS1mA%40mail.gmail.com

2. Andres Freund's fd-sending fsync queue, to cope with the fact that
   some versions of Linux only report writeback errors that occurred
   after you opened the file, and all versions of Linux and some other
   operating systems might forget about writeback errors while no one
   has the file open. Here is the original patchset:

   https://www.postgresql.org/message-id/20180522010823.z5bdq7wnlsna5qoo%40alap3.anarazel.de

   Here is a fix-up you need:

   https://www.postgresql.org/message-id/20180522185951.5sdudzl46spktyyz%40alap3.anarazel.de

   Here are some more fix-up patches that I propose:

   https://www.postgresql.org/message-id/CAEepm%3D2WSPP03-20XHpxohSd2UyG_dvw5zWS1v7Eas8Rd%3D5e4A%40mail.gmail.com

I will soon post some more fix-up patches that add EXEC_BACKEND support,
Windows support, and a counting scheme to fix the timing issue that I
mentioned in my first review. I will probably squash it all down to a
tidy patch set after that.

--
Thomas Munro
http://www.enterprisedb.com
C99 compliance for src/port/snprintf.c
I noticed that src/port/snprintf.c still follows the old (pre-C99)
semantics for what to return from snprintf() after buffer overrun. This
seems bad for a couple of reasons:

* It results in potential performance loss in psnprintf and
  appendPQExpBufferVA, since those have to fly blind about how much to
  enlarge the buffer before retrying.

* Given that we override the system-supplied *printf if it doesn't
  support the 'z' width modifier, it seems highly unlikely that we are
  still using any snprintf's with pre-C99 semantics, except when this
  code is used. So there's not much point in keeping this behavior as a
  test case to ensure compatibility with such libraries.

* When we install snprintf.c, port.h forces it to be used everywhere
  that includes c.h, including extensions. It seems quite possible that
  there is extension code out there that assumes C99 snprintf behavior.
  Such code would work today everywhere except Windows, since that's the
  only platform made in this century that requires snprintf.c. Between
  the existing Windows porting hazard and our nearby discussion about
  using snprintf.c on more platforms, I'd say this is a gotcha waiting
  to bite somebody.

Hence, PFA a patch that changes snprintf.c to follow C99 by counting
dropped bytes in the result of snprintf(), including in the case where
the passed count is zero. (I also dropped the tests for str == NULL when
count isn't zero; that's not permitted by either C99 or SUSv2, so I see
no reason for this code to support it. Also, avoid wasting one byte in
the local buffer for *fprintf.)

I'm almost tempted to think that the reasons above make this a
back-patchable bug fix. Comments?

			regards, tom lane

diff --git a/src/port/snprintf.c b/src/port/snprintf.c
index a184134..851e2ae 100644
*** a/src/port/snprintf.c
--- b/src/port/snprintf.c
***************
*** 2,7 ****
--- 2,8 ----
  * Copyright (c) 1983, 1995, 1996 Eric P. Allman
  * Copyright (c) 1988, 1993
  *	The Regents of the University of California.  All rights reserved.
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
***************
*** 49,56 ****
  * SNPRINTF, VSNPRINTF and friends
  *
  * These versions have been grabbed off the net.  They have been
! * cleaned up to compile properly and support for most of the Single Unix
! * Specification has been added.  Remaining unimplemented features are:
  *
  * 1. No locale support: the radix character is always '.' and the '
  * (single quote) format flag is ignored.
--- 50,57 ----
  * SNPRINTF, VSNPRINTF and friends
  *
  * These versions have been grabbed off the net.  They have been
! * cleaned up to compile properly and support for most of the C99
! * specification has been added.  Remaining unimplemented features are:
  *
  * 1. No locale support: the radix character is always '.' and the '
  * (single quote) format flag is ignored.
***************
*** 64,88 ****
  * 5. Space and '#' flags are not implemented.
  *
  *
! * The result values of these functions are not the same across different
! * platforms.  This implementation is compatible with the Single Unix Spec:
  *
! * 1. -1 is returned only if processing is abandoned due to an invalid
! * parameter, such as incorrect format string.  (Although not required by
! * the spec, this happens only when no characters have yet been transmitted
! * to the destination.)
  *
! * 2. For snprintf and sprintf, 0 is returned if str == NULL or count == 0;
! * no data has been stored.
  *
! * 3. Otherwise, the number of bytes actually transmitted to the destination
! * is returned (excluding the trailing '\0' for snprintf and sprintf).
  *
! * For snprintf with nonzero count, the result cannot be more than count-1
! * (a trailing '\0' is always stored); it is not possible to distinguish
! * buffer overrun from exact fit.  This is unlike some implementations that
! * return the number of bytes that would have been needed for the complete
! * result string.
  */

/**
--- 65,88 ----
  * 5. Space and '#' flags are not implemented.
  *
  *
! * Historically the result values of sprintf/snprintf varied across platforms.
! * This implementation now follows the C99 standard:
  *
! * 1. -1 is returned if an error is detected in the format string, or if
! * a write to the target stream fails (as reported by fwrite).  Note that
! * overrunning snprintf's target buffer is *not* an error.
  *
! * 2. For successful writes to streams, the actual number of bytes written
! * to the stream is returned.
  *
! * 3. For successful sprintf/snprintf, the number of bytes that would have
! * been written to an infinite-size buffer (excluding the trailing '\0')
! * is returned.  snprintf will truncate its output to fi
Re: Facility for detecting insecure object naming
On 08/14/2018 07:01 PM, Nico Williams wrote:
> On Tue, Aug 14, 2018 at 03:00:55PM +, Robert Haas wrote:
>> The more I think about it, the more I think having a way to set a
>> lexically-scoped search path is probably the answer. [...]
>
> Yes please!
>
> This is what I want. Evaluate the search_path at function definition
> time, and record code with fully-qualified symbols in the catalog.

What would the interface to that look like for out-of-tree PL maintainers?

-Chap
Re: Tid scan improvements
On 12 August 2018 at 20:07, David Rowley wrote:
>> Since range scan execution is rather different from the existing
>> TidScan execution, I ended up making a new plan type, TidRangeScan.
>> There is still only one TidPath, but it has an additional member that
>> describes which method to use.
>
> I always thought that this would be implemented by overloading TidScan.
> I thought that TidListEval() could be modified to remove duplicates
> accounting for range scans. For example:
>
> SELECT * FROM t WHERE ctid BETWEEN '(0,1)' AND '(0,10)' OR ctid
> IN('(0,5)','(0,30)');
>
> would first sort all the tids along with their operator and then make
> a pass over the sorted array to remove any equality ctids that are
> redundant because they're covered in a range.

Initially, I figured that 99% of the time, the user either wants to
filter by a specific set of ctids (such as those returned by a subquery),
or wants to process a table in blocks. Picking up an OR-set of
ctid-conditions and determining which parts should be picked up row-wise
(as in the existing code) versus which parts are blocks that should be
scanned -- and ensuring that any overlaps were removed -- seemed more
complicated than it was worth. Having thought about it, I think what you
propose might be worth it; at least it limits us to a single TidScan plan
to maintain.

The existing code:

- Looks for a qual that's an OR-list of (ctid = ?) or (ctid IN (?)).
- Costs it by assuming each matching tuple is a separate page.
- When beginning the scan, evaluates all the ?s and builds an array of
  tids to fetch.
- Sorts and removes duplicates.
- Iterates over the array, fetching tuples.

So we'd extend that to:

- Include in the OR-list "range" subquals of the form (ctid > ? AND
  ctid < ?) (either side could be optional, and we have to deal with >=
  and <= and having ctid on the rhs, etc.).
- Cost the range subquals by assuming they don't overlap, and estimating
  how many blocks and tuples they span.
- When beginning the scan, evaluate all the ?s and build an array of
  "tid ranges" to fetch. A tid range is a struct with a starting tid and
  an ending tid, and might just be a single tid item.
- Sort and remove duplicates.
- Iterate over the array, using a single fetch for single-item tid
  ranges, and starting/ending a heap scan for multi-item tid ranges.

I think I'll try implementing this.

>> As part of the work I also taught TidScan that its results are ordered
>> by ctid, i.e. to set a pathkey on a TidPath. The benefit of this is
>> that queries such as
>>
>> SELECT MAX(ctid) FROM t;
>> SELECT * FROM t WHERE ctid IN (...) ORDER BY ctid;
>
> I think that can be done as I see you're passing allow_sync as false
> in heap_beginscan_strat(), so the scan will start at the beginning of
> the heap.

I found that heap scan caters to parallel scans, synchronised scans, and
block range indexing, but it didn't quite work for my case of specifying
a subset of a table and scanning backward or forward over it. Hence my
changes. I'm not overly familiar with the heap scan code, though.

>> - Is there a less brittle way to create tables of a specific number
>>   of blocks/tuples in the regression tests?
>
> Perhaps you could just populate a table with some number of records
> then DELETE the ones above ctid (x,100) on each page, where 100 is
> whatever you can be certain will fit on a page on any platform. I'm
> not quite sure if our regress test would pass with a very small block
> size anyway, but probably worth verifying that before you write the
> first test that will break it.

I don't think I've tested with extreme block sizes.

> I'll try to look in a bit more detail during the commitfest.
>
> It's perhaps a minor detail at this stage, but generally, we don't
> have code lines over 80 chars in length. There are some exceptions,
> e.g. not breaking error message strings so that they're easily
> greppable.
> src/tools/pgindent has a tool that you can run to fix the whitespace
> so it's in line with project standard.

I'll try to get pgindent running before my next patch. Thanks for the
comments!
Re: Postgres, fsync, and OSs (specifically linux)
I was looking at the commitfest entry for this feature
(https://commitfest.postgresql.org/19/1639/) for the most recent list of
patches to try out. The list doesn't look correct/complete. Can someone
please check?

Asim
Re: Facility for detecting insecure object naming
On Tue, Aug 14, 2018 at 03:00:55PM +, Robert Haas wrote:
> The more I think about it, the more I think having a way to set a
> lexically-scoped search path is probably the answer. [...]

Yes please!

This is what I want. Evaluate the search_path at function definition
time, and record code with fully-qualified symbols in the catalog.

Nico
--
Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
On Tue, Aug 14, 2018 at 2:07 PM, Todd A. Cook wrote:
> Sorry, I just noticed this. Mantid is my animal, so I can set
> "min_parallel_table_scan_size = 0" on it if that would be helpful.
> (Please reply directly if so; I'm not able to keep up with
> pgsql-hackers right now.)

We've already been able to rule that out as a factor here. Thanks all the
same, though.

--
Peter Geoghegan
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On 8/14/18 10:05 AM, Tomas Vondra wrote:
> ...
> I take that back - I can reproduce the crashes, both with and without
> the patch, all the way back to 9.6.

Attached is a bunch of backtraces from various versions. There's a bit of
variability depending on which pgbench script gets started first
(insert.sql or vacuum.sql) - in one case (when vacuum is started before
insert) the crash happens in InitPostgres/RelationCacheInitializePhase3,
otherwise it happens in exec_simple_query.

Another observation is that the failing COPY is not necessary; I can
reproduce the crashes without it (so even with wal_level=replica).

BTW I have managed to reproduce this on an old test server. If anyone
wants to take a look, I can arrange ssh access to that machine (for
trustworthy developers). Let me know.

regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
On 8/9/18, 12:56 AM, "Peter Geoghegan" wrote:
> On Wed, Aug 8, 2018 at 7:40 PM, Tom Lane wrote:
>>> Anyway, "VACUUM FULL pg_class" should be expected to corrupt
>>> pg_class_oid_index when we happen to get a parallel build, since
>>> pg_class is a mapped relation, and I've identified that as a problem
>>> for parallel CREATE INDEX [2]. If that was the ultimate cause of the
>>> issue, it would explain why only REL_11_STABLE and master are
>>> involved.
>>
>> Oooh ... but pg_class wouldn't be big enough to get a parallel
>> index rebuild during that test, would it?
>
> Typically not, but I don't think that we can rule it out right away.
> "min_parallel_table_scan_size = 0" will make it happen, and that
> happens in quite a few regression tests. Though that doesn't include
> vacuum.sql.
>
> The animals mandrill, mantid and lapwing are all
> "force_parallel_mode=regress", according to their notes. That actually
> isn't enough to force parallel CREATE INDEX, though I tend to think it
> should be. I don't see anything interesting in their "extra_config",
> but perhaps "min_parallel_table_scan_size = 0" got in some other way.
> I don't know all that much about the buildfarm client code, and it's
> late.

Sorry, I just noticed this. Mantid is my animal, so I can set
"min_parallel_table_scan_size = 0" on it if that would be helpful.
(Please reply directly if so; I'm not able to keep up with pgsql-hackers
right now.)

-- todd
Re: Facility for detecting insecure object naming
> On Aug 14, 2018, at 8:00 AM, Robert Haas wrote:
>
> On Wed, Aug 8, 2018 at 1:58 PM, Tom Lane wrote:
>> I'm not sold on #2 either. That path leads to, for example,
>> s/=/OPERATOR(pg_catalog.=)/g everywhere, which is utterly catastrophic
>> to both readability and portability of your SQL code. We *must* find
>> a way to do better, not tell people that's what to do.
>>
>> When the security team was discussing this issue before, we speculated
>> about ideas like inventing a function trust mechanism, so that attacks
>> based on search path manipulations would fail even if they managed to
>> capture an operator reference. I'd rather go down that path than
>> encourage people to do more schema qualification.
>
> The more I think about it, the more I think having a way to set a
> lexically-scoped search path is probably the answer. Many years ago,
> Perl had only "local" as a way of declaring local variables, which
> used dynamic scoping, and then they invented "my", which uses lexical
> scoping, and everybody abandoned "local" and started using "my",
> because otherwise a reference to a variable inside a procedure may
> happen to refer to a local variable further up the call stack rather
> than a global variable if the names happen to collide. It turns out
> that not knowing to which object a reference will refer is awful, and
> you should avoid it like the plague. This is exactly the same problem
> we have with search_path. People define functions -- or index
> expressions -- assuming that the function and operator references will
> refer to the same thing at execution time that they do at definition
> time.
>
> That is, I think, an entirely *reasonable* expectation. Basically all
> programming languages work that way.
If you could call a C function > in such a way that the meaning of identifiers that appeared there was > dependent on some piece of state inherited from the caller's context > rather than from the declarations visible at the time that the C > function was compiled, it would be utter and complete chaos. It would > in fact greatly resemble the present state of affairs in PostgreSQL, > where making your code reliably mean what you intended requires either > forcing the search path everywhere or schema-qualifying the living > daylights out of everything. I think we should try to follow Perl's > example, acknowledge that we basically got this wrong on the first > try, and invent something new that works better. I think you are interpreting the problem too broadly. This is basically just a privilege escalation attack vector. If there is a function that gets run under different privileges than those of the caller, and if the caller can coerce the function to do something with those elevated privileges that the function author did not intend, then the caller can do mischief. But if the caller is executing a function that operates under the caller's own permissions, then there is no mischief possible, since the caller can't trick the function to do anything that the caller couldn't do herself directly. For example, if there is a function that takes type anyarray and then performs operations over the elements of that array using operator=, there is no problem with me defining an operator= prior to calling that function and having my operator be the one that gets used, assuming the function doesn't escalate privileges. To my mind, the search_path should get set whenever SetUserIdAndSecContext or similar gets called and restored when SetUserIdAndSecContext gets called to restore the saved uid. Currently coded stuff like: GetUserIdAndSecContext(&saved_uid, &save_sec_context); ... 
if (saved_uid != authid_uid) SetUserIdAndSecContext(authid_uid, save_sec_context | SECURITY_LOCAL_USERID_CHANGE); ... SetUserIdAndSecContext(saved_uid, save_sec_context); need to have a third variable, saved_search_path or similar. Thoughts? mark
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
> On Aug 14, 2018, at 11:51 AM, Robert Haas wrote: > > On Mon, Aug 13, 2018 at 12:32 PM, Etsuro Fujita > wrote: >> One thing I noticed might be an improvement is to skip >> build_joinrel_partition_info if the given joinrel will be to have >> consider_partitionwise_join=false; in the previous patch, that function >> created the joinrel's partition info such as part_scheme and part_rels if >> the joinrel is considered as partitioned, independently of the flag >> consider_partitionwise_join for it, but if that flag is false, we don't >> generate PWJ paths for the joinrel, so we would not need to create that >> partition info at all. This would not only avoid unnecessary processing in >> that function, but also make unnecessary the changes I made to >> try_partitionwise_join, generate_partitionwise_join_paths, >> apply_scanjoin_target_to_paths, and create_ordinary_grouping_paths. So I >> updated the patch that way. Please find attached an updated version of the >> patch. > > I guess the question is whether there are (or might be in the future) > other dependencies on part_scheme. For example, it looks like > partition pruning uses it. I'm not sure whether partition pruning > supports a plan like: > > Append > -> Nested Loop > -> Seq Scan on p1 > -> Index Scan on q1 > > > If it doesn't, that's just an implementation restriction; somebody > might want to fix things so it works, if it doesn't already. While we (the RMT) are happy to see there has been progress on this thread, 11 Beta 3 was released containing this problem and the time to adequately test this feature prior to GA release is rapidly dwindling. The RMT would like to see this prioritized and resolved. At this point we have considered the option of having partitionwise joins completely disabled in PostgreSQL 11. This is not the path we would prefer to choose, but this option is now on the table and we will issue an ultimatum soon if we do not see further progress. 
In the previous discussion, the generally accepted solution for PostgreSQL 11 seemed to be to disable the problematic case and any further work would be for PostgreSQL 12 only. If something is different, please indicate why ASAP and work to resolve. The RMT will also conduct some additional technical analysis as well. Sincerely, Alvaro, Andres, Jonathan signature.asc Description: Message signed with OpenPGP
Re: libpq should not look up all host addresses at once
Hello Garick, I read the rationale of the host/hostaddr artificial mapping. I cannot say I'm thrilled with the result: I do not really see a setting where avoiding a DNS query is required but which still needs a hostname for auth... If you have GSSAPI or SSPI then you have an underlying network, in which a dns query should be fine. FWIW, I think this is useful even if it will be uncommon to use. I run some HA services here and I find I use this kind of functionality all the time to test if a standby node is functioning properly. Ok, I understand that you want to locally override DNS results for testing purposes. Interesting use case. Anyway, if it's not a big burden, I suggest you keep it, IIUC. I would not suggest to remove the feature, esp if there is a reasonable use case. This kind of thing is really handy especially since today's cloudy-stuff means one often gets all-the-nat whether one wants it or not. Yep, NAT, another possible use case. So it is a useful feature for specialized use cases! I just lack imagination:-) Thanks for the explanations. Maybe I'll try to just improve the documentation. -- Fabien.
Re: [HACKERS] pgbench - allow to store select results into variables
On 2018-Aug-14, Fabien COELHO wrote: > > (At least, that's true with my preferred web browser, maybe it's > > different for other people?) > > So if I send with text/x-diff or text/plain I've got complaints, if I send > with application/octet-stream, it is not right either:-) Everybody being > somehow right. I like that I can look at the text/* ones directly in the browser instead of having to download, but I can handle whatever (and I expect the same for most people, except maybe those who work directly on Windows). I just wish people would not send tarballs, which aren't as comfy to page through with "zcat | cdiff" ... -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] pgbench - allow to store select results into variables
I have no doubt that some MUA around would forget to do the conversion. FWIW, one reason that I invariably use patch(1) to apply submitted patches is that it will take care of stripping any CRs that may have snuck in. So I'm not particularly fussed about the problem. Good to know. I'm not excited about encouraging people to use application/octet-stream rather than text/something for submitted patches. I'm not happy with that either, it is just to avoid complaints. If you use text then the patch can easily be examined in the web archives; with application/octet-stream the patch has to be downloaded, adding a lot of manual overhead. Indeed. (At least, that's true with my preferred web browser, maybe it's different for other people?) So if I send with text/x-diff or text/plain I've got complaints, if I send with application/octet-stream, it is not right either:-) Everybody being somehow right. Sigh. -- Fabien.
Re: Pre-v11 appearances of the word "procedure" in v11 docs
On Tue, Aug 14, 2018 at 1:39 PM, Tom Lane wrote: > We're kinda stuck with that. We could add FUNCTION as a preferred > synonym, perhaps, but I doubt that'd really be worth the trouble. Seems sufficient to note in the relevant docs that it's a legacy thing. -- Peter Geoghegan
Re: Facility for detecting insecure object naming
On Tue, Aug 14, 2018 at 04:23:33PM -0400, Robert Haas wrote: > On Tue, Aug 14, 2018 at 3:31 PM, Bruce Momjian wrote: > > Well, in C, if the calling function is not in the current C file, you > > really don't know what function you are linking to --- it depends on > > what other object files are linked into the executable. > > True. > > > I am unclear how lexical scoping helps with this, or with Postgres. > > Well, if you say sqrt(x) in C, and you don't have sqrt or x defined in > the current function, you know that you're going to call a global > function called sqrt and pass to it a global variable called x. > You're not going to latch onto a local variable called x in the > function that called your function, and if the calling function has a > local variable called sqrt, that doesn't matter either. The linker > can point every call to sqrt() in the program at a different place > than expected, but you can't monkey with the behavior of an individual > call by having the calling function declare identifiers that mask the > ones the function intended to reference. What we are doing is more like C++ virtual functions, where the calling class can replace function calls in subfunctions: https://www.geeksforgeeks.org/virtual-function-cpp/ > On the other hand, when you call an SQL-language function, or even to > some extent a plpgsql-language function, from PostgreSQL, you can in > fact change the meanings of every identifier that appears in that > function body unless those references are all schema-qualified or the > function sets the search path. If an SQL-language function calls I think we decided that search path alone for functions is insufficient because of data type matching, unless the schema is secure. > sqrt(x), you can set the search_path to some schema you've created > where there is a completely different sqrt() function than that > function intended to call. That is a very good way to make stuff (a) > break or (b) be insecure. 
> > The point of having a lexically-scoped search path is that it is > generally the case that when you write SQL code, you know the search > path under which you want it to execute. You can get that behavior > today by attaching a SET clause to the function, but that defeats > inlining, is slower, and the search path you configure applies not > only to the code to which it is attached but to any code which it in turn > calls. In other words, the search path setting which you configure > has dynamic scope. I submit that's bad. Functions can reasonably be > expected to know the search path that should be used to run the code > they actually contain, but they cannot and should not need to know the > search path that should be used to run code in other functions which > they happen to call. So you are saying PG functions should lock down their search path at function definition time, and use that for all function invocations? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: Pre-v11 appearances of the word "procedure" in v11 docs
Peter Eisentraut writes: > What should we do with the use of the keyword PROCEDURE in the CREATE > OPERATOR and CREATE TRIGGER syntaxes? We're kinda stuck with that. We could add FUNCTION as a preferred synonym, perhaps, but I doubt that'd really be worth the trouble. regards, tom lane
Re: Facility for detecting insecure object naming
On Tue, Aug 14, 2018 at 3:31 PM, Bruce Momjian wrote: > Well, in C, if the calling function is not in the current C file, you > really don't know what function you are linking to --- it depends on > what other object files are linked into the executable. True. > I am unclear how lexical scoping helps with this, or with Postgres. Well, if you say sqrt(x) in C, and you don't have sqrt or x defined in the current function, you know that you're going to call a global function called sqrt and pass to it a global variable called x. You're not going to latch onto a local variable called x in the function that called your function, and if the calling function has a local variable called sqrt, that doesn't matter either. The linker can point every call to sqrt() in the program at a different place than expected, but you can't monkey with the behavior of an individual call by having the calling function declare identifiers that mask the ones the function intended to reference. On the other hand, when you call an SQL-language function, or even to some extent a plpgsql-language function, from PostgreSQL, you can in fact change the meanings of every identifier that appears in that function body unless those references are all schema-qualified or the function sets the search path. If an SQL-language function calls sqrt(x), you can set the search_path to some schema you've created where there is a completely different sqrt() function than that function intended to call. That is a very good way to make stuff (a) break or (b) be insecure. The point of having a lexically-scoped search path is that it is generally the case that when you write SQL code, you know the search path under which you want it to execute. You can get that behavior today by attaching a SET clause to the function, but that defeats inlining, is slower, and the search path you configure applies not only to the code to which it is attached but to any code which it in turn calls. 
In other words, the search path setting which you configure has dynamic scope. I submit that's bad. Functions can reasonably be expected to know the search path that should be used to run the code they actually contain, but they cannot and should not need to know the search path that should be used to run code in other functions which they happen to call. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: libpq should not look up all host addresses at once
On Tue, Aug 14, 2018 at 03:18:32PM -0400, Garick Hamlin wrote: > On Tue, Aug 14, 2018 at 12:24:32PM +0200, Fabien COELHO wrote: > > I read the rationale of the host/hostaddr artificial mapping. I cannot say > > I'm thrilled with the result: I do not really see a setting where avoiding a > > DNS query is required but which still needs a hostname for auth... If you > > have GSSAPI or SSPI then you have an underlying network, in which a dns > > query should be fine. > > FWIW, I think this is useful even if it will be uncommon to use. I run > some HA services here and I find I use this kind of functionality all > the time to test if a standby node is functioning properly. openssh > GSSAPIServerIdentity does this. curl does this via '--resolve'. In > both cases one can check the name authenticates properly via TLS or > GSSAPI while connecting to an IP that is not production. +1 curl's --resolve is a fantastic diagnostic tool. I wish it also allowed changing the destination port. While I'm at it, I strongly prefer using postgresql: URIs to any other way to specify connect info, and I think PG should do more to encourage their use -- perhaps even deprecating the alternatives. Nico --
Re: Pre-v11 appearances of the word "procedure" in v11 docs
What should we do with the use of the keyword PROCEDURE in the CREATE OPERATOR and CREATE TRIGGER syntaxes? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Undocumented(?) limits on regexp functions
> On Aug 14, 2018, at 10:01 AM, Tels wrote: > > Moin Andrew, > > On Tue, August 14, 2018 9:16 am, Andrew Gierth wrote: >>> "Tom" == Tom Lane writes: >> Should these limits: >> a) be removed >> >> Tom> Doubt it --- we could use the "huge" request variants, maybe, but >> Tom> I wonder whether the engine could run fast enough that you'd want >> Tom> to. >> >> I do wonder (albeit without evidence) whether the quadratic slowdown >> problem I posted a patch for earlier was ignored for so long because >> people just went "meh, regexps are slow" rather than wondering why a >> trivial splitting of a 40kbyte string was taking more than a second. > > Pretty much this. :) > > First of all, thank you for working in this area, this is very welcome. > > We do use UTF-8 and we did notice that regexps are not actually the fastest > around, albeit we did not (yet) run into the memory limit. Mostly, because > the regexp_match* stuff we use is only used in places where the > performance is not key and the input/output is small (albeit, now that I > mention it, the quadratic behaviour might explain a few slowdowns in other > cases I need to investigate). > > Anyway, in a few places we have functions that use a lot (> a dozen) > regexps that are also moderately complex (e.g. span multiple lines). In > these cases the performance was not really up to par, so I experimented > and in the end rewrote the functions in plperl. Which fixed the > performance, so we no longer had this issue. +1. I have done something similar, though in C rather than plperl. As for the length limit, I have only hit that in stress testing, not in practice. mark
Re: Facility for detecting insecure object naming
On Tue, Aug 14, 2018 at 03:00:55PM +, Robert Haas wrote: > The more I think about it, the more I think having a way to set a > lexically-scoped search path is probably the answer. Many years ago, > Perl had only "local" as a way of declaring local variables, which > used dynamic scoping, and then they invented "my", which uses lexical > scoping, and everybody abandoned "local" and started using "my," > because otherwise a reference to a variable inside a procedure may > happen to refer to a local variable further up the call stack rather > than a global variable if the names happen to collide. It turns out > that not knowing to which object a reference will refer is awful, and > you should avoid it like the plague. This is exactly the same problem > we have with search_path. People define functions -- or index > expressions -- assuming that the function and operator references will > refer to the same thing at execution time that they do at definition > time. > > That is, I think, an entirely *reasonable* expectation. Basically all > programming languages work that way. If you could call a C function > in such a way that the meaning of identifiers that appeared there was > dependent on some piece of state inherited from the caller's context > rather than from the declarations visible at the time that the C > function was compiled, it would be utter and complete chaos. It would > in fact greatly resemble the present state of affairs in PostgreSQL, > where making your code reliably mean what you intended requires either > forcing the search path everywhere or schema-qualifying the living > daylights out of everything. I think we should try to follow Perl's > example, acknowledge that we basically got this wrong on the first > try, and invent something new that works better. Well, in C, if the calling function is not in the current C file, you really don't know what function you are linking to --- it depends on what other object files are linked into the executable. 
I am unclear how lexical scoping helps with this, or with Postgres. With Postgres you are effectively adding functions into the executable that didn't exist at link time, and potentially replacing compiled-in functions with others that might match the call site better. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: libpq should not look up all host addresses at once
On Tue, Aug 14, 2018 at 12:24:32PM +0200, Fabien COELHO wrote: > > Hello Tom, > > >>As you noted in another message, a small doc update should be needed. > > > >Check. Proposed doc patch attached. (Only the last hunk is actually > >specific to this patch, the rest is cleanup that I noticed while looking > >around for possibly-relevant text.) > > Doc build is ok. > > Some comments that you may not find all useful, please accept my apology, it > just really shows that I read your prose in some detail:-) > > The mention of possible reverse dns queries has been removed... but I do not > think there was any before? There could be if only hostaddr is provided but > a hostname is required by some auth, but it does not seem to be the case > according to the documentation. > > I read the rationale of the host/hostaddr artificial mapping. I cannot say > I'm thrilled with the result: I do not really see a setting where avoiding a > DNS query is required but which still needs a hostname for auth... If you > have GSSAPI or SSPI then you have an underlying network, in which a dns > query should be fine. FWIW, I think this is useful even if it will be uncommon to use. I run some HA services here and I find I use this kind of functionality all the time to test if a standby node is functioning properly. openssh GSSAPIServerIdentity does this. curl does this via '--resolve'. In both cases one can check the name authenticates properly via TLS or GSSAPI while connecting to an IP that is not production. The IP might float via VRRP or EIP in AWS, or it might be a service local OOB network and the frontend might be a load balancer like haproxy. FWIW, I am not using this for PG today, but this kind of feature is definitely nice to have for alarming and HA. It lets proper analysis happen. This way not everyone has to be called when the local DNS resolver fails; just the DNS people can get the 2am call. Anyway, if it's not a big burden, I suggest you keep it, IIUC. 
This kind of thing is really handy especially since today's cloudy-stuff means one often gets all-the-nat whether one wants it or not. Garick
Re: [HACKERS] pgbench - allow to store select results into variables
On 08/14/2018 01:44 PM, Tom Lane wrote: Fabien COELHO writes: I have no doubt that some MUA around would forget to do the conversion. FWIW, one reason that I invariably use patch(1) to apply submitted patches is that it will take care of stripping any CRs that may have snuck in. So I'm not particularly fussed about the problem. I also use patch(1), although probably mainly out of laziness in not changing habits going back to the CVS days ;-) You obviously commit far more patches than I do, but I don't normally see patch complaining about CRs, which is why I raised the issue. I'm not excited about encouraging people to use application/octet-stream rather than text/something for submitted patches. If you use text then the patch can easily be examined in the web archives; with application/octet-stream the patch has to be downloaded, adding a lot of manual overhead. (At least, that's true with my preferred web browser, maybe it's different for other people?) You might have a point. Compressed patches can be particularly annoying. Most other things my browser will download and pop into geany for me. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [patch] Duplicated pq_sendfloat4/8 prototypes
Christoph Berg writes: > The pq_sendfloat4 and 8 prototypes are duplicated. Yup. Pushed, thanks! regards, tom lane
Re: [HACKERS] pgbench - allow to store select results into variables
On Tue, Aug 14, 2018 at 1:44 PM, Tom Lane wrote: > Fabien COELHO writes: >> I have no doubt that some MUA around would forget to do the conversion. > > FWIW, one reason that I invariably use patch(1) to apply submitted patches > is that it will take care of stripping any CRs that may have snuck in. > So I'm not particularly fussed about the problem. Yeah. I think that we shouldn't care about this, or about context/unified diffs, or really anything other than that patch can apply it. Once you apply it, you can issue the correct incantation to see it in whatever format you prefer. If it's a whole patch stack, it makes sense to use 'git format-patch' to generate the patches, because then it's a lot easier to apply the whole stack, but for a single patch it really doesn't matter. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Pre-v11 appearances of the word "procedure" in v11 docs
On Tue, Aug 14, 2018 at 10:18 AM, Tom Lane wrote: > Agreed, we'd better stop being cavalier about whether that's an > exact synonym or not. I made an open item for this. -- Peter Geoghegan
Re: [HACKERS] pgbench - allow to store select results into variables
Fabien COELHO writes: > I have no doubt that some MUA around would forget to do the conversion. FWIW, one reason that I invariably use patch(1) to apply submitted patches is that it will take care of stripping any CRs that may have snuck in. So I'm not particularly fussed about the problem. I'm not excited about encouraging people to use application/octet-stream rather than text/something for submitted patches. If you use text then the patch can easily be examined in the web archives; with application/octet-stream the patch has to be downloaded, adding a lot of manual overhead. (At least, that's true with my preferred web browser, maybe it's different for other people?) regards, tom lane
Re: libpq should not look up all host addresses at once
Fabien COELHO writes: > The mention of possible reverse dns queries has been removed... but I do > not think there was any before? Yeah, that's why I took it out. If there ever were any reverse lookups in libpq, we got rid of them AFAICS, so this statement in the docs seemed just unnecessarily confusing to me. > I read the rationale of the host/hostaddr artificial mapping. I cannot say > I'm thrilled with the result: I do not really see a setting where avoiding > a DNS query is required but which still needs a hostname for auth... If > you have GSSAPI or SSPI then you have an underlying network, in which a > dns query should be fine. The point is that you might wish to avoid a DNS query for speed reasons, not because it's "required" to do so. Or, perhaps, you did the DNS query already using some async DNS library, and you want to pass the result on to PQconnectStart so it will not block. > STM that we should have had only "host" for specifying the target (name, > ip, path), and if really necessary an ugly "authname" directive to provide > names on the side. I know, too late. > I think that in the string format host=foo:5433,bla:5432 could be > accepted as it is in the URL format. I'm uninterested in these proposals, and they certainly are not something to include in this particular patch even if I were. Also, it's too late to redefine the meaning of colon in a host string, since as you mentioned that could already be an IPv6 address. >>> I'd consider wrapping some of the logic. I'd check the port first, then >>> move the host resolution stuff into a function. >> Don't really see the value of either ... > What I see is that the host logic is rather lengthy and specific (it > depends on the connection type -- name, ip, socket including a so nice > #ifdef), distinct from the port stuff which is dealt with quite quickly. 
> By separating the port & host and putting the lengthy stuff in a function > the scope of the for loop on potential connections would be easier to read > and understand, and would not require to see CHT_ cases to understand what > is being done for managing "host" variants, which is a mere detail. I still don't see the value of it. This code was inlined into the calling function before, and it still is with this patch --- just a different calling function. Maybe there's an argument for a wholesale refactoring of PQconnectPoll into smaller pieces, but that's not a task for this patch either. (I'm not really convinced it'd be an improvement anyway, at least not enough of one to justify creating so much back-patching pain.) regards, tom lane
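For what it's worth, the pre-resolved-name scenario Tom mentions is easy to set up: resolve the name once with an ordinary (or async) resolver and pass the textual address along, which in libpq terms would become the hostaddr value while host keeps the name for authentication. A minimal resolution sketch using POSIX getaddrinfo() (hypothetical, not libpq code):

```c
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int
main(void)
{
    struct addrinfo hints;
    struct addrinfo *res;
    char        ip[INET_ADDRSTRLEN];

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    /* Resolve once, up front (an async resolver could do the same off
     * the main thread); the text form of the first result is what a
     * caller would hand over as a pre-resolved address. */
    if (getaddrinfo("localhost", NULL, &hints, &res) != 0)
    {
        fprintf(stderr, "resolution failed\n");
        return 1;
    }
    inet_ntop(AF_INET, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
              ip, sizeof(ip));
    printf("%s\n", ip);          /* typically 127.0.0.1 */
    freeaddrinfo(res);
    return 0;
}
```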
Re: Stored procedures and out parameters
Hi, On 2018-08-12 08:51:28 +0100, Shay Rojansky wrote: > Peter, Tom, > > Would it be possible for you to review the following two questions? Some > assertions have been made in this thread about the new stored procedures > (support for dynamic and multiple resultsets) whose compatibility with the > current PostgreSQL protocol is unclear to me as a client driver > maintainer... Some clarification would really help. I've not yet discussed this with the rest of the RMT, but to me it sounds like we should treat this as an open item for the release. We shouldn't have the wire protocol do something nonsensical and then do something different in the next release. - Andres
Re: Pre-v11 appearances of the word "procedure" in v11 docs
Peter Geoghegan writes: > I noticed that the word "procedure" appears in at least a few places > in the v11 docs as a not-quite-apt synonym of "function". For example, > https://www.postgresql.org/docs/11/static/plpgsql-trigger.html talks > about "PL/pgSQL Trigger Procedures" which are actually all functions > in practice. Agreed, we'd better stop being cavalier about whether that's an exact synonym or not. regards, tom lane
Re: libpq connection timeout mismanagement
Fabien COELHO writes: >> Right. As before, I'm not excited about rejecting trailing junk, >> considering we never did before. > My 0.02¤: accepting trailing junk is just a recipe for hiding bugs: > sh> psql "connect_timeout=2,port=5433" > psql> \conninfo > ... port=5432 # port directive was silently ignored > So erroring out would be a good thing, and it could be done on a new > release. Perhaps. That should certainly be a separate patch, though, and it should cover everything not just connection_timeout (I see at least five parameters that are parsed with atoi() and nothing else). I'm still not excited, but feel free to send a patch. regards, tom lane
Re: Undocumented(?) limits on regexp functions
Moin Andrew, On Tue, August 14, 2018 9:16 am, Andrew Gierth wrote: >> "Tom" == Tom Lane writes: > > >> Should these limits: > > >> a) be removed > > Tom> Doubt it --- we could use the "huge" request variants, maybe, but > Tom> I wonder whether the engine could run fast enough that you'd want > Tom> to. > > I do wonder (albeit without evidence) whether the quadratic slowdown > problem I posted a patch for earlier was ignored for so long because > people just went "meh, regexps are slow" rather than wondering why a > trivial splitting of a 40kbyte string was taking more than a second. Pretty much this. :) First of all, thank you for working in this area, this is very welcome. We do use UTF-8 and we did notice that regexps are not actually the fastest around, albeit we did not (yet) run into the memory limit. Mostly, because the regexp_match* stuff we use is only used in places where the performance is not key and the input/output is small (albeit, now that I mention it, the quadratic behaviour might explain a few slowdowns in other cases I need to investigate). Anyway, in a few places we have functions that use a lot (> a dozen) regexps that are also moderately complex (e.g. span multiple lines). In these cases the performance was not really up to par, so I experimented and in the end rewrote the functions in plperl. Which fixed the performance, so we no longer had this issue. All the best, Tels
[patch] Duplicated pq_sendfloat4/8 prototypes
The pq_sendfloat4 and 8 prototypes are duplicated. Christoph diff --git a/src/include/libpq/pqformat.h b/src/include/libpq/pqformat.h index 6acb2e8de0..f0337325bb 100644 --- a/src/include/libpq/pqformat.h +++ b/src/include/libpq/pqformat.h @@ -31,9 +31,6 @@ extern void pq_send_ascii_string(StringInfo buf, const char *str); extern void pq_sendfloat4(StringInfo buf, float4 f); extern void pq_sendfloat8(StringInfo buf, float8 f); -extern void pq_sendfloat4(StringInfo buf, float4 f); -extern void pq_sendfloat8(StringInfo buf, float8 f); - /* * Append a [u]int8 to a StringInfo buffer, which already has enough space * preallocated.
Re: Temporary tables prevent autovacuum, leading to XID wraparound
On Mon, Aug 13, 2018 at 01:53:18PM -0300, Alvaro Herrera wrote:
> I do share Andres' concerns on the wording the comment.  I would say
> something like
>
> /*
>  * Reset the temporary namespace flag in MyProc.  We assume this to be
>  * an atomic assignment.
>  *
>  * Because this subtransaction is rolling back, the pg_namespace
>  * row is not visible to anyone else anyway, but that doesn't matter:
>  * it's not a problem if objects contained in this namespace are removed
>  * concurrently.
>  */
>
> The fact of assignment being atomic and the fact of the pg_namespace row
> being visible are separately important.  You care about it being atomic
> because it means you must not have someone read "16" (0x10) when you
> were partway removing the value "65552" (0x10010), thus causing that
> someone removing namespace 16.  And you care about the visibility of the
> pg_namespace row because of whether you're worried about a third party
> removing the tables from that namespace or not: since the subxact is
> aborting, you are not.

I was thinking about adding "Even if it is not atomic" or such at the
beginning of the paragraph, but at the end your phrasing sounds better to
me.  So I have hacked up the attached, which also reworks the comment in
InitTempTableNamespace in the same spirit.  Thoughts?
--
Michael

diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 3971346e73..bd8ae0ae4b 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -3938,8 +3938,10 @@ InitTempTableNamespace(void)
     * decide if a temporary namespace is in use or not.  We assume that
     * assignment of namespaceId is an atomic operation.  Even if it is not,
     * the temporary relation which resulted in the creation of this temporary
-    * namespace is still locked until the current transaction commits, so it
-    * would not be accessible yet, acting as a barrier.
+    * namespace is still locked until the current transaction commits, and
+    * its pg_namespace row is not visible yet.  However it does not matter:
+    * this flag makes the namespace as being in use, so no objects created
+    * on it would be removed concurrently.
     */
    MyProc->tempNamespaceId = namespaceId;
 
@@ -3976,10 +3978,12 @@ AtEOXact_Namespace(bool isCommit, bool parallel)
 
        /*
         * Reset the temporary namespace flag in MyProc.  We assume that
-        * this operation is atomic.  Even if it is not, the temporary
-        * table which created this namespace is still locked until this
-        * transaction aborts so it would not be visible yet, acting as a
-        * barrier.
+        * this operation is atomic.
+        *
+        * Because this transaction is aborting, the pg_namespace row is
+        * not visible to anyone else anyway, but that doesn't matter:
+        * it's not a problem if objects contained in this namespace are
+        * removed concurrently.
         */
        MyProc->tempNamespaceId = InvalidOid;
    }
@@ -4037,10 +4041,12 @@ AtEOSubXact_Namespace(bool isCommit, SubTransactionId mySubid,
 
        /*
         * Reset the temporary namespace flag in MyProc.  We assume that
-        * this operation is atomic.  Even if it is not, the temporary
-        * table which created this namespace is still locked until this
-        * transaction aborts so it would not be visible yet, acting as a
-        * barrier.
+        * this operation is atomic.
+        *
+        * Because this subtransaction is aborting, the pg_namespace row
+        * is not visible to anyone else anyway, but that doesn't matter:
+        * it's not a problem if objects contained in this namespace are
+        * removed concurrently.
         */
        MyProc->tempNamespaceId = InvalidOid;
    }
Re: Temporary tables prevent autovacuum, leading to XID wraparound
On Mon, Aug 13, 2018 at 02:15:07PM -0300, Alvaro Herrera wrote:
> If writing an OID were not atomic, the assignment would be really
> dangerous. I don't think your proposed update is good.

OK, I am withdrawing this one then.
--
Michael
Re: Repeatable Read Isolation in SQL running via background worker
On Mon, Aug 13, 2018 at 10:52 AM, Jeremy Finzel wrote: > On Thu, Aug 9, 2018 at 4:34 PM, Jeremy Finzel wrote: >> I am using worker_spi as a model to run a SQL statement inside a >> background worker. From my browsing of the Postgres library, I believe that >> if I want repeatable read isolation level, the proper way for me to attain >> this is to add this line after StartTransactionCommand() in worker_spi_main: >> >> XactIsoLevel = XACT_REPEATABLE_READ; It's usually a good idea to only change GUCs through the GUC machinery i.e. use SetConfigOption(). Are you using StartTransactionCommand() and CommitTransactionCommand() to manage transaction boundaries? If not, maybe you should. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Improve behavior of concurrent ANALYZE/VACUUM
On Tue, Aug 14, 2018 at 03:26:29PM +, Robert Haas wrote:
> I feel like you're not being very clear about exactly what this new
> approach is. Sorry if I'm being dense.

On HEAD, we check the ownership of the relation vacuumed or analyzed after taking a lock on it in respectively vacuum_rel() and analyze_rel(), where we already know the OID of the relation and there may be no RangeVar which we could use with RangeVarGetRelidExtended (like partitions). I don't think that we want to use RangeVarGetRelidExtended again once the relation OID is known anyway. So my proposal is to add more ownership checks when building the list of VacuumRelations, and skip the relations the user has no right to work on at an earlier stage.

Looking at the code may be easier than a description; please remember that there are three code paths used to build the list of VacuumRelations (each one may be processed in its own transaction):
1) autovacuum, which specifies only one relation at a time with its OID, and we build the list in expand_vacuum_rel(), which finishes with a single element.
2) Manual VACUUM with a list of relations specified, where the list of elements is built in the second part of expand_vacuum_rel(), which is able to expand partitioned tables as well.
3) Manual VACUUM with no list specified, where the list of relations is built in get_all_vacuum_rels().

My proposal is to add two more ownership checks in 2) and 3), and also refactor the code so that we use a single routine for ANALYZE and VACUUM. This has the advantage of not making the code of ANALYZE and VACUUM diverge anymore for the existing ownership checks, and we still generate WARNINGs if a relation needs to be skipped. (Thinking about it, it could make sense to add an extra assert at the beginning of expand_vacuum_rel like I did in 6551f3d...)
--
Michael
Re: InsertPgAttributeTuple() and attcacheoff
On Tue, Aug 14, 2018 at 3:50 PM, Tom Lane wrote: > Peter Eisentraut writes: >> It seems to me that it would make sense if InsertPgAttributeTuple() were >> to set attcacheoff to -1 instead of taking it from the caller. > > Looked this over, no objections. > > I wonder whether we should set that field to -1 when we *read* > pg_attribute rows from disk, and be less fussed about what gets written > out. The only real advantage is that this'd protect us from foolish > manual changes to pg_attribute.attcacheoff entries, but that doesn't > seem negligible. I wouldn't object to forcibly writing in -1 when we read the data, but I don't think it's a good idea to let values other than -1 get written to the disk. User-visible random nonsense in system catalogs seems like too much of a foot-gun to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
On Mon, Aug 13, 2018 at 12:32 PM, Etsuro Fujita wrote: > One thing I noticed might be an improvement is to skip > build_joinrel_partition_info if the given joinrel will be to have > consider_partitionwise_join=false; in the previous patch, that function > created the joinrel's partition info such as part_scheme and part_rels if > the joinrel is considered as partitioned, independently of the flag > consider_partitionwise_join for it, but if that flag is false, we don't > generate PWJ paths for the joinrel, so we would not need to create that > partition info at all. This would not only avoid unnecessary processing in > that function, but also make unnecessary the changes I made to > try_partitionwise_join, generate_partitionwise_join_paths, > apply_scanjoin_target_to_paths, and create_ordinary_grouping_paths. So I > updated the patch that way. Please find attached an updated version of the > patch. I guess the question is whether there are (or might be in the future) other dependencies on part_scheme. For example, it looks like partition pruning uses it. I'm not sure whether partition pruning supports a plan like: Append -> Nested Loop -> Seq Scan on p1 -> Index Scan on q1 If it doesn't, that's just an implementation restriction; somebody might want to fix things so it works, if it doesn't already. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: InsertPgAttributeTuple() and attcacheoff
Peter Eisentraut writes: > It seems to me that it would make sense if InsertPgAttributeTuple() were > to set attcacheoff to -1 instead of taking it from the caller. Looked this over, no objections. I wonder whether we should set that field to -1 when we *read* pg_attribute rows from disk, and be less fussed about what gets written out. The only real advantage is that this'd protect us from foolish manual changes to pg_attribute.attcacheoff entries, but that doesn't seem negligible. regards, tom lane
Re: [HACKERS] Bug in to_timestamp().
On Thu, Aug 2, 2018 at 9:06 PM Alexander Korotkov wrote:
> On Thu, Aug 2, 2018 at 6:17 PM Alexander Korotkov wrote:
> > After some experiments I found that when you mix spaces and separators
> > between two fields, then Oracle takes into account only length of last
> > group of spaces/separators.
> >
> > # SELECT to_timestamp('2018- -01 02', ' --- --MM-DD') FROM dual
> > 2018-01-01 00:00:00 -10:00
> > (length of last spaces/separators group is 2)
> >
> > # SELECT to_timestamp('2018- -01 02', ' --- --MM-DD') FROM dual
> > 2018-01-01 00:00:00 -10:00
> > (length of last spaces/separators group is 3)
> >
> > # SELECT to_timestamp('2018- -01 02', ' -- ---MM-DD') FROM dual
> > 02.01.2018 00:00:00
> > (length of last spaces/separators group is 2)
>
> Ooops... I'm sorry, but I've posted wrong results here.  Correct
> version is here.
>
> # SELECT to_timestamp('2018- -01 02', ' --- --MM-DD') FROM dual
> ORA-01843: not a valid month
> (length of last spaces/separators group is 2)
>
> # SELECT to_timestamp('2018- -01 02', ' -- ---MM-DD') FROM dual
> 02.01.2018 00:00:00
> (length of last spaces/separators group is 3)
>
> So length of last group of spaces/separators in the pattern should be
> greater or equal to length of spaces/separators in the input string.
> Other previous groups are ignored in Oracle.  And that seems
> ridiculous for me.

BTW, I've also revised documentation and regression tests. Patch is attached.
--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment: 0001-to-timestamp-format-checking-v15.patch
Re: Improve behavior of concurrent ANALYZE/VACUUM
On Sun, Aug 12, 2018 at 10:21 PM, Michael Paquier wrote: > In the previous thread, we discussed a couple of approaches, but I was > not happy with any of those, hence I have been spending more time in > getting to a solution which has no user-facing changes, and still solves > the problems folks have been complaining about, and the result is the > patch attached. The patch changes a couple of things regarding ACL > checks, by simply centralizing the ownership checks into a single > routine used by both ANALYZE and VACUUM. This routine is then used in > two more places for manual ANALYZE and VACUUM: > - When specifying directly one or more relations in the command, in > expand_vacuum_rel(). > - When building the complete list of relations to work on in the case of > a database-wide operation, in get_all_vacuum_rels(). I feel like you're not being very clear about exactly what this new approach is. Sorry if I'm being dense. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Tid scan improvements
On Sun, Aug 12, 2018 at 8:07 AM, David Rowley wrote: > Thanks for working on this. It will great to see improvements made in this > area. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Facility for detecting insecure object naming
On Wed, Aug 8, 2018 at 1:58 PM, Tom Lane wrote: > I'm not sold on #2 either. That path leads to, for example, > s/=/OPERATOR(pg_catalog.=)/g everywhere, which is utterly catastrophic > to both readability and portability of your SQL code. We *must* find > a way to do better, not tell people that's what to do. > > When the security team was discussing this issue before, we speculated > about ideas like inventing a function trust mechanism, so that attacks > based on search path manipulations would fail even if they managed to > capture an operator reference. I'd rather go down that path than > encourage people to do more schema qualification. The more I think about it, the more I think having a way to set a lexically-scoped search path is probably the answer. Many years ago, Perl had only "local" as a way of declaring local variables, which used dynamic scoping, and then they invented "my", which uses lexical scoping, and everybody abandoned "local" and started using "my," because otherwise a reference to a variable inside a procedure may happen to refer to a local variable further up the call stack rather than a global variable if the names happen to collide. It turns out that not knowing to which object a reference will refer is awful, and you should avoid it like the plague. This is exactly the same problem we have with search_path. People define functions -- or index expressions -- assuming that the function and operator references will refer to the same thing at execution time that they do at definition time. That is, I think, an entirely *reasonable* expectation. Basically all programming languages work that way. If you could call a C function in such a way that the meaning of identifiers that appeared there was dependent on some piece of state inherited from the caller's context rather than from the declarations visible at the time that the C function was compiled, it would be utter and complete chaos. 
It would in fact greatly resemble the present state of affairs in PostgreSQL, where making your code reliably mean what you intended requires either forcing the search path everywhere or schema-qualifying the living daylights out of everything. I think we should try to follow Perl's example, acknowledge that we basically got this wrong on the first try, and invent something new that works better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: libpq compression
Hi Robert,

First of all, thank you for the review and the time spent analyzing this patch. My comments are inline.

On 13.08.2018 21:47, Robert Haas wrote:

On Fri, Aug 10, 2018 at 5:55 PM, Andrew Dunstan wrote: This thread appears to have gone quiet. What concerns me is that there appears to be substantial disagreement between the author and the reviewers. Since the last thing was this new patch it should really have been put back into "needs review" (my fault to some extent - I missed that). So rather than return the patch with feedback I'm going to set it to "needs review" and move it to the next CF. However, if we can't arrive at a consensus about the direction during the next CF it should be returned with feedback.

I agree with the critiques from Robbie Harwood and Michael Paquier that the way that compression is being hooked into the existing architecture looks like a kludge. I'm not sure I know exactly how it should be done, but the current approach doesn't look natural; it looks like it was bolted on.

Sorry, I know it is impertinent to ask you or somebody else to suggest a better way of integrating compression into the libpq frontend/backend. My primary intention was to use the same code for both backend and frontend. This is why I pass a pointer to the receive/send function to the compression module: I am passing the secure_read/secure_write functions for the backend and the pqsecure_read/pqsecure_write functions for the frontend, and I change the pq_recvbuf/pqReadData and internal_flush/pqSendSome functions to call zpq_read/zpq_write when compression is enabled. Robbie thinks that compression should be toggled in secure_read/secure_write/pqsecure_read/pqsecure_write. I do not understand why that is better than my implementation. From my point of view it just introduces extra code redundancy and has no advantages. Even the names of these functions (secure_*) will be confusing if they handle compression as well.
Maybe it is better to introduce some other set of functions which would in turn call the secure_* functions, but I do not think that is a good idea either.

I agree with the critique from Peter Eisentraut and others that we should not go around and add -Z as a command-line option to all of our tools; this feature hasn't yet proved itself to be useful enough to justify that. Better to let people specify it via a connection string option if they want it.

It is not a big deal to remove the command line options. My only concern was that if somebody wants to upload data using psql plus a COPY command, it is more difficult for them to completely change the way psql is invoked than to just add one option (most users pass the -h -d -U options rather than a complete connection string).

I think Thomas Munro was right to ask about what will happen when different compression libraries are in use, and I think failing uncleanly is quite unacceptable. I think there needs to be some method for negotiating the compression type; the client can, for example, provide a comma-separated list of methods it supports in preference order, and the server can pick the first one it likes.

I will add checking of the supported compression method. Right now Postgres can be configured to use either zlib or zstd streaming compression, but not both of them. So choosing an appropriate compression algorithm is not possible; I can only check that client and server support the same compression method. Certainly it is possible to implement support for different compression methods and make it possible to choose one at runtime, but I do not think that is a really good idea unless we want to support custom compression methods.

In short, I think that a number of people have provided really good feedback on this patch, and I suggest to Konstantin that he should consider accepting all of those suggestions.
Commit ae65f6066dc3d19a55f4fdcd3b30003c5ad8dbed tried to introduce some facilities that can be used for protocol version negotiation as new features are added, but this patch doesn't use them. It looks to me like it instead just breaks backward compatibility. The new 'compression' option won't be recognized by older servers. If it were spelled '_pq_.compression' then the facilities in that commit would cause a NegotiateProtocolVersion message to be sent by servers which have that commit but don't support the compression option. I'm not exactly sure what will happen on even-older servers that don't have that commit either, but I hope they'll treat it as a GUC name; note that setting an unknown GUC name with a namespace prefix is not an error, but setting one without such a prefix IS an ERROR. Servers which do support compression would respond with a message indicating that compression had been enabled or, maybe, just start sending back compressed-packet messages, if we go with including some framing in the libpq protocol. If I tried to login with new client with _pq_.compression option to old
Re: Undocumented(?) limits on regexp functions
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> Doubt it --- we could use the "huge" request variants, maybe, but > Tom> I wonder whether the engine could run fast enough that you'd want > Tom> to. > I do wonder (albeit without evidence) whether the quadratic slowdown > problem I posted a patch for earlier was ignored for so long because > people just went "meh, regexps are slow" rather than wondering why a > trivial splitting of a 40kbyte string was taking more than a second. I have done performance measurements on the regex stuff in the past, and not noticed any huge penalty in regexp.c. I was planning to try to figure out what test case you were using that was different from what I'd looked at, but not got round to it yet. In the light of morning I'm reconsidering my initial thought of not wanting to use MemoryContextAllocHuge. My reaction was based on thinking that that would allow people to reach indefinitely large regexp inputs, but really that's not so; the maximum input length will be a 1GB text object, hence at most 1G characters. regexp.c needs to expand that into 4-bytes-each "chr" characters, so it could be at most 4GB of data. The fact that inputs between 256M and 1G characters fail could be seen as an implementation rough edge that we ought to sand down, at least on 64-bit platforms. regards, tom lane
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On 08/14/2018 01:49 PM, Tomas Vondra wrote: On 08/13/2018 04:49 PM, Andres Freund wrote: Hi, On 2018-08-13 11:46:30 -0300, Alvaro Herrera wrote: On 2018-Aug-11, Tomas Vondra wrote: Hmmm, it's difficult to compare "bt full" output, but my backtraces look somewhat different (and all the backtraces I'm seeing are 100% exactly the same). Attached for comparison. Hmm, looks similar enough to me -- at the bottom you have the executor doing its thing, then an AcceptInvalidationMessages in the middle section atop which sit a few more catalog accesses, and further up from that you have another AcceptInvalidationMessages with more catalog accesses. AFAICS that's pretty much the same thing Andres was describing. It's somewhat different because it doesn't seem to involve a reload of a nailed table, which my traces did. I wasn't able to reproduce the crash more than once, so I'm not at all sure how to properly verify the issue. I'd appreciate if Thomas could try to do so again with the small patch I provided. I'll try in the evening. I've tried reproducing it on my laptop, but I can't make that happen for some reason - I know I've seen some crashes here, but all the reproducers were from the workstation I have at home. I wonder if there's some subtle difference between the two boxes, making it more likely on one of them ... the whole environment (distribution, packages, compiler, ...) should be exactly the same, though. The only thing I can think of is different CPU speed, possibly making some race conditions more/less likely. No idea. I take that back - I can reproduce the crashes, both with and without the patch, all the way back to 9.6. Attached is a bunch of backtraces from various versions. There's a bit of variability depending on which pgbench script gets started first (insert.sql or vacuum.sql) - in one case (when vacuum is started before insert) the crash happens in InitPostgres/RelationCacheInitializePhase3, otherwise it happens in exec_simple_query. 
Another observation is that the failing COPY is not necessary, I can reproduce the crashes without this (so even with wal_level=replica). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments: crash-10.log.gz, crash-11.log.gz, crash-11-2.log.gz, crash-11-3.log.gz, crash-96.log.gz, crash-96-2.log.gz, crash-96-logical.log.gz
Re: [HACKERS] pgbench - allow to store select results into variables
Hello Andrew,

> It's not done by my MUA, and it's present in your latest posted patch.
> If anything I'd suspect your MUA:
>
> andrew@emma*$ curl -s https://www.postgresql.org/message-id/attachment/64237/pgbench-into-19.patch

Argh. Indeed, this downloaded version has CRLF. Now when I save the attachment in my MUA, I only have LF... Let us look at the raw format:

  Content-Type: text/plain; name=pgbench-into-19.patch
  Content-Transfer-Encoding: BASE64
  ...
  ZGlmZiAtLWdpdCBhL2RvYy9zcmMvc2dtbC9yZWYvcGdiZW5jaC5zZ21sIGIv
  ZG9jL3NyYy9zZ21sL3JlZi9wZ2JlbmNoLnNnbWwNCmluZGV4IDg4Y2Y4YjM5
  ...

Where you immediately see that it has indeed CRLF at the end of the second line:-). So you are right, and my trusted mailer is encoding *AND* decoding silently. Why would it do that? After some googling, this is because RFC 2046 (MIME) says you "MUST":

  https://tools.ietf.org/html/rfc2046#section-4.1.1

So I'm right in the end, and the whole world is wrong, which is a relief:-)

As I cannot expect everybody to have a RFC 2046 compliant MUA, and after some meddling in "/etc/mime.types", I now have:

  Content-Type: application/octet-stream; name=pgbench-into-19.patch
  Content-Transfer-Encoding: BASE64
  ...
  ZGlmZiAtLWdpdCBhL2RvYy9zcmMvc2dtbC9yZWYvcGdiZW5jaC5zZ21sIGIv
  ZG9jL3NyYy9zZ21sL3JlZi9wZ2JlbmNoLnNnbWwKaW5kZXggODhjZjhiMzkz
  ...

Which is much better:-)

I re-attached the v19 for a check on the list.
--
Fabien.

Attachment: pgbench-into-19.patch
Re: Memory leak with CALL to Procedure with COMMIT.
The commit that started this is

commit 59a85323d9d5927a852939fa93f09d142c72c91a
Author: Peter Eisentraut
Date:   Mon Jul 9 13:58:08 2018

    Add UtilityReturnsTuples() support for CALL

    This ensures that prepared statements for CALL can return tuples.

The change in whether a statement returns tuples or not causes some different paths to be taken deep in the portal code that set snapshots in different ways. I haven't fully understood them yet. I plan to post more tomorrow.
--
Peter Eisentraut
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
InsertPgAttributeTuple() and attcacheoff
It seems to me that it would make sense if InsertPgAttributeTuple() were to set attcacheoff to -1 instead of taking it from the caller. InsertPgAttributeTuple() is the interface between in-memory tuple descriptors and on-disk pg_attribute, so it makes sense to give it the job of resetting attcacheoff. This avoids having all the callers having to do so. There are also pending patches that have to work around this in seemingly unnecessary ways.

(The comment about the "very grotty code" that I'm removing is from a time when AppendAttributeTuples() would deform a tuple, reset attcacheoff, then reform the tuple -- hence grotty. This is long obsolete, since the tuple is now formed later.)
--
Peter Eisentraut
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From 9b19e5679b08d30c6fe40fd896b6a8d3dac45c24 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Tue, 17 Jul 2018 09:48:29 +0200
Subject: [PATCH] InsertPgAttributeTuple() to set attcacheoff

InsertPgAttributeTuple() is the interface between in-memory tuple
descriptors and on-disk pg_attribute, so it makes sense to give it the
job of resetting attcacheoff.  This avoids having all the callers having
to do so.
---
 src/backend/catalog/heap.c       | 9 ++++-----
 src/backend/catalog/index.c      | 5 -----
 src/backend/commands/tablecmds.c | 1 -
 3 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 4cfc0c8911..ac5a677c5f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -592,8 +592,8 @@ CheckAttributeType(const char *attname,
  * Construct and insert a new tuple in pg_attribute.
  *
  * Caller has already opened and locked pg_attribute.  new_attribute is the
- * attribute to insert (but we ignore attacl and attoptions, which are always
- * initialized to NULL).
+ * attribute to insert.  attcacheoff is always initialized to -1, attacl and
+ * attoptions are always initialized to NULL.
  *
  * indstate is the index state for CatalogTupleInsertWithInfo.  It can be
  * passed as NULL, in which case we'll fetch the necessary info.  (Don't do
@@ -620,7 +620,7 @@ InsertPgAttributeTuple(Relation pg_attribute_rel,
    values[Anum_pg_attribute_attlen - 1] = Int16GetDatum(new_attribute->attlen);
    values[Anum_pg_attribute_attnum - 1] = Int16GetDatum(new_attribute->attnum);
    values[Anum_pg_attribute_attndims - 1] = Int32GetDatum(new_attribute->attndims);
-   values[Anum_pg_attribute_attcacheoff - 1] = Int32GetDatum(new_attribute->attcacheoff);
+   values[Anum_pg_attribute_attcacheoff - 1] = Int32GetDatum(-1);
    values[Anum_pg_attribute_atttypmod - 1] = Int32GetDatum(new_attribute->atttypmod);
    values[Anum_pg_attribute_attbyval - 1] = BoolGetDatum(new_attribute->attbyval);
    values[Anum_pg_attribute_attstorage - 1] = CharGetDatum(new_attribute->attstorage);
@@ -689,9 +689,8 @@ AddNewAttributeTuples(Oid new_rel_oid,
        attr = TupleDescAttr(tupdesc, i);
        /* Fill in the correct relation OID */
        attr->attrelid = new_rel_oid;
-       /* Make sure these are OK, too */
+       /* Make sure this is OK, too */
        attr->attstattarget = -1;
-       attr->attcacheoff = -1;
 
        InsertPgAttributeTuple(rel, attr, indstate);
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 2dad7b059e..b256054908 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -557,12 +557,7 @@ AppendAttributeTuples(Relation indexRelation, int numatts)
    {
        Form_pg_attribute attr = TupleDescAttr(indexTupDesc, i);
 
-       /*
-        * There used to be very grotty code here to set these fields, but I
-        * think it's unnecessary.  They should be set already.
-        */
        Assert(attr->attnum == i + 1);
-       Assert(attr->attcacheoff == -1);
 
        InsertPgAttributeTuple(pg_attribute, attr, indstate);
    }
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f6210226e9..7cedc28c6b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -5522,7 +5522,6 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
    attribute.atttypid = typeOid;
    attribute.attstattarget = (newattnum > 0) ? -1 : 0;
    attribute.attlen = tform->typlen;
-   attribute.attcacheoff = -1;
    attribute.atttypmod = typmod;
    attribute.attnum = newattnum;
    attribute.attbyval = tform->typbyval;
-- 
2.18.0
Re: Undocumented(?) limits on regexp functions
> "Tom" == Tom Lane writes: >> Should these limits: >> a) be removed Tom> Doubt it --- we could use the "huge" request variants, maybe, but Tom> I wonder whether the engine could run fast enough that you'd want Tom> to. I do wonder (albeit without evidence) whether the quadratic slowdown problem I posted a patch for earlier was ignored for so long because people just went "meh, regexps are slow" rather than wondering why a trivial splitting of a 40kbyte string was taking more than a second. -- Andrew (irc:RhodiumToad)
Re: [HACKERS] pgbench - allow to store select results into variables
> "Andrew" == Andrew Dunstan writes: >> The patch in the original email is in text/plain with base64 transfer >> encoding, which means that CRLF line endings are mandatory. It's >> actually up to the receiving MUA (or the archives webserver) to undo >> that. >> >> If the archives webserver isn't handling that then it's a bug there. Andrew> Probably a good reason not to use text/plain for patches, ISTM. Andrew> I do note that my MUA (Thunderbird) uses text/x-patch and Andrew> probably violates RFC2046 4.1.1 The first patch of yours I found was in text/x-patch with 7bit transfer encoding, so the line endings are actually those of the mail message itself (i.e. CRLF on the wire). -- Andrew (irc:RhodiumToad)
Re: [HACKERS] pgbench - allow to store select results into variables
On 08/14/2018 07:37 AM, Andrew Gierth wrote: "Andrew" == Andrew Dunstan writes: >>> This patch contains CRLF line endings >> >> Alas, not according to "file" nor "hexdump" (only 0A, no 0D) on my >> local version, AFAICS. >> >> What happens on the path and what is done by mail clients depending >> on the mime type is another question (eg text/x-diff or text/plain). Andrew> It's not done by my MUA, and it's present in your latest posted Andrew> patch. If anything I'd suspect your MUA: The patch in the original email is in text/plain with base64 transfer encoding, which means that CRLF line endings are mandatory. It's actually up to the receiving MUA (or the archives webserver) to undo that. If the archives webserver isn't handling that then it's a bug there. Probably a good reason not to use text/plain for patches, ISTM. I do note that my MUA (Thunderbird) uses text/x-patch and probably violates RFC2046 4.1.1 cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: libpq should append auth failures, not overwrite
Hello Tom, The thing is that there are a *lot* of systems nowadays on which localhost maps to both ipv4 and ipv6, so that "host=localhost" would be enough to trigger the new behavior, I think that people would survive having the ip spelled out on localhost errors when there are several ips associated to the name. You could also create an exception for "localhost" if you see it as a large problem, but if someone redefined localhost somehow (I once did that inadvertently), it would be nice to help and be explicit. and I think that would inevitably give rise to more complaints than kudos. Hmmm. I'm not sure what the complaint could be in case of multiple ips. "Hey, I have a 'host' with multiple ips, and on errors you tell me which ip generated an issue, and I do not want to know, really". So I'm still of the opinion that we're better off with the second patch's behavior as it stands. This was a mere suggestion. As a user, and as a support person for others on occasions, I like precise error messages which convey all relevant information. In this instance a key piece of information is hidden, hence the proposal. Responding to your other points from upthread: There are still 86 instances of "printfPQExpBuffer", which seems like a lot. They are mostly OOM messages, but not only. This makes checking the patch quite cumbersome. I'd rather there would be only one rule, and a clear reset with a comment when needed (eg "fe-lobj.c"). Well, I did this for discussion's sake, but I don't really find the resulting changes in fe-lobj.c to be any sort of improvement. [...] The improvement I see is that there would not be a single remaining printf, so it is easy to check that no cases were forgotten in libpq, and for future changes that everywhere in libpq there is only one rule which is "append errors to expbuf", not "it depends on the file or function you are in, please look at it in detail".
>> It is unclear to me why the "printf" variant is used in
>> "PQencryptPasswordConn" and "connectOptions2"; it looks like you
>> skipped these functions.

> I did because it seemed like unnecessary extra diff content, since we're
> expecting the error buffer to be initially empty anyway (for
> connectOptions2) or we'd just need an extra reset (for
> PQencryptPasswordConn). In the attached I went ahead and made those
> changes, but again I'm unsure that it's really a net improvement.

Same as above: easier to check, one simple rule for all libpq errors.

>> I do not like the resulting output
>>
>>   server:
>>   problem found
>>   more details
>>
>> I'd rather have
>>
>>   server: problem found
>>   more details

> Done in the attached, although I'm concerned about the resulting effects
> for error message line length --- in my tests, lines that used to
> reliably fit in 80 columns don't anymore. Again, I'm not really
> convinced this is an improvement. It would absolutely *not* be an
> improvement if we tried to also shoehorn the server's IP address in
> there; that'd make the lines way too long, with way too much stuff
> before the important part.

I understand your concern about line length. A possible compromise would be to newline *and* indent before "problem found". To avoid messy code, there could be internal functions to manage that, eg appendErrorContext(...) for "server:", and appendError(exbuf, fmt, ...) for errors, which would indent the initial line by two spaces:

  server1:
    problem found
    details
  server2:
    problem found
    details

This would also give room for the ip on an 80-column server line.

The alternative is that the committer does as they want, which is also fine with me:-)

--
Fabien.
proposal: schema private functions
Hi

I would like to introduce a new flag for routines - PRIVATE. Routines with this flag can be called only from other routines in the same schema. Because these routines are not available to top-level queries, we can also hide them from some outputs like \df.

Example:

CREATE SCHEMA s1;
CREATE SCHEMA s2;

CREATE OR REPLACE FUNCTION s1.nested()
RETURNS int AS $$
BEGIN
  RETURN random()*1000;
END;
$$ PRIVATE;

SELECT s1.nested(); -- fails - it is a top-level query

CREATE OR REPLACE FUNCTION s1.fx()
RETURNS int AS $$
BEGIN
  RETURN s1.nested();
END;
$$ LANGUAGE plpgsql;

SELECT s1.fx(); -- will work

CREATE OR REPLACE FUNCTION s2.fx()
RETURNS int AS $$
BEGIN
  RETURN s1.nested();
END;
$$ LANGUAGE plpgsql;

SELECT s2.fx(); -- fails - it calls a private function from another schema

This proposal is simple, yet strong enough to separate functions that can be directly called from auxiliary functions that can only be called from other functions.

I wrote a PoC implementation; it is not hard, and it should not impact performance. I introduced query_owner_nspid into the query environment. When a function has the private flag, query_owner_nspid must be the same as the function's namespace id.

Comments, notes?
Regards

Pavel

diff --git a/src/backend/catalog/pg_aggregate.c b/src/backend/catalog/pg_aggregate.c
index 246776093e..be634b87e7 100644
--- a/src/backend/catalog/pg_aggregate.c
+++ b/src/backend/catalog/pg_aggregate.c
@@ -625,6 +625,7 @@ AggregateCreate(const char *aggName,
 							 PROVOLATILE_IMMUTABLE, /* volatility (not needed
 													 * for agg) */
 							 proparallel,
+							 false,		/* isPrivate, now false */
 							 parameterTypes,	/* paramTypes */
 							 allParameterTypes, /* allParamTypes */
 							 parameterModes,	/* parameterModes */
@@ -798,6 +799,8 @@ lookup_agg_function(List *fnName,
 	FuncDetailCode fdresult;
 	AclResult	aclresult;
 	int			i;
+	bool		is_private;
+	Oid			nspid;
 
 	/*
 	 * func_get_detail looks up the function in the catalogs, does
@@ -810,7 +813,8 @@ lookup_agg_function(List *fnName,
 							   nargs, input_types, false, false,
 							   &fnOid, rettype, &retset,
 							   &nvargs, &vatype,
-							   &true_oid_array, NULL);
+							   &true_oid_array, NULL,
+							   &is_private, &nspid);
 
 	/* only valid case is a normal function not returning a set */
 	if (fdresult != FUNCDETAIL_NORMAL || !OidIsValid(fnOid))
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index 9b4015d0d4..4b11485442 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -79,6 +79,7 @@ ProcedureCreate(const char *procedureName,
 				bool isStrict,
 				char volatility,
 				char parallel,
+				bool isPrivate,
 				oidvector *parameterTypes,
 				Datum allParameterTypes,
 				Datum parameterModes,
@@ -329,6 +330,7 @@ ProcedureCreate(const char *procedureName,
 	values[Anum_pg_proc_pronargdefaults - 1] = UInt16GetDatum(list_length(parameterDefaults));
 	values[Anum_pg_proc_prorettype - 1] = ObjectIdGetDatum(returnType);
 	values[Anum_pg_proc_proargtypes - 1] = PointerGetDatum(parameterTypes);
+	values[Anum_pg_proc_proisprivate - 1] = BoolGetDatum(isPrivate);
 	if (allParameterTypes != PointerGetDatum(NULL))
 		values[Anum_pg_proc_proallargtypes - 1] = allParameterTypes;
 	else
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 68109bfda0..51907dd678 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -479,7 +479,8 @@ compute_common_attribute(ParseState *pstate,
 						 List **set_items,
 						 DefElem **cost_item,
 						 DefElem **rows_item,
-						 DefElem **parallel_item)
+						 DefElem **parallel_item,
+						 DefElem **scope_item)
 {
 	if (strcmp(defel->defname, "volatility") == 0)
 	{
@@ -546,6 +547,13 @@ compute_common_attribute(ParseState *pstate,
 		*parallel_item = defel;
 	}
+	else if (strcmp(defel->defname, "scope") == 0)
+	{
+		if (*scope_item)
+			goto duplicate_error;
+
+		*scope_item = defel;
+	}
 	else
 		return false;
@@ -655,7 +663,8 @@ compute_function_attributes(ParseState *pstate,
 							ArrayType **proconfig,
 							float4 *procost,
 							float4 *prorows,
-							char *parallel_p)
+							char *parallel_p,
+							bool *isPrivate)
 {
 	ListCell   *option;
 	DefElem    *as_item = NULL;
@@ -670,6 +679,7 @@ compute_function_attributes(ParseState *pstate,
 	DefElem    *cost_item = NULL;
 	DefElem    *rows_item = NULL;
 	DefElem    *parallel_item = NULL;
+	DefElem    *scope_item = NULL;
 
 	foreach(option, options)
 	{
@@ -726,7 +736,8 @@ compute_function_attributes(ParseState *pstate,
 									 &set_items,
 									 &cost_item,
 									 &rows_item,
-									 &parallel_item))
+									 &parallel_item,
+									 &scope_item))
 		{
 			/* recognized common option */
 			continue;
@@ -790,6 +801,11 @@ compute_function_attributes(ParseState *pstate,
 	}
 	if (parallel_item)
 		*parallel_p = interpret_func_parallel(parallel_item);
Re: proposal: schema private functions
2018-08-14 13:56 GMT+02:00 Pavel Stehule :
> Hi
>
> I would to introduce new flag for routines - PRIVATE. Routines with this
> flag can be called only from other routines assigned with same schema.
> Because these routines are not available for top queries, we can hide
> these functions from some outputs like \df ..

It can be used for code organization purposes, and it can be used to implement some security patterns. A private function cannot be called directly by a user who does not have the CREATE right on the schema, but it can be evaluated under any user who has access to the schema.

Regards

Pavel

> Example:
>
> CREATE SCHEMA s1;
> CREATE SCHEMA s2;
>
> CREATE OR REPLACE FUNCTION s1.nested()
> RETURNS int AS $$
> BEGIN
>   RETURN random()*1000;
> END;
> $$ PRIVATE;
>
> SELECT s1.nested(); -- fails - it is top query
>
> CREATE OR REPLACE FUNCTION s1.fx()
> RETURNS int AS $$
> BEGIN
>   RETURN s1.nested();
> END;
> $$ LANGUAGE plpgsql;
>
> SELECT s1.fx(); -- will work
>
> CREATE OR REPLACE FUNCTION s2.fx()
> RETURNS int AS $$
> BEGIN
>   RETURN s1.nested();
> END;
> $$ LANGUAGE plpgsql;
>
> SELECT s2.fx(); -- fails - it call private function from other schema.
>
> This proposal is simple, and strong enough to separate functions that
> can be directly callable and auxiliary functions, that can be called
> from other functions.
>
> I wrote PoC implementation, and it is not hard, and it should not to
> impact performance. I introduced query_owner_nspid into query
> environment. When any functions has flag private, then
> query_owner_nspid should be same like function namespace id.
>
> Comments, notes?
>
> Regards
>
> Pavel
Re: [HACKERS] pgbench - allow to store select results into variables
> Andrew> It's not done by my MUA, and it's present in your latest posted
> Andrew> patch. If anything I'd suspect your MUA:
>
> The patch in the original email is in text/plain with base64 transfer
> encoding, which means that CRLF line endings are mandatory. It's
> actually up to the receiving MUA (or the archives webserver) to undo
> that.

I came to the same conclusion. This is hidden because most people post patches as "application/octet-stream", where no meddling is allowed. I'll try to do that in the future.

> If the archives webserver isn't handling that then it's a bug there.

I'm not sure that the webserver is at fault either: it sends CRLF on "text/plain", which seems okay, even required, by MIME. Maybe the web user agent should do the translation if appropriate for the receiving system... Obviously "curl" does not do it, nor does "firefox" on "save as". I have no doubt that some MUA around would forget to do the conversion.

--
Fabien.
Re: logical decoding / rewrite map vs. maxAllocatedDescs
On 08/13/2018 04:49 PM, Andres Freund wrote:
> Hi,
>
> On 2018-08-13 11:46:30 -0300, Alvaro Herrera wrote:
>> On 2018-Aug-11, Tomas Vondra wrote:
>>> Hmmm, it's difficult to compare "bt full" output, but my backtraces
>>> look somewhat different (and all the backtraces I'm seeing are 100%
>>> exactly the same). Attached for comparison.
>>
>> Hmm, looks similar enough to me -- at the bottom you have the executor
>> doing its thing, then an AcceptInvalidationMessages in the middle
>> section atop which sit a few more catalog accesses, and further up
>> from that you have another AcceptInvalidationMessages with more
>> catalog accesses. AFAICS that's pretty much the same thing Andres was
>> describing.
>
> It's somewhat different because it doesn't seem to involve a reload of
> a nailed table, which my traces did. I wasn't able to reproduce the
> crash more than once, so I'm not at all sure how to properly verify the
> issue. I'd appreciate if Thomas could try to do so again with the small
> patch I provided.

I'll try in the evening.

I've tried reproducing it on my laptop, but I can't make that happen for some reason - I know I've seen some crashes here, but all the reproducers were from the workstation I have at home. I wonder if there's some subtle difference between the two boxes making it more likely on one of them ... the whole environment (distribution, packages, compiler, ...) should be exactly the same, though. The only thing I can think of is different CPU speed, possibly making some race conditions more/less likely. No idea.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/08/09 22:04), Etsuro Fujita wrote:
> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote:
>> I chose to expand the tuple descriptor for the junk column added to
>> foreign relations. We might do better to have a new member in
>> ForeignScan, but I didn't, so that we can backpatch it.
>
> I've not looked at the patch closely yet, but I'm not sure that it's a
> good idea to expand the tuple descriptor of the target relation on the
> fly so that it contains the remotetableoid as a non-system attribute of
> the target table. My concern is: is there not any risk of affecting
> some other part of the planner and/or the executor? (I was a bit
> surprised that the patch passes the regression tests successfully.)

I spent more time looking at the patch. ISTM that the patch suppresses the effect of the tuple-descriptor expansion well, by making changes to code in the planner and executor (and ruleutils.c), but I'm still not sure that the patch is the right direction to go in, because ISTM that expanding the tuple descriptor on the fly might be a wart. What the patch does is:

- It abuses ForeignScan->fdw_scan_tlist to store the additional junk columns for a foreign simple relation scan (that is, not a join). I think this issue was introduced in 9.3, which added postgres_fdw in combination with support for writable foreign tables, but fdw_scan_tlist was added in 9.5 as part of the join-pushdown infrastructure, so my concern is that we might not be able to backpatch your patch to 9.3 and 9.4.

Another concern about this: You wrote:

> Several places seem to be assuming that fdw_scan_tlist may be used for
> a foreign scan on a simple relation, but I didn't find that actually
> happens.

Yeah, currently, postgres_fdw and file_fdw don't use that list for simple foreign table scans, but it could be used to improve the efficiency of those scans, as explained in fdwhandler.sgml:

    Another ForeignScan field that can be filled by FDWs is
    fdw_scan_tlist, which describes the tuples returned by the FDW for
    this plan node. For simple foreign table scans this can be set to
    NIL, implying that the returned tuples have the row type declared
    for the foreign table. A non-NIL value must be a target list (list
    of TargetEntrys) containing Vars and/or expressions representing
    the returned columns. This might be used, for example, to show that
    the FDW has omitted some columns that it noticed won't be needed
    for the query. Also, if the FDW can compute expressions used by the
    query more cheaply than can be done locally, it could add those
    expressions to fdw_scan_tlist. Note that join plans (created from
    paths made by GetForeignJoinPaths) must always supply fdw_scan_tlist
    to describe the set of columns they will return.

You wrote:

> I'm not sure whether the following points are valid.
>
> - If fdw_scan_tlist is used for simple relation scans, this would
>   break the case. (ExecInitForeignScan, set_foreignscan_references)

Some FDWs might already use that list for improved efficiency of simple foreign table scans, as explained above, so we should avoid breaking that.

If we take the Param-based approach suggested by Tom, I suspect there would be no need to worry about at least those things, so I'll try to update your patch as such, if there are no objections from you (or anyone else).

Best regards,
Etsuro Fujita
Re: [HACKERS] pgbench - allow to store select results into variables
>>>>> "Andrew" == Andrew Dunstan writes:

>>>> This patch contains CRLF line endings
>>>
>>> Alas, not according to "file" nor "hexdump" (only 0A, no 0D) on my
>>> local version, AFAICS.
>>>
>>> What happens on the path and what is done by mail clients depending
>>> on the mime type is another question (eg text/x-diff or text/plain).

Andrew> It's not done by my MUA, and it's present in your latest posted
Andrew> patch. If anything I'd suspect your MUA:

The patch in the original email is in text/plain with base64 transfer encoding, which means that CRLF line endings are mandatory. It's actually up to the receiving MUA (or the archives webserver) to undo that.

If the archives webserver isn't handling that then it's a bug there.

--
Andrew (irc:RhodiumToad)
Re: Uncaught PHP Exception Doctrine\DBAL\Exception\UniqueConstraintViolationException: "An exception occurred while executing 'UPDATE
On 08/14/2018 05:58 AM, Jarosław Torbicki wrote:
> Hello,
>
> I used PostgreSQL 9.3 but I executed an upgrade a few days ago. Now, I
> am using PostgreSQL 10.4 and:
>
> doctrine/annotations v1.2.7
> doctrine/cache v1.4.2
> doctrine/collections v1.3.0
> doctrine/common v2.7.3
> doctrine/dbal v2.5.13
> doctrine/doctrine-bundle v1.5.2
> doctrine/doctrine-cache-bundle v1.0.1
> doctrine/inflector v1.0.1
> doctrine/instantiator 1.0.5
> doctrine/lexer v1.0.1
> doctrine/orm v2.5.14
>
> I have a problem with a ManyToOne relation. For example, I have a main
> object with three children, and when I execute on the main object
>
> $em = $this->getDoctrine()->getManager();
> $em->merge($data);
> $em->flush();
>
> I sometimes get an ERROR message like:
>
> Uncaught PHP Exception
> Doctrine\DBAL\Exception\UniqueConstraintViolationException: "An
> exception occurred while executing 'UPDATE
>
> I get this ERROR message not for all main objects and not for all
> children. For example, the first update of a child object is ok, but on
> the second I get an error. SQL prepared by doctrine:
>
> UPDATE child_table SET id = ?, name = ?, object_name = ?, object_size
> = ? WHERE id = ?' with params ["2", "test Name object 2", "test name
> object 2", "1234", 3]
>
> In this sql doctrine tries to update the object with id = 3 using data
> from the object with id = 2. This problem didn't occur before the
> upgrade to version 10.4. Can you help me and give some tips?

This mailing list is about development of PostgreSQL, not about how to use it. Please ask in the correct forum (possibly the pgsql-general mailing list).

cheers

andrew

--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] pgbench - allow to store select results into variables
On 08/13/2018 06:30 PM, Fabien COELHO wrote:
> Hello Andrew,
>
>>> Attached is v18, another basic rebase after some perl automatic
>>> reindentation.
>>
>> This patch contains CRLF line endings
>
> Alas, not according to "file" nor "hexdump" (only 0A, no 0D) on my
> local version, AFAICS.
>
> What happens on the path and what is done by mail clients depending on
> the mime type is another question (eg text/x-diff or text/plain).

It's not done by my MUA, and it's present in your latest posted patch. If anything I'd suspect your MUA:

andrew@emma $ curl -s https://www.postgresql.org/message-id/attachment/64237/pgbench-into-19.patch | head -n 3 | od -c
0000000   d   i   f   f       -   -   g   i   t       a   /   d   o   c
0000020   /   s   r   c   /   s   g   m   l   /   r   e   f   /   p   g
0000040   b   e   n   c   h   .   s   g   m   l       b   /   d   o   c
0000060   /   s   r   c   /   s   g   m   l   /   r   e   f   /   p   g
0000100   b   e   n   c   h   .   s   g   m   l  \r  \n   i   n   d   e
0000120   x       8   8   c   f   8   b   3   9   3   3   .   .   9   4
0000140   6   f   0   8   0   0   5   d       1   0   0   6   4   4  \r
0000160  \n   -   -   -       a   /   d   o   c   /   s   r   c   /   s
0000200   g   m   l   /   r   e   f   /   p   g   b   e   n   c   h   .
0000220   s   g   m   l  \r  \n

The gzipped version you also attached did not have the CRs.

cheers

andrew

--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Uncaught PHP Exception Doctrine\DBAL\Exception\UniqueConstraintViolationException: "An exception occurred while executing 'UPDATE
Hello,

I used PostgreSQL 9.3 but I executed an upgrade a few days ago. Now, I am using PostgreSQL 10.4 and:

doctrine/annotations v1.2.7
doctrine/cache v1.4.2
doctrine/collections v1.3.0
doctrine/common v2.7.3
doctrine/dbal v2.5.13
doctrine/doctrine-bundle v1.5.2
doctrine/doctrine-cache-bundle v1.0.1
doctrine/inflector v1.0.1
doctrine/instantiator 1.0.5
doctrine/lexer v1.0.1
doctrine/orm v2.5.14

I have a problem with a ManyToOne relation. For example, I have a main object with three children, and when I execute on the main object

$em = $this->getDoctrine()->getManager();
$em->merge($data);
$em->flush();

I sometimes get an ERROR message like:

Uncaught PHP Exception Doctrine\DBAL\Exception\UniqueConstraintViolationException: "An exception occurred while executing 'UPDATE

I get this ERROR message not for all main objects and not for all children. For example, the first update of a child object is ok, but on the second I get an error. SQL prepared by doctrine:

UPDATE child_table SET id = ?, name = ?, object_name = ?, object_size = ? WHERE id = ?' with params ["2", "test Name object 2", "test name object 2", "1234", 3]

In this sql doctrine tries to update the object with id = 3 using data from the object with id = 2. This problem didn't occur before the upgrade to version 10.4. Can you help me and give some tips?

Regards,
__
Jarosław Torbicki
Analityk
Re: NetBSD vs libxml2
My opinion (if it's worth anything) is that a binary should not specify search paths for libraries other than those explicitly provided with it as part of its own package. The environment should handle shared library paths, using $LD_PATH or ldconfig or whatever. If a binary has to specify where a third-party shared library that it wishes the dynamic loader to use is located, then the environment is not set up correctly.

Geoff
Re: libpq should not look up all host addresses at once
Hello Tom,

>> As you noted in another message, a small doc update should be needed.
>
> Check. Proposed doc patch attached. (Only the last hunk is actually
> specific to this patch, the rest is cleanup that I noticed while
> looking around for possibly-relevant text.)

Doc build is ok.

Some comments that you may not find all useful; please accept my apology, it just really shows that I read your prose in some detail:-)

The mention of possible reverse dns queries has been removed... but I do not think there was any before? There could be one if only hostaddr is provided but a hostname is required by some auth method, but that does not seem to be the case according to the documentation.

I read the rationale for the host/hostaddr artificial mapping. I cannot say I'm thrilled with the result: I do not really see a setting where avoiding a DNS query is required but which still needs a hostname for auth... If you have GSSAPI or SSPI then you have an underlying network, in which a dns query should be fine. Moreover the feature's implementation is misleading, as hostaddr changes the behavior of host when both are provided. Also, "host" accepts ips anyway (although this is not documented). ISTM that we should have had only "host" for specifying the target (name, ip, path), and if really necessary an ugly "authname" directive to provide names on the side. I know, too late.

I think that in the string format

  host=foo:5433,bla:5432

could be accepted, as it is in the URL format.

Some of the changes are not directly related to the multi-host feature, and should/could be backpatched further?

>> I'd consider wrapping some of the logic. I'd check the port first,
>> then move the host resolution stuff into a function.
>
> Don't really see the value of either ...

What I see is that the host logic is rather lengthy and specific (it depends on the connection type -- name, ip, socket, including a so-nice #ifdef), distinct from the port stuff, which is dealt with quite quickly. By separating the port & host handling and putting the lengthy stuff in a function, the scope of the for loop on potential connections would be easier to read and understand, and would not require seeing the CHT_ cases to understand what is being done for managing "host" variants, which is a mere detail.

--
Fabien.
Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept
On Wed, Feb 28, 2018 at 11:24 PM, Ivan Kartyshov wrote:
> Thank you for your valuable comments. I've made a few adjustments.

Thank you for working on this!

> The main goal of my changes is to let long read-only transactions run
> on a replica if hot_standby_feedback is turned on.

FWIW the idea proposed in the thread [1], which allows us to disable the heap truncation by vacuum per table, might help with this issue. Since the original problem on that thread is a performance problem, an alternative proposal would be better there, but I think that approach would be helpful for this issue too, and it is simple. A downside of the idea is that it means additional work for the user: configuring the reloption on each table.

[1] https://www.postgresql.org/message-id/CAHGQGwE5UqFqSq1%3DkV3QtTUtXphTdyHA-8rAj4A%3DY%2Be4kyp3BQ%40mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: libpq connection timeout mismanagement
Hello Tom,

>> connect_timeout=2.9 is accepted and considered as meaning 2.
>> connect_timeout=-10 or connect_timeout=two are also accepted and mean
>> forever. Probably thanks to "atoi".
>
> Right. As before, I'm not excited about rejecting trailing junk,
> considering we never did before.

My 0.02€: accepting trailing junk is just a recipe for hiding bugs:

  sh> psql "connect_timeout=2,port=5433"
  psql> \conninfo
  ... port=5432 # port directive was silently ignored

So erroring out would be a good thing, and it could be done in a new release. At the minimum there should be a warning.

--
Fabien.
Re: Alter index rename concurrently to
Hi,

On 2018-08-14 08:44:46 +0200, Peter Eisentraut wrote:
> On 03/08/2018 15:00, Robert Haas wrote:
>> On Thu, Aug 2, 2018 at 4:44 PM, Andres Freund wrote:
>>> ISTM, if you want to increase consistency in this area, you've to go
>>> further. E.g. processing invalidations in StartTransactionCommand()
>>> in all states, which'd give you a lot more consistency.
>>
>> Hmm, that seems like a pretty good idea.
>
> That would only affect top-level commands, not things like SPI. Is that
> what we want? Or we could sprinkle additional
> AcceptInvalidationMessages() calls in spi.c.

I'd say it's not unreasonable *not* to guarantee that SPI (and various other functions) immediately take concurrent activity into account, unless locking rules guarantee that independently. Defining it as "commands sent by the client see the effects of previously issued non-conflicting DDL, others might see them earlier" doesn't seem insane.

If you want to take account of SPI and PLs it gets more complicated quickly, because we don't always parse functions afresh. So you'd need invalidations in a lot more places.

Greetings,

Andres Freund