Re: parallel append vs. simple UNION ALL
On Thu, Mar 8, 2018 at 12:27 AM, Robert Haas wrote: > New patches attached, fixing all 3 of the issues you reported: > Thanks. The new patches applied cleanly on head and fix all the reported issues. Thanks & Regards, Rajkumar Raghuwanshi QMG, EnterpriseDB Corporation
Re: [HACKERS] Partition-wise aggregation/grouping
On Wed, Mar 7, 2018 at 8:07 PM, Jeevan Chalke wrote: Here are some more review comments esp. on try_partitionwise_grouping() function. BTW name of that function doesn't go in sync with enable_partitionwise_aggregation (which btw is in sync with enable_fooagg GUCs). But it goes in sync with create_grouping_paths(). It looks like we have already confused aggregation and grouping e.g. enable_hashagg may affect path creation when there is no aggregation involved i.e. only grouping is involved but create_grouping_paths will create paths when there is no grouping involved. Generally it looks like the function names use grouping to mean both aggregation and grouping but GUCs use aggregation to mean both of those. So, the naming in this patch looks consistent with the current naming conventions. +grouped_rel->part_scheme = input_rel->part_scheme; +grouped_rel->nparts = nparts; +grouped_rel->boundinfo = input_rel->boundinfo; +grouped_rel->part_rels = part_rels; You need to set the part_exprs which will provide partition keys for this partitioned relation. I think, we should include all the part_exprs of input_rel which are part of GROUP BY clause. Since any other expressions in part_exprs are not part of GROUP BY clause, they can not appear in the targetlist without an aggregate on top. So they can't be part of the partition keys of the grouped relation. In create_grouping_paths() we fetch both partial as well as fully grouped rel for given input relation. But in case of partial aggregation, we don't need fully grouped rel since we are not computing full aggregates for the children. Since fetch_upper_rel() creates a relation when one doesn't exist, we are unnecessarily creating fully grouped rels in this case. For thousands of partitions that's a lot of memory wasted. I see a similar issue with create_grouping_paths() when we are computing only full aggregates (either because partial aggregation is not possible or because parallelism is not possible). 
In that case, we unconditionally create partially grouped rels. That too would waste a lot of memory. I think that partially_grouped_rel, when required, is partitioned irrespective of whether we do full aggregation per partition or not. So it would be good to have its part_rels and other partitioning properties set. I agree that right now we won't use this information anywhere. It may be useful in case we devise a way to use partially_grouped_rel directly without using grouped_rel for planning beyond grouping, which seems unlikely. + +/* + * Parallel aggregation requires partial target, so compute it here + * and translate all vars. For partial aggregation, we need it + * anyways. + */ +partial_target = make_partial_grouping_target(root, target, + havingQual); Don't we have this available in partially_grouped_rel? That shows one asymmetry that Robert's refactoring has introduced. We don't set the reltarget of grouped_rel but do set the reltarget of partially_grouped_rel. If the reltarget of grouped_rel changes across paths (the reason why we have not set it in grouped_rel), shouldn't the reltarget of partially grouped paths change accordingly? + +/* + * group_by_has_partkey + * + * Returns true, if all the partition keys of the given relation are part of + * the GROUP BY clauses, false otherwise. + */ +static bool +group_by_has_partkey(RelOptInfo *input_rel, PathTarget *target, + List *groupClause) We could modify this function to return the list of part_exprs which are part of the GROUP BY clause and use that as the partition keys of the grouped_rel if required. If the GROUP BY doesn't have all the partition keys, the function would return a NULL list. Right now, in case of full aggregation, partially_grouped_rel is populated with the partial paths created by adding partial aggregation to the partial paths of the input relation. But we are not trying to create partial paths by (parallel) appending the (non)partial paths from the child partially_grouped_rels. Have we thought about that?
Would such paths have different shapes from the ones that we create now and will they be better? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: csv format for psql
2018-03-07 22:16 GMT+01:00 David Fetter : > On Wed, Mar 07, 2018 at 09:37:26PM +0100, Pavel Stehule wrote: > > 2018-03-07 21:31 GMT+01:00 Daniel Verite : > > > > > David Fetter wrote: > > > > > > > We have some inconsistency here in that fewer table formats are > > > > supported, but I think asciidoc, etc., do this correctly via > > > > invocations like: > > > > > > > >psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo' > > > > > > -A is equivalent to -P format=unaligned, so in the above > > > invocation, it cancels the effect of -P format=asciidoc. > > > Anyway -P format=name on the command line > > > is the same as "\pset format name" as a > > > metacommand, so it works for all formats. > > > > > > Some formats (unaligned, html) have corresponding > > > command-line options (-A, -H), and others don't. > > > In this patch, -C is used so that csv would be in the > > > category of formats that can be switched on with the simpler > > > invocation on the command line. > > > If we don't like that, we can leave out -C for future use > > > and let users write -P format=csv. > > > That's not the best choice from my POV though, as csv > > > is a primary choice to export tabular data. > > > > > > > -C can be used for certificates or some similar. I like csv, but I am not > > sure, so it is too important to get short option (the list of free chars > > will be only shorter) > > +1 for not using up a single-letter option for this. > Is there some rule that says long-only options are disallowed? As this software matures further, we will not be able to find "intelligent" short options for lots of tasks. Regards Pavel > > Best, > David. > -- > David Fetter http://fetter.org/ > Phone: +1 415 235 3778 > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate >
Re: INOUT parameters in procedures
Hi 2018-03-08 1:53 GMT+01:00 Peter Eisentraut : > On 3/6/18 04:22, Pavel Stehule wrote: > > why just OUT variables are disallowed? > > > > The oracle initializes these values to NULL - we can do same? > > The problem is function call resolution. If we see a call like > > CALL foo(a, b, c); > > the this could be foo() with zero input and three output parameters, or > with one input parameter and two output parameters, etc. We have no > code to deal with that right now. > It looks like an error in this concept. The rules for procedure overloading should be modified so that this collision cannot happen. When I use a procedure from PL/pgSQL, it is clear: I place variables in the OUT positions. But when I call a procedure from the top level, I have to pass fake parameters to get some result.

CREATE OR REPLACE PROCEDURE proc(IN a int, OUT x int, OUT y int)
AS $$
BEGIN
  x := a * 10;
  y := a + 10;
END;
$$ LANGUAGE plpgsql;

CALL proc(10); -- makes sense

but because OUT-only variables are not possible, the definition must be changed to

CREATE OR REPLACE PROCEDURE proc(IN a int, INOUT x int, INOUT y int)

and

CALL proc(10, NULL, NULL); -- looks a little bit scary

I understand that this is not an easy solution (and it can be a topic for other releases), but I think it is solvable - it just needs a deeper change in the part where a routine is selected by its signature. Right now, that algorithm doesn't take OUT params into account. This enhancement can be interesting for some purposes (and again it can help with migration from Oracle - although these techniques are usually used inside system libraries): a) taking more info from a proc when it is required: PROCEDURE foo(a int); PROCEDURE foo(a int, OUT detail text) b) the possibility to directly specify the expected result type: PROCEDURE from_json(a json, OUT int); PROCEDURE from_json(a json, OUT date); PROCEDURE from_json(a json, OUT text) Clearly, in environments where variables are not available, these procedures cannot be called due to possible ambiguity.
This point can be closed now; I accept the technical limits. > > > Minimally this message is not too friendly, there should be hint - "only > > INOUT is suported" - but better support OUT too - from TOP OUT variables > > should not be passed. from PL should be required. > > Added a hint. > ok > > > I wrote recursive procedure. The call finished by exception. Why? > > Fixed. (memory context issue) > Tested; it is OK now. > > I added your example as a test case. > > > This issue can be detected in compile time, maybe? > > > > postgres=# create or replace procedure p(x int,inout a int, inout b > numeric) > > as $$ > > begin raise notice 'xxx % %', a, b;if (x > 1) then > > a := x / 10; > > b := x / 2; call p(b::int, a, 10); <--- can be detected in compile > time? > > end if; > > end; > > $$ language plpgsql; > > Function resolution doesn't happen at compile time. That would require > significant work in PL/pgSQL (possible perhaps, but major work). Right > now, we do parse analysis at first execution. > OK, understood. The patch looks good: all tests passed, the code is well commented, and there are tests. if (argmodes && (argmodes[i] == PROARGMODE_INOUT || argmodes[i] == PROARGMODE_OUT)) + { + Param *param; Because PROARGMODE_OUT is disallowed, this check is a little bit messy. Please add a comment. Regards Pavel > > -- > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >
Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
On Thu, Mar 8, 2018 at 12:31 PM, Amit Kapila wrote: > On Thu, Mar 8, 2018 at 11:04 AM, Pavan Deolasee > wrote: > > > > On Tue, Feb 13, 2018 at 12:41 PM, amul sul wrote: > >> > >> Thanks for the confirmation, updated patch attached. > >> > > > > I am actually very surprised that 0001-Invalidate-ip_blkid-v5.patch > does not > > do anything to deal with the fact that t_ctid may no longer point to > itself > > to mark end of the chain. I just can't see how that would work. > > > > I think it is not that patch doesn't care about the end of the chain. > For example, ctid pointing to itself is used to ensure that for > deleted rows, nothing more needs to be done like below check in the > ExecUpdate/ExecDelete code path. > Yeah, but it only looks for places where it needs to detect deleted tuples and thus wants to throw an error. I am worried about other places where it is assumed that the ctid points to a valid looking tid, self or otherwise. I see no such places being either updated or commented. Now may be there is no danger because of other protections in place, but it looks hazardous. > > I have not tested, but it seems this could be problematic, but I feel > we could deal with such cases by checking invalid block id in the > above if check. Another one such case is in EvalPlanQualFetch > > Right. > > > What happens if a partition key update deletes a row, but the operation > is > > aborted? Do we need any special handling for that case? > > > > If the transaction is aborted than future updates would update the > ctid to a new row, do you see any problem with it? > I don't know. May be there is none. But it needs to explained why it's not a problem. > > > I am actually worried that we're tinkering with ip_blkid to handle one > > corner case of detecting partition key update. This is going to change > > on-disk format and probably need more careful attention. Are we certain > that > > we would never require update-chain following when partition keys are > > updated? 
> > > > I think we should never need update-chain following when the row is > moved from one partition to another partition, otherwise, we don't > change anything on the tuple. > I am not sure I follow. I understand that it's probably a tough problem to follow the update chain from one partition to another. But why do you think we would never need that? What if someone wants to improve on the restriction this patch is imposing and actually implement partition key UPDATEs the way we do for regular tables, i.e. instead of throwing an error, we actually update/delete the row in the new partition? > > > If so, can we think about some other mechanism which actually even > > leaves behind ? I am not saying we should do > that, > > but it warrants a thought. > > > > Oh, this would much bigger disk-format change and need much more > thoughts, where will we store new partition information. > Yeah, but the disk format will probably change just once. Or maybe this can be done locally to a partitioned table without requiring any disk format changes? Like adding a nullable hidden column in each partition to store the forward pointer? Thanks, Pavan
Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
On 8 March 2018 at 12:34, Amit Kapila wrote: > On Thu, Mar 8, 2018 at 11:57 AM, Amit Khandekar > wrote: >> On 8 March 2018 at 09:15, Pavan Deolasee wrote: >>> For example, with your patches applied: >>> >>> CREATE TABLE pa_target (key integer, val text) >>> PARTITION BY LIST (key); >>> CREATE TABLE part1 PARTITION OF pa_target FOR VALUES IN (1); >>> CREATE TABLE part2 PARTITION OF pa_target FOR VALUES IN (2); >>> INSERT INTO pa_target VALUES (1, 'initial1'); >>> >>> session1: >>> BEGIN; >>> UPDATE pa_target SET val = val || ' updated by update1' WHERE key = 1; >>> UPDATE 1 >>> postgres=# SELECT * FROM pa_target ; >>> key | val >>> -+- >>>1 | initial1 updated by update1 >>> (1 row) >>> >>> session2: >>> UPDATE pa_target SET val = val || ' updated by update2', key = key + 1 WHERE >>> key = 1 >>> >>> >>> session1: >>> postgres=# COMMIT; >>> COMMIT >>> >>> >>> >>> postgres=# SELECT * FROM pa_target ; >>> key | val >>> -+- >>>2 | initial1 updated by update2 >>> (1 row) >>> >>> Ouch. The committed updates by session1 are overwritten by session2. This >>> clearly violates the rules that rest of the system obeys and is not >>> acceptable IMHO. >>> >>> Clearly, ExecUpdate() while moving rows between partitions is missing out on >>> re-constructing the to-be-updated tuple, based on the latest tuple in the >>> update chain. Instead, it's simply deleting the latest tuple and inserting a >>> new tuple in the new partition based on the old tuple. That's simply wrong. >> >> You are right. This need to be fixed. This is a different issue than >> the particular one that is being worked upon in this thread, and both >> these issues have different fixes. >> > > I also think that this is a bug in the original patch and won't be > directly related to the patch being discussed. Yes. Will submit a patch for this in a separate thread. -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Re: PATCH: psql tab completion for SELECT
Some updates on patch status: - "version-dependent-tab-completion-1.patch" by Tom Lane was committed in 722408bcd. - "psql-tab-completion-savepoint-v1.patch" by Edmund Horner is probably not needed. - "psql-select-tab-completion-v6.patch" (the latest) is still under development/review.
Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
On Thu, Mar 8, 2018 at 11:57 AM, Amit Khandekar wrote: > On 8 March 2018 at 09:15, Pavan Deolasee wrote: >> For example, with your patches applied: >> >> CREATE TABLE pa_target (key integer, val text) >> PARTITION BY LIST (key); >> CREATE TABLE part1 PARTITION OF pa_target FOR VALUES IN (1); >> CREATE TABLE part2 PARTITION OF pa_target FOR VALUES IN (2); >> INSERT INTO pa_target VALUES (1, 'initial1'); >> >> session1: >> BEGIN; >> UPDATE pa_target SET val = val || ' updated by update1' WHERE key = 1; >> UPDATE 1 >> postgres=# SELECT * FROM pa_target ; >> key | val >> -+- >>1 | initial1 updated by update1 >> (1 row) >> >> session2: >> UPDATE pa_target SET val = val || ' updated by update2', key = key + 1 WHERE >> key = 1 >> >> >> session1: >> postgres=# COMMIT; >> COMMIT >> >> >> >> postgres=# SELECT * FROM pa_target ; >> key | val >> -+- >>2 | initial1 updated by update2 >> (1 row) >> >> Ouch. The committed updates by session1 are overwritten by session2. This >> clearly violates the rules that rest of the system obeys and is not >> acceptable IMHO. >> >> Clearly, ExecUpdate() while moving rows between partitions is missing out on >> re-constructing the to-be-updated tuple, based on the latest tuple in the >> update chain. Instead, it's simply deleting the latest tuple and inserting a >> new tuple in the new partition based on the old tuple. That's simply wrong. > > You are right. This need to be fixed. This is a different issue than > the particular one that is being worked upon in this thread, and both > these issues have different fixes. > I also think that this is a bug in the original patch and won't be directly related to the patch being discussed. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: RFC: Add 'taint' field to pg_control.
On 8 March 2018 at 12:34, Tom Lane wrote: > Craig Ringer writes: > > As I understand it, because we allow multiple Pg instances on a system, > we > > identify the small sysv shmem segment we use by the postmaster's pid. If > > you remove the DirLockFile (postmaster.pid) you remove the interlock > > against starting a new postmaster. It'll think it's a new independent > > instance on the same host, make a new shmem segment and go merrily on its > > way mangling data horribly. > > Yeah. If we realized that the old shmem segment was associated with this > data directory, we could check for processes still attached to it ... but > the lock file is exactly where that association is kept. > > > It'd be nice if the OS offered us some support here. Something like > opening > > a lockfile in exclusive lock mode, then inheriting the FD and lock on all > > children, with each child inheriting the lock. So the exclusive lock > > wouldn't get released until all FDs associated with it are released. But > > AFAIK nothing like that is present, let alone portable. > > I've wondered about that too. An inheritable lock on the data directory > itself would be ideal, but that doesn't exist anywhere AFAIK. We could > imagine taking a BSD-style-flock(2) lock on some lock file, on systems that > have that; but not all do, so it couldn't be our only protection. Another > problem is that since it'd have to be an ordinary file, there'd still be a > risk of cowboy DBAs removing the lock file. Maybe we could use pg_control > as the lock file? > "The error mentioned that there was another active database using this pg_control file, so I deleted it, and now my database won't start." Like our discussion about pg_resetxlog, there's a limit to how much operator error we can guard against. More importantly we'd have to be _very_ sure it was correct or our attempts to protect the user would risk creating a major outage. 
That said, flock(2) with LOCK_NB seems to have just the semantics we want: " A single file may not simultaneously have both shared and exclusive locks. Locks created by flock() are associated with an open file description (see open(2)). This means that duplicate file descriptors (created by, for example, fork(2) or dup(2)) refer to the same lock, and this lock may be modified or released using any of these file descriptors. Furthermore, the lock is released either by an explicit LOCK_UN operation on any of these duplicate file descriptors, or when all such file descriptors have been closed. " Worth looking into, but really important not to make a mistake and make things worse. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
On Thu, Mar 8, 2018 at 11:04 AM, Pavan Deolasee wrote: > > On Tue, Feb 13, 2018 at 12:41 PM, amul sul wrote: >> >> Thanks for the confirmation, updated patch attached. >> > > I am actually very surprised that 0001-Invalidate-ip_blkid-v5.patch does not > do anything to deal with the fact that t_ctid may no longer point to itself > to mark end of the chain. I just can't see how that would work. > I think it is not that patch doesn't care about the end of the chain. For example, ctid pointing to itself is used to ensure that for deleted rows, nothing more needs to be done like below check in the ExecUpdate/ExecDelete code path. if (!ItemPointerEquals(tupleid, &hufd.ctid)) { .. } .. It will deal with such cases by checking invalidblockid before these checks. So, we should be fine in such cases. > But if it > does, it needs good amount of comments explaining why and most likely > updating comments at other places where chain following is done. For > example, how's this code in heap_get_latest_tid() is still valid? Aren't we > setting "ctid" to some invalid value here? > > 2302 /* > 2303 * If there's a valid t_ctid link, follow it, else we're done. > 2304 */ > 2305 if ((tp.t_data->t_infomask & HEAP_XMAX_INVALID) || > 2306 HeapTupleHeaderIsOnlyLocked(tp.t_data) || > 2307 ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid)) > 2308 { > 2309 UnlockReleaseBuffer(buffer); > 2310 break; > 2311 } > 2312 > 2313 ctid = tp.t_data->t_ctid; > I have not tested, but it seems this could be problematic, but I feel we could deal with such cases by checking invalid block id in the above if check. Another one such case is in EvalPlanQualFetch > This is just one example. I am almost certain there are many such cases that > will require careful attention. > Right, I think we should be able to detect and fix such cases. > What happens if a partition key update deletes a row, but the operation is > aborted? Do we need any special handling for that case? 
> If the transaction is aborted then future updates would update the ctid to a new row; do you see any problem with it? > I am actually worried that we're tinkering with ip_blkid to handle one > corner case of detecting partition key update. This is going to change > on-disk format and probably need more careful attention. Are we certain that > we would never require update-chain following when partition keys are > updated? > I think we should never need update-chain following when the row is moved from one partition to another partition; otherwise, we don't change anything on the tuple. > If so, can we think about some other mechanism which actually even > leaves behind ? I am not saying we should do that, > but it warrants a thought. > Oh, this would be a much bigger disk-format change and need much more thought about where we will store the new partition information. > May be it was discussed somewhere else and ruled > out. There were a couple of other options discussed in the original thread "UPDATE of partition key". One of them was to have an additional bit on the tuple, but we found reusing ctid a better approach. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
On 8 March 2018 at 09:15, Pavan Deolasee wrote: > For example, with your patches applied: > > CREATE TABLE pa_target (key integer, val text) > PARTITION BY LIST (key); > CREATE TABLE part1 PARTITION OF pa_target FOR VALUES IN (1); > CREATE TABLE part2 PARTITION OF pa_target FOR VALUES IN (2); > INSERT INTO pa_target VALUES (1, 'initial1'); > > session1: > BEGIN; > UPDATE pa_target SET val = val || ' updated by update1' WHERE key = 1; > UPDATE 1 > postgres=# SELECT * FROM pa_target ; > key | val > -+- >1 | initial1 updated by update1 > (1 row) > > session2: > UPDATE pa_target SET val = val || ' updated by update2', key = key + 1 WHERE > key = 1 > > > session1: > postgres=# COMMIT; > COMMIT > > > > postgres=# SELECT * FROM pa_target ; > key | val > -+- >2 | initial1 updated by update2 > (1 row) > > Ouch. The committed updates by session1 are overwritten by session2. This > clearly violates the rules that rest of the system obeys and is not > acceptable IMHO. > > Clearly, ExecUpdate() while moving rows between partitions is missing out on > re-constructing the to-be-updated tuple, based on the latest tuple in the > update chain. Instead, it's simply deleting the latest tuple and inserting a > new tuple in the new partition based on the old tuple. That's simply wrong. You are right. This needs to be fixed. This is a different issue from the particular one that is being worked on in this thread, and the two issues have different fixes. Like you said, the tuple needs to be reconstructed when ExecDelete() finds that the row has been updated by another transaction. We should send back this information from ExecDelete() (I think the tupleid parameter gets updated in this case), and then in ExecUpdate() we should goto lreplace, so that the row is fetched back, similar to how it happens when heap_update() knows that the tuple was updated. -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Re: ALTER TABLE ADD COLUMN fast default
On Tue, Mar 6, 2018 at 8:15 PM, David Rowley wrote: > On 6 March 2018 at 22:40, David Rowley wrote: >> Okay, it looks like the patch should disable physical tlists when >> there's a missing column the same way as we do for dropped columns. >> Patch attached. > > Please disregard the previous patch in favour of the attached. > OK, nice work, thanks. We're making good progress. Two of the cases I was testing have gone from worse than master to better than master. Here are the results I got, all against Tomas' 100-col 64-row table, using pgbench -T 100:

select sum(c1000) from t;
    fastdef tps = 4724.988619    master tps = 1590.843085

select c1000 from t;
    fastdef tps = 5093.667203    master tps = 2437.613368

select sum(c1000) from (select c1000 from t offset 0) x;
    fastdef tps = 3315.900091    master tps = 2067.574581

select * from t;
    fastdef tps = 107.145811     master tps = 150.207957

select sum(c1), count(c500), avg(c1000) from t;
    fastdef tps = 2304.636410    master tps = 1409.791975

select sum(c10) from t;
    fastdef tps = 4332.625917    master tps = 2208.757119

"select * from t" used to be about a wash, but with this patch it's got worse. The last two queries were worse and are now better, so that's a win. I'm going to do a test to see if I can find the break-even point between the second query and the fourth. If it turns out to be at quite a large number of columns selected then I think it might be something we can live with. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
On Sun, Mar 4, 2018 at 3:18 PM, David Gould wrote: > On Sun, 4 Mar 2018 07:49:46 -0800 > Jeff Janes wrote: > > > On Wed, Jan 17, 2018 at 4:49 PM, David Gould wrote: > ... > > > > Maybe a well-timed crash caused n_dead_tup to get reset to zero and that > is > > why autovac is not kicking in? What are the pg_stat_user_table number > and > > the state of the visibility map for your massively bloated table, if you > > still have them? > > ... > > The main pain points are that when reltuples gets inflated there is no way > to fix it, auto vacuum stops looking at the table and hand run ANALYZE > can't > reset the reltuples. The only cure is VACUUM FULL, but that is not really > practical without unacceptable amounts of downtime. > But why won't an ordinary manual VACUUM (not FULL) fix it? That seems like a critical thing to figure out. As for preventing it in the first place, based on your description of your hardware and operations, I was going to say you need to increase the max number of autovac workers, but then I remembered you from "Autovacuum slows down with large numbers of tables. More workers makes it slower" ( https://www.postgresql.org/message-id/20151030133252.3033.4249%40wrigleys.postgresql.org). So you are probably still suffering from that? Your patch from then seemed to be pretty invasive and so controversial. I had a trivial but fairly effective patch at the time, but it is now less trivial because of how shared catalogs are dealt with (commit 15739393e4c3b64b9038d75) and I haven't rebased it over that issue. Cheers, Jeff
Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
On Tue, Feb 13, 2018 at 12:41 PM, amul sul wrote: > > Thanks for the confirmation, updated patch attached. > > I am actually very surprised that 0001-Invalidate-ip_blkid-v5.patch does not do anything to deal with the fact that t_ctid may no longer point to itself to mark end of the chain. I just can't see how that would work. But if it does, it needs good amount of comments explaining why and most likely updating comments at other places where chain following is done. For example, how's this code in heap_get_latest_tid() is still valid? Aren't we setting "ctid" to some invalid value here? 2302 /* 2303 * If there's a valid t_ctid link, follow it, else we're done. 2304 */ 2305 if ((tp.t_data->t_infomask & HEAP_XMAX_INVALID) || 2306 HeapTupleHeaderIsOnlyLocked(tp.t_data) || 2307 ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid)) 2308 { 2309 UnlockReleaseBuffer(buffer); 2310 break; 2311 } 2312 2313 ctid = tp.t_data->t_ctid; This is just one example. I am almost certain there are many such cases that will require careful attention. What happens if a partition key update deletes a row, but the operation is aborted? Do we need any special handling for that case? I am actually worried that we're tinkering with ip_blkid to handle one corner case of detecting partition key update. This is going to change on-disk format and probably need more careful attention. Are we certain that we would never require update-chain following when partition keys are updated? If so, can we think about some other mechanism which actually even leaves behind ? I am not saying we should do that, but it warrants a thought. May be it was discussed somewhere else and ruled out. I happened to notice this patch because of the bug I encountered. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Testbed for predtest.c ... and some arguable bugs therein
In the thread about being able to prove constraint exclusion from an IN clause mentioning a NULL, https://www.postgresql.org/message-id/flat/3bad48fc-f257-c445-feeb-8a2b2fb62...@lab.ntt.co.jp I expressed concern about whether there were existing bugs in predtest.c given the lack of clarity of the comments around recent changes to it. I also noticed that coverage.postgresql.org shows we don't have great test coverage for it. This led me to feel that it'd be worth having a testbed that would allow directly exercising the predtest code, rather than having to construct queries whose plans would change depending on the outcome of a predtest proof. A bit of hacking later, I have the attached. The set of test cases it includes at the moment were mostly developed with an eye to getting to full code coverage of predtest.c, but we could add more later. What's really interesting is that it proves that the "weak refutation" logic, i.e. predicate_refuted_by() with clause_is_check = true, is not self-consistent. Now as far as I can tell, we are not using that option anywhere yet, so this is just a latent problem, not a live one. Also, it's not very clear what semantics we'd be needing if we did have a use for the case. Presumably the starting assumption is "clause does not return false", but do we need to prove "predicate returns false", or just "predicate does not return true"? I'm not sure, TBH, but if it's the former then we're going to have results like "X does not refute NOT X", which seems weird and is certainly not how that code acts now. So I made the testbed assume that we're supposed to prove "predicate does not return true", and the conclusion is that the code mostly behaves that way, but there are cases where it incorrectly claims a proof, and more cases where it fails to prove relationships it perhaps could. I'm not sure that that's worth fixing right now.
Instead I'm tempted to revert the addition of the clause_is_check argument to predicate_refuted_by, on the grounds that it's both broken and currently unnecessary. Anyway, barring objections, I'd like to push this, and then we can use it to carry a test for the null-in-IN fix whenever that lands. regards, tom lane diff --git a/src/test/modules/test_predtest/.gitignore b/src/test/modules/test_predtest/.gitignore index ...5dcb3ff . *** a/src/test/modules/test_predtest/.gitignore --- b/src/test/modules/test_predtest/.gitignore *** *** 0 --- 1,4 + # Generated subdirectories + /log/ + /results/ + /tmp_check/ diff --git a/src/test/modules/test_predtest/Makefile b/src/test/modules/test_predtest/Makefile index ...1b50fa3 . *** a/src/test/modules/test_predtest/Makefile --- b/src/test/modules/test_predtest/Makefile *** *** 0 --- 1,21 + # src/test/modules/test_predtest/Makefile + + MODULE_big = test_predtest + OBJS = test_predtest.o $(WIN32RES) + PGFILEDESC = "test_predtest - test code for optimizer/util/predtest.c" + + EXTENSION = test_predtest + DATA = test_predtest--1.0.sql + + REGRESS = test_predtest + + ifdef USE_PGXS + PG_CONFIG = pg_config + PGXS := $(shell $(PG_CONFIG) --pgxs) + include $(PGXS) + else + subdir = src/test/modules/test_predtest + top_builddir = ../../../.. + include $(top_builddir)/src/Makefile.global + include $(top_srcdir)/contrib/contrib-global.mk + endif diff --git a/src/test/modules/test_predtest/README b/src/test/modules/test_predtest/README index ...2c9bec0 . *** a/src/test/modules/test_predtest/README --- b/src/test/modules/test_predtest/README *** *** 0 --- 1,28 + test_predtest is a module for checking the correctness of the optimizer's + predicate-proof logic, in src/backend/optimizer/util/predtest.c. + + The module provides a function that allows direct application of + predtest.c's exposed functions, predicate_implied_by() and + predicate_refuted_by(), to arbitrary boolean expressions, with direct + inspection of the results. 
This could be done indirectly by checking + planner results, but it can be difficult to construct end-to-end test + cases that prove that the expected results were obtained. + + In general, the use of this function is like + select * from test_predtest('query string') + where the query string must be a SELECT returning two boolean + columns, for example + + select * from test_predtest($$ + select x, not x + from (values (false), (true), (null)) as v(x) + $$); + + The function parses and plans the given query, and then applies the + predtest.c code to the two boolean expressions in the SELECT list, to see + if the first expression can be proven or refuted by the second. It also + executes the query, and checks the resulting rows to see whether any + claimed implication or refutation relationship actually holds. If the + query is designed to exercise the expressions on a full set of possible + input values, as in the example above, then this provides a mechanical + cross-check as to whether the p
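The implication and "weak refutation" semantics being debated can be modeled mechanically over three-valued SQL booleans, which is essentially the cross-check the module performs by executing the query. This is a hedged, self-contained C sketch — all names (`TriBool`, `implies_row`, etc.) are invented here and it mirrors only the truth-table cross-check idea, not predtest.c's actual proof machinery.

```c
#include <assert.h>
#include <stdbool.h>

/* Three-valued SQL booleans, as exercised by the testbed's VALUES lists. */
typedef enum { TV_FALSE, TV_TRUE, TV_NULL } TriBool;

/* Strong implication: whenever the clause returns true,
 * the predicate returns true. */
static bool
implies_row(TriBool clause, TriBool predicate)
{
    return clause != TV_TRUE || predicate == TV_TRUE;
}

/* The "weak refutation" reading the testbed settles on: assuming the
 * clause does not return false, the predicate does not return true. */
static bool
weakly_refutes_row(TriBool clause, TriBool predicate)
{
    return clause == TV_FALSE || predicate != TV_TRUE;
}

/* Check a claimed relationship against every row of a truth table --
 * the mechanical cross-check described in the README. */
static bool
holds_for_all(bool (*rel)(TriBool, TriBool),
              const TriBool *clause, const TriBool *pred, int nrows)
{
    for (int i = 0; i < nrows; i++)
        if (!rel(clause[i], pred[i]))
            return false;
    return true;
}
```

Under this reading, "x refutes NOT x" does hold over {false, true, null}, matching Tom's observation that the opposite reading would weirdly yield "X does not refute NOT X".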
Re: [HACKERS] Another oddity in handling of WCO constraints in postgres_fdw
On Wed, Mar 7, 2018 at 8:55 AM, Stephen Frost wrote: > Greetings Robert, Ashutosh, Arthur, Etsuro, all, > > * Arthur Zakirov (a.zaki...@postgrespro.ru) wrote: >> On Tue, Mar 06, 2018 at 08:09:50PM +0900, Etsuro Fujita wrote: >> > Agreed. I added a comment to that function. I think that that comment in >> > combination with changes to the FDW docs in the patch would help FDW >> > authors >> > understand why that is needed. Please find attached an updated version of >> > the patch. >> >> Thank you. >> >> All tests pass, the documentation builds. There was the suggestion [1] >> of a different approach, but the patch fixes the issue in a much simpler >> way. >> >> Marked as "Ready for Committer". >> >> 1 - >> https://www.postgresql.org/message-id/20171005.200631.134118679.horiguchi.kyotaro%40lab.ntt.co.jp > > Thanks, I've looked through this patch and thread again and continue to > feel that this is both a good and sensible improvement and that the patch > is in pretty good shape. > > The remaining question is if the subsequent discussion has swayed the > opinion of Robert and Ashutosh. If we can get agreement that these > semantics are acceptable and an improvement over the status quo then I'm > happy to try and drive this patch to commit. > > Robert, Ashutosh?

If there is a local constraint on the foreign table, we don't check it. So, a row that was inserted through this foreign table may not show up when selected from the foreign table. Applying the same logic to a WCO on a view over the foreign table, it should be fine if a row inserted through the view doesn't show up in the view. Somebody who created the view knew that there is a foreign table underneath. Stephen said [1] that the view is local and, irrespective of what's underneath it, it should obey WCO. That seems to be a fair point when considered alone, but in the above context it doesn't look as fair.
Etsuro said [2] that WCO constraints cannot be implemented on the foreign server while normal check constraints can be, and he provides an example of that in [3]. But I think that example is going in the wrong direction. For local constraints to be enforced, we use remote constraints. For local WCO we need to use remote WCO. That might mean creating many foreign tables pointing to the same local table on the foreign server through many views, but it's not impossible.

[1] https://www.postgresql.org/message-id/20180117130021.GC2416%40tamriel.snowman.net [2] https://www.postgresql.org/message-id/5A058F21.2040201%40lab.ntt.co.jp [3] https://www.postgresql.org/message-id/a3955a1d-ad07-5b0a-7618-b6ef5ff0e1c5%40lab.ntt.co.jp -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: RFC: Add 'taint' field to pg_control.
Craig Ringer writes: > As I understand it, because we allow multiple Pg instances on a system, we > identify the small sysv shmem segment we use by the postmaster's pid. If > you remove the DirLockFile (postmaster.pid) you remove the interlock > against starting a new postmaster. It'll think it's a new independent > instance on the same host, make a new shmem segment and go merrily on its > way mangling data horribly.

Yeah. If we realized that the old shmem segment was associated with this data directory, we could check for processes still attached to it ... but the lock file is exactly where that association is kept.

> It'd be nice if the OS offered us some support here. Something like opening > a lockfile in exclusive lock mode, then inheriting the FD and lock on all > children, with each child inheriting the lock. So the exclusive lock > wouldn't get released until all FDs associated with it are released. But > AFAIK nothing like that is present, let alone portable.

I've wondered about that too. An inheritable lock on the data directory itself would be ideal, but that doesn't exist anywhere AFAIK. We could imagine taking a BSD-style flock(2) lock on some lock file, on systems that have that; but not all do, so it couldn't be our only protection. Another problem is that since it'd have to be an ordinary file, there'd still be a risk of cowboy DBAs removing the lock file. Maybe we could use pg_control as the lock file?

regards, tom lane
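The flock(2) idea Tom mentions can be sketched in a few lines. This is an illustrative, Linux-oriented sketch under stated assumptions — the function name and lock-file path are invented, it is not how PostgreSQL's interlock actually works, and the flock dependency is precisely the part Tom notes isn't portable everywhere. The appeal is that the lock rides on the open file description, so it is inherited across fork() and only drops when the last descriptor referring to it is closed, i.e. when the postmaster *and* every leftover child have exited.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Sketch: take a kernel-enforced exclusive lock on a lock file.
 * Returns the open fd (kept open for the process's lifetime, and
 * inherited by children), or -1 if someone still holds the lock. */
static int
acquire_datadir_lock(const char *lockpath)
{
    int fd = open(lockpath, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;

    /* Non-blocking exclusive lock: fails immediately if another
     * process -- or an orphaned child -- still holds it. */
    if (flock(fd, LOCK_EX | LOCK_NB) < 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}
```

Unlike postmaster.pid, a cowboy DBA deleting the file doesn't release the lock held by live processes, though a fresh postmaster opening the re-created file would no longer conflict — which is why this couldn't be the only protection.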
Re: PATCH: psql tab completion for SELECT
New patch that fixes a little bug with appending NULL addons to schema queries. psql-select-tab-completion-v6.patch Description: Binary data
Re: Protect syscache from bloating with negative cache entries
Kyotaro HORIGUCHI writes: > At Thu, 8 Mar 2018 00:28:04 +, "Tsunakawa, Takayuki" > wrote in > <0A3221C70F24FB45833433255569204D1F8FF0D9@G01JPEXMBYT05> >> Yes. We are now facing the problem of too much memory use by PostgreSQL, >> where some applications randomly access about 200,000 tables. It is >> estimated based on a small experiment that each backend will use several to >> ten GBs of local memory for CacheMemoryContext. The total memory use will >> become over 1 TB when the expected maximum connections are used. >> >> I haven't looked at this patch, but does it evict all kinds of entries in >> CacheMemoryContext, i.e. relcache, plancache, etc? > This works only for syscaches, which could bloat with entries for > nonexistent objects. > The plan cache is an utterly different thing. It is abandoned at the > end of a transaction or the like.

When I was at Salesforce, we had *substantial* problems with plancache bloat. The driving factor there was plans associated with plpgsql functions, which Salesforce had a huge number of. In an environment like that, there would be substantial value in being able to prune both the plancache and plpgsql's function cache. (Note that neither of those things are "abandoned at the end of a transaction".)

> The relcache is not based on catcache and is out of the scope of this > patch since it doesn't get bloated with nonexistent entries. It > uses dynahash and we could introduce a similar feature to it if > we are willing to cap relcache size.

I think if the case of concern is an application with 200,000 tables, it's just nonsense to claim that relcache size isn't an issue.

In short, it's not really apparent to me that negative syscache entries are the major problem of this kind. I'm afraid that you're drawing very large conclusions from a specific workload. Maybe we could fix that workload some other way.

regards, tom lane
Re: RFC: Add 'taint' field to pg_control.
On 8 March 2018 at 10:18, Andres Freund wrote: > > > On March 7, 2018 5:51:29 PM PST, Craig Ringer > wrote: > >My favourite remains an organisation that kept "fixing" an issue by > >kill > >-9'ing the postmaster and removing postmaster.pid to make it start up > >again. Without killing all the leftover backends. Of course, the system > >kept getting more unstable and broken, so they did it more and more > >often. > >They were working on scripting it when they gave up and asked for help. > > Maybe I'm missing something, but that ought to not work. The shmem segment > that we keep around would be a conflict, no? > > As I understand it, because we allow multiple Pg instances on a system, we identify the small sysv shmem segment we use by the postmaster's pid. If you remove the DirLockFile (postmaster.pid) you remove the interlock against starting a new postmaster. It'll think it's a new independent instance on the same host, make a new shmem segment and go merrily on its way mangling data horribly. See CreateLockFile(). Also 7e2a18a9161 . In particular src/backend/utils/init/miscinit.c +938, if (isDDLock) { if (PGSharedMemoryIsInUse(id1, id2)) ereport(FATAL, (errcode(ERRCODE_LOCK_FILE_EXISTS), errmsg("pre-existing shared memory block " "(key %lu, ID %lu) is still in use", id1, id2), errhint("If you're sure there are no old " "server processes still running, remove " "the shared memory block " "or just delete the file \"%s\".", filename))); } I still think that error is a bit optimistic, and should really say "make very sure there are no 'postgres' processes associated with this data directory, then ...' It'd be nice if the OS offered us some support here. Something like opening a lockfile in exclusive lock mode, then inheriting the FD and lock on all children, with each child inheriting the lock. So the exclusive lock wouldn't get released until all FDs associated with it are released. But AFAIK nothing like that is present, let alone portable. 
-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Protect syscache from bloating with negative cache entries
Hello, At Thu, 8 Mar 2018 00:28:04 +, "Tsunakawa, Takayuki" wrote in <0A3221C70F24FB45833433255569204D1F8FF0D9@G01JPEXMBYT05> > From: Alvaro Herrera [mailto:alvhe...@alvh.no-ip.org] > > The thing that comes to mind when reading this patch is that some time ago > > we made fun of other database software, "they are so complicated to > > configure, > > they have some magical settings that few people understand how to set". > > Postgres was so much better because it was simple to set up, no magic crap. > > But now it becomes apparent that that only was so because Postgres sucked, > > ie., we hadn't yet gotten to the point where we > > *needed* to introduce settings like that. Now we finally are? > > Yes. We are now facing the problem of too much memory use by PostgreSQL, > where some applications randomly access about 200,000 tables. It is > estimated based on a small experiment that each backend will use several to > ten GBs of local memory for CacheMemoryContext. The total memory use will > become over 1 TB when the expected maximum connections are used. > > I haven't looked at this patch, but does it evict all kinds of entries in > CacheMemoryContext, i.e. relcache, plancache, etc?

This works only for syscaches, which could bloat with entries for nonexistent objects.

The plan cache is an utterly different thing. It is abandoned at the end of a transaction or the like.

The relcache is not based on catcache and is out of the scope of this patch since it doesn't get bloated with nonexistent entries. It uses dynahash and we could introduce a similar feature to it if we are willing to cap relcache size.

regards, -- Kyotaro Horiguchi NTT Open Source Software Center
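Capping a cache's size, as floated here for the relcache, would amount to something like least-recently-used eviction. The toy sketch below illustrates only that idea — every name is invented, the "hash" lookup is a plain list walk for brevity, and PostgreSQL's real catcache/dynahash machinery is far more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy LRU-capped cache: entries live on a doubly linked list ordered
 * by recency; once the count exceeds a limit, the coldest entry goes. */
typedef struct SketchEntry
{
    int         key;
    struct SketchEntry *prev;
    struct SketchEntry *next;
} SketchEntry;

typedef struct
{
    SketchEntry *head;          /* most recently used */
    SketchEntry *tail;          /* least recently used */
    int         nentries;
    int         limit;          /* the hypothetical size cap */
} SketchCache;

static void
cache_unlink(SketchCache *c, SketchEntry *e)
{
    if (e->prev) e->prev->next = e->next; else c->head = e->next;
    if (e->next) e->next->prev = e->prev; else c->tail = e->prev;
}

static void
cache_push_head(SketchCache *c, SketchEntry *e)
{
    e->prev = NULL;
    e->next = c->head;
    if (c->head) c->head->prev = e; else c->tail = e;
    c->head = e;
}

static SketchEntry *
cache_lookup(SketchCache *c, int key)
{
    for (SketchEntry *e = c->head; e != NULL; e = e->next)
        if (e->key == key)
        {
            cache_unlink(c, e);      /* a hit re-warms the entry */
            cache_push_head(c, e);
            return e;
        }
    return NULL;
}

static void
cache_add(SketchCache *c, int key)
{
    SketchEntry *e = calloc(1, sizeof(SketchEntry));

    e->key = key;
    cache_push_head(c, e);
    if (++c->nentries > c->limit)    /* evict the coldest entry */
    {
        SketchEntry *victim = c->tail;

        cache_unlink(c, victim);
        free(victim);
        c->nentries--;
    }
}
```

Tom's objection in the adjacent reply is essentially about *which* caches such a cap should apply to (plancache, relcache, plpgsql's function cache), not about the eviction mechanism itself.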
Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
On Wed, Feb 28, 2018 at 12:38 PM, Rajkumar Raghuwanshi < rajkumar.raghuwan...@enterprisedb.com> wrote: > On Wed, Feb 14, 2018 at 5:44 PM, Amit Kapila > wrote: > >> +# Concurrency error from GetTupleForTrigger >> +# Concurrency error from ExecLockRows >> >> I think you don't need to mention above sentences in spec files. >> Apart from that, your patch looks good to me. I have marked it as >> Ready For Committer. >> > > I too have tested this feature with the isolation framework and this looks good > to me. > >

It looks to me that we are trying to fix only one issue here with concurrent updates. What happens if a non-partition key is first updated and then a second session updates the partition key? For example, with your patches applied:

CREATE TABLE pa_target (key integer, val text) PARTITION BY LIST (key); CREATE TABLE part1 PARTITION OF pa_target FOR VALUES IN (1); CREATE TABLE part2 PARTITION OF pa_target FOR VALUES IN (2); INSERT INTO pa_target VALUES (1, 'initial1');

session1: BEGIN; UPDATE pa_target SET val = val || ' updated by update1' WHERE key = 1; UPDATE 1 postgres=# SELECT * FROM pa_target ; key | val -+- 1 | initial1 updated by update1 (1 row)

session2 (blocks waiting for session1): UPDATE pa_target SET val = val || ' updated by update2', key = key + 1 WHERE key = 1

session1: postgres=# COMMIT; COMMIT postgres=# SELECT * FROM pa_target ; key | val -+- 2 | initial1 updated by update2 (1 row)

Ouch. The committed updates by session1 are overwritten by session2. This clearly violates the rules that the rest of the system obeys and is not acceptable IMHO.

Clearly, ExecUpdate() while moving rows between partitions is missing out on re-constructing the to-be-updated tuple based on the latest tuple in the update chain. Instead, it's simply deleting the latest tuple and inserting a new tuple in the new partition based on the old tuple. That's simply wrong. I haven't really thought carefully to see if this should be a separate patch, but it warrants attention.
We should at least think through all the different concurrency aspects of partition key updates and aim for a holistic solution, instead of fixing one problem at a time. This probably also shows that isolation tests for partition key updates are either missing (I haven't checked) or they need more work. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
On Fri, Mar 2, 2018 at 5:17 PM, Tom Lane wrote: > (1) do we really want to go over to treating ANALYZE's tuple density > result as gospel, contradicting the entire thrust of the 2011 discussion? > >> This table's reltuples is 18 times the actual row count. It will never >> converge >> because with 5953 pages analyze can only adjust reltuples by 0.0006 each >> time. > > But by the same token, analyze only looked at 0.0006 of the pages. It's > nice that for you, that's enough to get a robust estimate of the density > everywhere; but I have a nasty feeling that that won't hold good for > everybody. The whole motivation of the 2011 discussion, and the issue > that is also seen in some other nearby discussions, is that the density > can vary wildly.

I think that viewing ANALYZE's result as fairly authoritative is probably a good idea. If ANALYZE looked at only 0.0006 of the pages, that's because we decided that 0.0006 of the pages were all it needed to look at in order to come up with good estimates. Having decided that, we should not turn around and decide that they are 99.94% bunk.

Now, it could be that there are data sets out there where the number of tuples per page varies widely between different parts of the table, and therefore sampling 0.0006 of the pages can return quite different estimates depending on which ones we happen to pick. However, that's a lot like saying that 0.0006 of the pages isn't really enough, and maybe the solution is to sample more. Still, it doesn't seem unreasonable to do some kind of smoothing, where we set the new estimate = (newly computed estimate * X) + (previous estimate * (1 - X)) where X might be 0.25 or whatever; perhaps X might even be configurable.

One thing to keep in mind is that VACUUM will, in many workloads, tend to scan the same parts of the table over and over again. For example, consider a database of chess players which is regularly updated with new ratings information.
The records for active players will be updated frequently, but the ratings for deceased players will rarely change. Living but inactive players may occasionally become active again, or may be updated occasionally for one reason or another. So, VACUUM will keep scanning the pages that contain records for active players but will rarely or never be asked to scan the pages for dead players. If it so happens that these different groups of players have rows of varying width -- perhaps we store more detailed data about newer players but don't have full records for older ones -- then the overall tuple density estimate will come to resemble more and more the density of the rows that are actively being updated, rather than the overall density of the whole table.

Even if all the tuples are the same width, the same thing might happen in a workload that typically inserts a record, updates it N times, and then leaves it fixed after that. Suppose we can fit 100 tuples into a page. On pages where all of the records have reached their final state, there will be 100 tuples. But on pages where updates are still happening there will -- after VACUUM -- be fewer than 100 tuples per page, because some fraction of the tuples on the page were dead row versions. That's why we were vacuuming them. Suppose typically half the tuples on each page are getting removed by VACUUM. Then, over time, if only VACUUM is ever run, our estimate of tuples per page will converge to 50, but in reality, as the table grows, the real number converges to 100.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
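The smoothing rule Robert floats — new estimate = (newly computed estimate * X) + (previous estimate * (1 - X)) — is a one-liner in code. The sketch below is an illustration, not anything from the actual patch; the function name is invented, and X = 0.25 is just the value mentioned in the mail. Repeated application converges geometrically toward the fresh observation, which is both the feature (resilience to one bad sample) and the caveat Robert raises (a persistently biased VACUUM view converges to the wrong answer).

```c
#include <assert.h>

/* Exponential moving average over density estimates, as proposed:
 * weight x on the fresh estimate, (1 - x) on the stored reltuples. */
static double
smoothed_reltuples(double new_estimate, double old_estimate, double x)
{
    return new_estimate * x + old_estimate * (1.0 - x);
}
```

With x = 0.25, a table whose stored density is 50 tuples/page and whose fresh sample says 100 moves to 62.5 in one step, and to within a rounding error of 100 after a few dozen consecutive samples.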
Re: Server won't start with fallback setting by initdb.
On Wed, Mar 7, 2018 at 6:43 PM, Michael Paquier wrote: > On Wed, Mar 07, 2018 at 06:39:32PM -0500, Tom Lane wrote: >> OK, seems like I'm on the short end of that vote. I propose to push the >> GUC-crosschecking patch I posted yesterday, but not the default-value >> change, and instead push Kyotaro-san's initdb change. Should we back-patch >> these things to v10 where the problem appeared? > > I would vote for a backpatch. If anybody happens to run initdb on v10 > and gets max_connections to 10, that would immediately cause a failure. > We could also wait for sombody to actually complain about that, but a > bit of prevention does not hurt to ease future user experience on this > released version. In theory, back-patching the GUC-crosschecking patch could cause the cluster to fail to restart after the upgrade. It's pretty unlikely. We have to postulate someone with, say, default values but for max_connections=12. But it's not impossible. I would be inclined to back-patch the increase in the max_connections fallback from 10 to 20 because that fixes a real, if unlikely, failure mode, but treat the GUC cross-checking stuff as a master-only improvement. Although it's unlikely to hurt many people, there's no real upside. Nobody is going to say "boy, it's a good thing they tidied that GUC cross-checking in the latest major release -- that really saved my bacon!". Nothing is really broken as things stand. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
On Tue, 06 Mar 2018 11:16:04 -0500 Tom Lane wrote: > so that we can decide whether this bug is bad enough to justify > back-patching a behavioral change. I remain concerned that the proposed > fix is too simplistic and will have some unforeseen side-effects, so > I'd really rather just put it in HEAD and let it go through a beta test > cycle before it gets loosed on the world.

It happens to us fairly regularly and causes lots of problems. However, I'm agreeable to putting it in HEAD for now; my client can build from source to pick up this patch until that ships, but doesn't want to maintain their own fork forever. That said, if it does get through beta I'd hope we could back-patch at that time.

-dg -- David Gould da...@sonic.net If simplicity worked, the world would be overrun with insects.
Re: Some message fixes
At Wed, 7 Mar 2018 07:10:34 -0300, Alvaro Herrera wrote in <20180307101034.l7z7kqwqfkjg6c2p@alvherre.pgsql> > Kyotaro HORIGUCHI wrote: > > > 1. some messages are missing partitioned table/index .. > I *think* the idea here is that a partitioned table is a table, so there > is no need to say "foo is not a table or partitioned table". We only > mention partitioned tables when we want to make a distinction between > those that are partitioned and those that aren't.

I agree with the direction, and the code seems to follow it. I should have looked wider. Thanks.

> > 2. GUC comment for autovacuum_work_mem has a typo, "max num of > >parallel workers *than* can be..". (Third attached) > > Yeah, pushed, though it's max_parallel_workers, not autovacuum_work_mem.

Ugh. Inconsistency within a line... My fingers may have decided to do something different from my brain. Anyway, thank you for committing.

regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: RFC: Add 'taint' field to pg_control.
On March 7, 2018 5:51:29 PM PST, Craig Ringer wrote: >My favourite remains an organisation that kept "fixing" an issue by >kill >-9'ing the postmaster and removing postmaster.pid to make it start up >again. Without killing all the leftover backends. Of course, the system >kept getting more unstable and broken, so they did it more and more >often. >They were working on scripting it when they gave up and asked for help. Maybe I'm missing something, but that ought to not work. The shmem segment that we keep around would be a conflict, no? Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: FOR EACH ROW triggers on partitioned tables
Alvaro Herrera wrote: > I reserve the right to revise this further, as I'm going to spend a > couple of hours looking at it this afternoon, particularly to see how > concurrent DDL behaves, but I don't see anything obviously wrong with > it. I do now. TLDR; I'm afraid this cute idea crashed and burned, so I'm back to the idea of just cloning the pg_trigger row for each partition. The reason for the failure is pg_trigger->tgqual, which is an expression tree. In most cases, when set, that expression will contain references to columns of the table, in the form of a varattno. But this varattno references the column number of the partitioned table; and if the partition happens to require some attribute mapping, we're screwed because there is no way to construct that without forming the partitioned table's tuple descriptor. But we can't do that without grabbing a lock on the partitioned table; and we can't do that because we would incur the deadlock risk Robert was talking about. An example that causes the problem is: create table parted_irreg (fd int, a int, fd2 int, b text) partition by range (b); alter table parted_irreg drop column fd, drop column fd2; create table parted1_irreg (b text, fd int, a int); alter table parted1_irreg drop column fd; alter table parted_irreg attach partition parted1_irreg for values from ('') to (''); create trigger parted_trig after insert on parted_irreg for each row when (new.a % 1 = 0) execute procedure trigger_notice_irreg(); insert into parted_irreg values (1, 'aardvark'); insert into parted1_irreg values ('aardwolf', 2); drop table parted_irreg; drop function trigger_notice_irreg(); Both inserts fail thusly: ERROR: attribute 2 of type parted1_irreg has been dropped Now, I can fix the first failure by taking advantage of ResultRelInfo->ri_PartitionRoot during trigger execution; it's easy and trouble-free to call map_variable_attnos() using that relation. 
But in the second insert, ri_PartitionRoot is null (because of inserting into the partition directly), so we have no relation to refer to for map_variable_attnos(). I think it gets worse: if you have a three-level partitioning scheme, and define the trigger in the second one, there is no relation either.

Another option, I think, would be to always keep in the trigger descriptor ("somehow") an open Relation on which the trigger is defined. But this has all sorts of problems also, so I'm not doing that. I guess another option is to store a column map somewhere.

So, unless someone has a brilliant idea on how to construct a column mapping from partitioned table to partition, I'm going back to the design I was proposing earlier, i.e., creating individual pg_trigger rows for each partition that are essentially adjusted copies of the ones for the partitioned table. The only missing thing in that one was having ALTER TABLE ENABLE/DISABLE for a trigger on the partitioned table cascade to the partitions; I'll see about that.

-- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
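The attribute-number translation Alvaro is wrestling with can be illustrated with a toy map. This sketch invents all of its names (`translate_attno` and the map array); it only models the role that the conversion map consumed by map_variable_attnos() plays, using the attnos from the parted_irreg example above. Building such a map requires both tuple descriptors — which is exactly what can't be obtained without locking the parent.

```c
#include <assert.h>

/* map[i] holds the child attno corresponding to parent attno i + 1;
 * 0 means there is no corresponding child column (e.g. dropped).
 * Returns the translated attno, or 0 if the parent attno is unmapped. */
static int
translate_attno(const int *map, int map_len, int parent_attno)
{
    if (parent_attno < 1 || parent_attno > map_len)
        return 0;
    return map[parent_attno - 1];
}
```

In the example, parted_irreg's live columns after the drops are a (attno 2) and b (attno 4), while parted1_irreg has b at attno 1 and a at attno 3. Using the parent's varattno 2 raw against the partition lands on the partition's dropped column — precisely the "attribute 2 of type parted1_irreg has been dropped" error shown.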
Re: pgstat_report_activity() and parallel CREATE INDEX (was: Parallel index creation & pg_stat_activity)
On Thu, Mar 1, 2018 at 2:47 PM, Peter Geoghegan wrote: > No. Just an oversight. Looks like _bt_parallel_build_main() should > call pgstat_report_activity(), just like ParallelQueryMain(). > > I'll come up with a patch soon. Attached patch has parallel CREATE INDEX propagate debug_query_string to workers. Workers go on to use this string as their own debug_query_string, as well as registering it using pgstat_report_activity(). Parallel CREATE INDEX pg_stat_activity entries will have a query text, too, which addresses Phil's complaint. I wasn't 100% sure if I should actually show the leader's debug_query_string within worker pg_stat_activity entries, since that isn't what parallel query uses (the QueryDesc/Estate query string in shared memory is *restored* as the debug_query_string for a parallel query worker, though). I eventually decided that this debug_query_string approach was okay. There is nothing remotely like a QueryDesc or EState passed to btbuild(). I can imagine specific issues with what I've done, such as a CREATE EXTENSION command that contains a CREATE INDEX, and yet shows a CREATE EXTENSION in pg_stat_activity for parallel workers. However, that's no different to what you'd see with a serial index build. Existing tcop/postgres.c pgstat_report_activity() calls are aligned with setting debug_query_string. -- Peter Geoghegan From 1c380b337be86bb3c69cf70b39da1d3dd1fba18a Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Wed, 7 Mar 2018 14:22:56 -0800 Subject: [PATCH] Report query text in parallel CREATE INDEX workers. Commit 4c728f38297 established that parallel query should propagate a debug_query_string to worker processes, and display query text in worker pg_stat_activity entries. Parallel CREATE INDEX should follow this precedent. This fixes an oversight in commit 9da0cc35. 
Author: Peter Geoghegan Reported-By: Phil Florent Discussion: https://postgr.es/m/cah2-wzkryapcqohbjkudkfni-hgfny31yjcbm-yp5ho-71i...@mail.gmail.com --- src/backend/access/nbtree/nbtsort.c | 30 +- 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c index f0c276b52a..098e0ce1be 100644 --- a/src/backend/access/nbtree/nbtsort.c +++ b/src/backend/access/nbtree/nbtsort.c @@ -86,6 +86,7 @@ #define PARALLEL_KEY_BTREE_SHARED UINT64CONST(0xA001) #define PARALLEL_KEY_TUPLESORT UINT64CONST(0xA002) #define PARALLEL_KEY_TUPLESORT_SPOOL2 UINT64CONST(0xA003) +#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xA004) /* * DISABLE_LEADER_PARTICIPATION disables the leader's participation in @@ -1195,6 +1196,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request) BTSpool*btspool = buildstate->spool; BTLeader *btleader = (BTLeader *) palloc0(sizeof(BTLeader)); bool leaderparticipates = true; + char *sharedquery; + int querylen; #ifdef DISABLE_LEADER_PARTICIPATION leaderparticipates = false; @@ -1223,9 +1226,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request) snapshot = RegisterSnapshot(GetTransactionSnapshot()); /* - * Estimate size for at least two keys -- our own - * PARALLEL_KEY_BTREE_SHARED workspace, and PARALLEL_KEY_TUPLESORT - * tuplesort workspace + * Estimate size for our own PARALLEL_KEY_BTREE_SHARED workspace, and + * PARALLEL_KEY_TUPLESORT tuplesort workspace */ estbtshared = _bt_parallel_estimate_shared(snapshot); shm_toc_estimate_chunk(&pcxt->estimator, estbtshared); @@ -1234,7 +1236,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request) /* * Unique case requires a second spool, and so we may have to account for - * a third shared workspace -- PARALLEL_KEY_TUPLESORT_SPOOL2 + * another shared workspace for that -- PARALLEL_KEY_TUPLESORT_SPOOL2 */ if (!btspool->isunique) shm_toc_estimate_keys(&pcxt->estimator, 2); 
@@ -1244,6 +1246,11 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request) shm_toc_estimate_keys(&pcxt->estimator, 3); } + /* Finally, estimate PARALLEL_KEY_QUERY_TEXT space */ + querylen = strlen(debug_query_string); + shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1); + shm_toc_estimate_keys(&pcxt->estimator, 1); + /* Everyone's had a chance to ask for space, so now create the DSM */ InitializeParallelDSM(pcxt); @@ -1293,6 +1300,11 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request) shm_toc_insert(pcxt->toc, PARALLEL_KEY_TUPLESORT_SPOOL2, sharedsort2); } + /* Store query string for workers */ + sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1); + memcpy(sharedquery, debug_query_string, querylen + 1); + shm_toc_insert(pcxt->toc, PARALLEL_KEY_QUERY_TEXT, sharedquery); + /* Launch workers, saving status for leader/caller */ LaunchParallelWorkers(pcxt); btleader->pcxt = pcxt; @@ -1459,6 +1471,7 @@ _bt_leader
Re: RFC: Add 'taint' field to pg_control.
On 8 March 2018 at 04:58, Robert Haas wrote: > On Wed, Feb 28, 2018 at 8:03 PM, Craig Ringer > wrote: > > A huge +1 from me for the idea. I can't even count the number of black > box > > "WTF did you DO?!?" servers I've looked at, where bizarre behaviour has > > turned out to be down to the user doing something very silly and not > saying > > anything about it. > > +1 from me, too. > > My favourite remains an organisation that kept "fixing" an issue by kill -9'ing the postmaster and removing postmaster.pid to make it start up again. Without killing all the leftover backends. Of course, the system kept getting more unstable and broken, so they did it more and more often. They were working on scripting it when they gave up and asked for help. The data recovery effort on that one was truly exciting. I remember looking at bash history and having to take a short break to figure out how on earth to communicate what was going on. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: PATCH: Configurable file mode mask
On Wed, Mar 07, 2018 at 10:56:32AM -0500, David Steele wrote: > On 3/6/18 10:04 PM, Michael Paquier wrote: >> Seems like you forgot to re-add the chmod calls in initdb.c. > > Hmmm, I thought we were talking about moving the position of umask(). > > I don't think the chmod() calls are needed because umask() is being set. > The tests show that the config files have the proper permissions after > initdb, so this just looks like redundant code to me. Let's discuss that on a separate thread then, there could be something we are missing, but I agree that those should not be needed. Looking at the code history, those calls have been around since the beginning of PG-times. > Another way to do this would be to make the function generic and > stipulate that the postmaster must be shut down before running the > function. We could check postmaster.pid permissions as a separate > test. Yeah, that looks like a sensible approach as this could be run post-initdb or after taking a backup. This way we would avoid other similar behaviors in the future... Still postmaster.pid is an exception.

>> sub clean_rewind_test
>> {
>> +    ok (check_pg_data_perm($node_master->data_dir(), 0));
>> +
>>      $node_master->teardown_node if defined $node_master;

>> I have also a pending patch for pg_rewind which adds read-only files in >> the data folders of a new test, so this would cause this part to blow >> up. Testing that for all the test sets does not bring additional value >> as well, and doing it at cleanup phase is also confusing. So could you >> move that check into only one test's script? You could remove the umask >> call in 003_extrafiles.pl as well this way, and reduce the patch diffs. > > I think I would rather have a way to skip the permission test rather > than disable it for most of the tests. pg_rewind does more writes into > PGDATA than anything other than a backend. Good coverage seems like a > plus.
All the tests basically run the same scenario, with minimal variations, so let's do the test once in the test touching the highest number of files and call it good.

>> Patch 2 is getting close for a committer lookup I think, still need to >> look at patch 3 in detail.

And from patch 3:

>> # Expected permission
>> -my $expected_file_perm = 0600;
>> -my $expected_dir_perm = 0700;
>> +my $expected_file_perm = $allow_group ? 0640 : 0600;
>> +my $expected_dir_perm = $allow_group ? 0750 : 0700;
>> Or $expected_file_perm and $expected_dir_perm could just be used as
>> arguments.

> > This gets back to the check_pg_data_perm() discussion above. > > I'll hold off on another set of patches until I hear back from you. > There were only trivial changes as noted above.

OK, let's keep things simple here and assume that the grouping extension is not available yet. So no extra parameters should be needed. I have begun to read through patch 3.

-if (stat_buf.st_mode & (S_IRWXG | S_IRWXO))
+if (stat_buf.st_mode & PG_MODE_MASK_ALLOW_GROUP)
 ereport(FATAL,
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("data directory \"%s\" has group or world access",
+ errmsg("data directory \"%s\" has invalid permissions",
 DataDir),
- errdetail("Permissions should be u=rwx (0700).")));
+ errdetail("Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).")));

Hm. This relaxes the checks and concerns me a lot. What if users want to keep strict permissions all the time and rely on the existing check to be sure that this never gets changed post-initdb? For such users, we may want to track whether the cluster has been initialized with group access, in which case tracking that in the control file would be more adapted. Then the startup checks should use this configuration. Another idea would be a startup option. So, I cannot believe that all users would like to see such checks relaxed.
Some of my users would surely complain about such sanity checks being relaxed unconditionally, so making this configurable would be fine, and the current approach is not acceptable in my opinion. Patch 2 introduces some standards regarding file permissions as those are copied all over the code tree, which is definitely good in my opinion. I am rather reserved about patch 3 per what I am seeing now. Looking in-depth at the security-related requirements would take more time than I have now. -- Michael signature.asc Description: PGP signature
Re: Two-phase update of restart_lsn in LogicalConfirmReceivedLocation
On 8 March 2018 at 07:32, Tom Lane wrote: > Robert Haas writes: > > On Thu, Mar 1, 2018 at 2:03 AM, Craig Ringer > wrote: > >> So I can't say it's definitely impossible. It seems astonishingly > unlikely, > >> but that's not always good enough. > > > Race conditions tend to happen a lot more often than one might think. > > Just to back that up --- we've seen cases where people could repeatably > hit race-condition windows that are just an instruction or two wide. > The first one I came to in an idle archive search is > https://www.postgresql.org/message-id/15543.1130714273%40sss.pgh.pa.us > I vaguely recall others but don't feel like digging harder right now. > > That's astonishing. I guess if you repeat something enough times... The reason I'm less concerned about this one is that you have to crash in exactly the wrong place, *while* inside a badly timed window in a race. But the downside is that the result would be an unusable logical slot. The simplest solution is probably just to mark the slot dirty while we hold the spinlock, at the same time we advance its restart lsn. Any checkpoint will then run CheckPointReplicationSlots() and flush it. We don't remove/recycle xlog segments until after that's done in CheckPointGuts() so it's guaranteed that the slot's new state will be on disk and we can never have a stale restart_lsn pointing into truncated-away WAL. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
On March 7, 2018 5:40:18 PM PST, Peter Geoghegan wrote: >On Wed, Mar 7, 2018 at 5:16 PM, Tomas Vondra > wrote: >> FWIW that's usually written to the system log. Does dmesg say >something >> about the kill? > >While it would be nice to confirm that it was indeed the OOM killer, >either way the crash happened because SIGKILL was sent to a parallel >worker. There is no reason to suspect a bug. Not impossible there's a leak somewhere though. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
On Wed, Mar 7, 2018 at 5:16 PM, Tomas Vondra wrote: > FWIW that's usually written to the system log. Does dmesg say something > about the kill? While it would be nice to confirm that it was indeed the OOM killer, either way the crash happened because SIGKILL was sent to a parallel worker. There is no reason to suspect a bug. -- Peter Geoghegan
archive_command
Hi, I've been encouraged to submit this code, as there has been talk in the past about a simple pgcopy command to use with the archive_command. Currently there is really no good solution in most base systems without having to introduce a dedicated third-party Postgres solution. The best base-system solution, and the most commonly used today, is rsync, which doesn't fsync the file after completing the transfer, leaving nothing good to recommend. I'm not sure what would need to be done to introduce this into a given commitfest. Also, I would need to know whether the command interface is acceptable and/or other features should be added. It currently is very simple, as it just reads from standard input and saves the data to the indicated file. Please let me know how to proceed. Thanks, Rui

Example of local compressed archive:

archive_command='xz -c %p | fwrite /mnt/server/archivedir/%f'

Example of remote archive via a shell script:

#!/usr/bin/env bash
. . .
# SSH Command and options
SSH_CMD="ssh -o ServerAliveInterval=20 $ARCH_SERVER"
STS=3
OUTPUT=$(cat $XLOGFILE | $SSH_CMD "(mkdir -p $ARCH_DIR && fwrite $ARCH_DIR/$WALFILE) 2>&1")
echo ${PIPESTATUS[@]} | grep -qE '^[0 ]+$'
if [ $? == 0 ]; then
    STS=0
fi
exit $STS

fwrite code:

#include <sys/stat.h>
#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFSIZE 131072

int
main(int argc, char *argv[])
{
    int     fd, r, w;
    char   *buf;
    char   *name;

    if (argc != 2)
    {
        fprintf(stderr, "usage: fwrite [file]\n");
        exit(1);
    }
    if ((buf = malloc(BUFSIZE)) == NULL)
        err(1, "malloc");
    ++argv;
    if ((name = (char *) malloc(strlen(*argv) + 8)) == NULL)
        err(1, "malloc");
    strcat(strcpy(name, *argv), ".XXXXXX");
    if ((fd = mkstemp(name)) < 0)
        err(1, "mkstemp");
    while ((r = read(STDIN_FILENO, buf, BUFSIZE)) > 0)
        if ((w = write(fd, buf, r)) == -1)
        {
            unlink(name);
            err(1, "write");
        }
    if (r < 0)
        err(1, "read");
    if (lseek(fd, 0, SEEK_CUR) <= 0)
    {
        unlink(name);
        errx(1, "zero byte file!");
    }
    if (fsync(fd) != 0)
        err(1, "fsync");
    if (close(fd) != 0)
        err(1, "close");
    if (access(*argv, F_OK) < 0 && errno == ENOENT)
    {
        if (rename(name, *argv) != 0)
            err(1, "rename");
    }
    else
    {
        unlink(name);
        errx(1, "file exists already!");
    }
    free(name);
    exit(0);
}
Re: Add default role 'pg_access_server_files'
On Thu, Mar 08, 2018 at 10:15:11AM +0900, Michael Paquier wrote: > Other than that the patch looks in pretty good shape to me. The regression tests of file_fdw are blowing up because of an error string that patch 2 changes. -- Michael
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
On 03/07/2018 03:21 PM, Robert Haas wrote: > On Wed, Mar 7, 2018 at 8:59 AM, Prabhat Sahu wrote: > > 2018-03-07 19:24:44.263 IST [54400] LOG: background worker > "parallel worker" (PID 54482) was terminated by signal 9: Killed > > That looks like the background worker got killed by the OOM killer. How > much memory do you have in the machine where this occurred? FWIW that's usually written to the system log. Does dmesg say something about the kill? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Add default role 'pg_access_server_files'
On Wed, Mar 07, 2018 at 07:00:03AM -0500, Stephen Frost wrote: > * Michael Paquier (mich...@paquier.xyz) wrote: >> First, could it be possible to do a split for 1) and 2)? > > Done, because it was less work than arguing about it, but I'm not > convinced that we really need to split out patches to this level of > granularity. Perhaps something to consider for the future. One patch should defend one idea; this makes the git history easier to understand in my opinion. > Consider a case like postgresql.conf residing outside of the data > directory. For an application to be able to read that with > pg_read_file() is very straight-forward and applications already exist > which do. Forcing that application to go through COPY would require > creating a TEMP table and then coming up with a COPY command that > doesn't actually *do* what COPY is meant to do- that is, parse the file. > By default, you'd get errors from such a COPY command as it would think > there's extra columns defined in the "copy-format" or "csv" or > what-have-you file. Hm, OK... >> Could you update the documentation of pg_rewind please? It seems to me >> that with your patch then superuser rights are not necessarily mandatory, > This will require actual testing to be done before I'd make such a > change. I'll see if I can do that soon, but I also wouldn't complain if > someone else wanted to actually go through and set that up and test that > it works. That's what I did, manually, using the attached SQL script with your patch 1 applied. You can check the list of functions used by pg_rewind in libpq_fetch.c, where you just need to grant execution access to those functions for a login user and you are good to go:

pg_ls_dir(text, boolean, boolean)
pg_stat_file(text, boolean)
pg_read_binary_file(text)
pg_read_binary_file(text, bigint, bigint, boolean)

So getting this documented properly would be nice. Of course feel free to test this by yourself as you wish.
Could you send separate files for each patch by the way? This eases testing, and I don't recall that git am has a way to enforce only a subset of patches to be applied based on one file, though my git-fu is limited in this area.

+ read or which program is run. In principle regular users could be allowed to change the other options, but that's not supported at present.

Well, the parsing of file_fdw complains if "program" or "filename" is not defined, so a user has to be in either pg_read_server_files to use "filename" or in pg_execute_server_program to use "program", so I am not sure that the last sentence of this paragraph makes much sense from the beginning.

-Return the contents of a file.
+Return the contents of a file. Restricted to superusers by default, but other users can be granted EXECUTE to run the function.

You should make those paragraphs span multiple lines. EXECUTE should use markup, and I think that you should use "EXECUTE privilege to run the function" on those doc portions. That's all for the nits. Other than that the patch looks in pretty good shape to me. -- Michael rewind_grant.sql Description: application/sql
Re: INOUT parameters in procedures
On 3/6/18 04:22, Pavel Stehule wrote: > why just OUT variables are disallowed? > > The oracle initializes these values to NULL - we can do same? The problem is function call resolution. If we see a call like CALL foo(a, b, c); then this could be foo() with zero input and three output parameters, or with one input parameter and two output parameters, etc. We have no code to deal with that right now. > Minimally this message is not too friendly, there should be hint - "only > INOUT is supported" - but better support OUT too - from TOP OUT variables > should not be passed. from PL should be required. Added a hint. > I wrote recursive procedure. The call finished by exception. Why? Fixed. (memory context issue) I added your example as a test case. > This issue can be detected in compile time, maybe?

> postgres=# create or replace procedure p(x int, inout a int, inout b numeric)
> as $$
> begin
>   raise notice 'xxx % %', a, b;
>   if (x > 1) then
>     a := x / 10;
>     b := x / 2;
>     call p(b::int, a, 10); <--- can be detected in compile time?
>   end if;
> end;
> $$ language plpgsql;

Function resolution doesn't happen at compile time. That would require significant work in PL/pgSQL (possible perhaps, but major work). Right now, we do parse analysis at first execution. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From 3c5ed2faab30dfcde34dfd58877e45a7f6477237 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Wed, 7 Mar 2018 19:15:35 -0500 Subject: [PATCH v3] Support INOUT parameters in procedures In a top-level CALL, the values of INOUT parameters will be returned as a result row. In PL/pgSQL, the values are assigned back to the input parameters. In other languages, the same convention as for returning a record from a function is used. That does not require any code changes in the PL implementations. 
--- doc/src/sgml/plperl.sgml | 14 +++ doc/src/sgml/plpgsql.sgml | 16 +++ doc/src/sgml/plpython.sgml | 11 ++ doc/src/sgml/pltcl.sgml| 12 ++ doc/src/sgml/ref/create_procedure.sgml | 5 +- src/backend/catalog/pg_proc.c | 3 +- src/backend/commands/functioncmds.c| 51 +++-- src/backend/tcop/utility.c | 3 +- src/backend/utils/fmgr/funcapi.c | 11 +- src/include/commands/defrem.h | 3 +- src/include/funcapi.h | 3 +- src/pl/plperl/expected/plperl_call.out | 25 + src/pl/plperl/sql/plperl_call.sql | 22 src/pl/plpgsql/src/expected/plpgsql_call.out | 89 +++ .../plpgsql/src/expected/plpgsql_transaction.out | 2 +- src/pl/plpgsql/src/pl_comp.c | 10 +- src/pl/plpgsql/src/pl_exec.c | 125 - src/pl/plpgsql/src/pl_funcs.c | 25 + src/pl/plpgsql/src/pl_gram.y | 38 +-- src/pl/plpgsql/src/pl_scanner.c| 1 + src/pl/plpgsql/src/plpgsql.h | 12 ++ src/pl/plpgsql/src/sql/plpgsql_call.sql| 83 ++ src/pl/plpython/expected/plpython_call.out | 23 src/pl/plpython/plpy_exec.c| 24 ++-- src/pl/plpython/sql/plpython_call.sql | 20 src/pl/tcl/expected/pltcl_call.out | 26 + src/pl/tcl/sql/pltcl_call.sql | 23 src/test/regress/expected/create_procedure.out | 1 + 28 files changed, 632 insertions(+), 49 deletions(-) diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml index cff7a847de..9295c03db9 100644 --- a/doc/src/sgml/plperl.sgml +++ b/doc/src/sgml/plperl.sgml @@ -278,6 +278,20 @@ PL/Perl Functions and Arguments hash will be returned as null values. + + Similarly, output parameters of procedures can be returned as a hash + reference: + + +CREATE PROCEDURE perl_triple(INOUT a integer, INOUT b integer) AS $$ +my ($a, $b) = @_; +return {a => $a * 3, b => $b * 3}; +$$ LANGUAGE plperl; + +CALL perl_triple(5, 10); + + + PL/Perl functions can also return sets of either scalar or composite types. 
Usually you'll want to return rows one at a diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index c1e3c6a19d..6c25116538 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -1870,6 +1870,22 @@ Returning From a Procedure then NULL must be returned. Returning any other value will result in an error. + + + If a procedure has output parameters, then the output values can be + assigned to the parameters as if they were variables. For example: + +CREATE PROCEDURE triple(INOUT x int) +LANGUAGE plpgsql +AS $$ +BEGIN +x :=
Re: [HACKERS] Subscription code improvements
On Wed, Mar 7, 2018 at 11:13 PM, Alvaro Herrera wrote: > 0001: > > there are a bunch of other messages of the same ilk in the file. I > don't like how the current messages are worded; maybe Peter or Petr had > some reason why they're like that, but I would have split out the reason > for not starting or stopping into errdetail. Something like > > errmsg("logical replication apply worker for subscription \"%s\" will not > start", ...) > errdetail("Subscription has been disabled during startup.") > > But I think we should change all these messages in unison rather than > only one of them. > > Now, your patch changes "will not start" to "will stop". But is that > correct? ISTM that this happens when a worker is starting and > determines that it is not needed. So "will not start" is more correct. > "Will stop" would be correct if the worker had been running for some > time and suddenly decided to terminate, but that doesn't seem to be the > case, unless I misread the code. > > Your patch also changes "disabled during startup" to just "disabled". > Is that a useful change? ISTM that if the subscription had been > disabled prior to startup, then the worker would not have started at > all, so we would not be seeing this message in the first place. Again, > maybe I am misreading the code? Please explain. I think you're not misreading the code. I agree with you. "will not start" and "disabled during startup" would be more appropriate here. But Petr might have another opinion on this. Also, I noticed I had overlooked one change in the v1 patch proposed by Petr. Attached is a new patch incorporating your review comments and the hunk I overlooked. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center 0001-Improve-messaging-during-logical-replication-worker-_v3.patch Description: Binary data
Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)
On 03/06/2018 06:23 AM, Yura Sokolov wrote: > 05.03.2018 18:00, Tom Lane пишет: >> Tomas Vondra writes: >>> Snapshots are static (we don't really add new XIDs into existing ones, >>> right?), so why don't we simply sort the XIDs and then use bsearch to >>> lookup values? That should fix the linear search, without need for any >>> local hash table. >> >> +1 for looking into that, since it would avoid adding any complication >> to snapshot copying. In principle, with enough XIDs in the snap, an >> O(1) hash probe would be quicker than an O(log N) bsearch ... but I'm >> dubious that we are often in the range where that would matter. >> We do need to worry about the cost of snapshot copying, too. > > Sorting - is the first thing I've tried. Unfortunately, sorting > itself eats too much cpu. Filling hash table is much faster. > I've been interested in the sort-based approach, so I've spent a bit of time hacking on it (patch attached). It's much less invasive compared to the hash-table, but Yura is right the hashtable gives better results. I've tried to measure the benefits using the same script I shared on Tuesday, but I kept getting really strange numbers. In fact, I've been unable to even reproduce the results I shared back then. And after a bit of head-scratching I think the script is useless - it can't possibly generate snapshots with many XIDs because all the update threads sleep for exactly the same time. And first one to sleep is first one to wake up and commit, so most of the other XIDs are above xmax (and thus not included in the snapshot). I have no idea why I got the numbers :-/ But with this insight, I quickly modified the script to also consume XIDs by another thread (which simply calls txid_current). With that I'm getting snapshots with as many XIDs as there are UPDATE clients (this time I actually checked that using gdb). 
And for a 60-second run the tps results look like this (see the attached chart, showing the same data):

 writers   master    hash    sort
 --------------------------------
      16     1068    1057    1070
      32     1005     995    1033
      64     1064    1042    1117
     128     1058    1021    1051
     256      977    1004     928
     512      773     935     808
     768      576     815     670
    1000      413     752     573

The sort certainly does improve performance compared to master, but it's only about half of the hashtable improvement. I don't know how much this simple workload resembles the YCSB benchmark, but I seem to recall it's touching individual rows. In which case it's likely worse due to the pg_qsort being more expensive than building the hash table. So I agree with Yura the sort is not a viable alternative to the hash table, in this case. > Can I rely on snapshots being static? May be then there is no need > in separate raw representation and hash table. I may try to fill hash > table directly in GetSnapshotData instead of lazy filling. Though it > will increase time under lock, so it is debatable and should be > carefully measured. > Yes, I think you can rely on snapshots not being modified later. That's pretty much the definition of a snapshot. You may do that in GetSnapshotData, but you certainly can't do that in the for loop there. Not only do we not want to add more work there, but you don't know the number of XIDs in that loop. And we don't want to build hash tables for small numbers of XIDs. One reason against building the hash table in GetSnapshotData is that we'd build it even when the snapshot is never queried. Or when it is queried, but we only need to check xmin/xmax. 
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index afe1c03..b2dae85 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1758,6 +1758,8 @@ GetSnapshotData(Snapshot snapshot)
 	snapshot->active_count = 0;
 	snapshot->regd_count = 0;
 	snapshot->copied = false;
+	snapshot->xip_sorted = false;
+	snapshot->subxip_sorted = false;
 
 	if (old_snapshot_threshold < 0)
 	{
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index f7c4c91..f155267 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -1461,6 +1461,45 @@ HeapTupleIsSurelyDead(HeapTuple htup, TransactionId OldestXmin)
 	return TransactionIdPrecedes(HeapTupleHeaderGetRawXmax(tuple), OldestXmin);
 }
 
+static int
+cmp_transaction_id(const void *a, const void *b)
+{
+	return memcmp(a, b, sizeof(TransactionId));
+}
+
+/*
+ * XidInXip searches xid in xip array.
+ *
+ * When xcnt is smaller than SnapshotSortThreshold, then the array is unsorted
+ * and we can simply do a linear search. If there are more elements, the array
+ * is sorte
RE: Protect syscache from bloating with negative cache entries
From: Alvaro Herrera > The thing that comes to mind when reading this patch is that some time ago > we made fun of other database software, "they are so complicated to configure, > they have some magical settings that few people understand how to set". > Postgres was so much better because it was simple to set up, no magic crap. > But now it becomes apparent that that only was so because Postgres sucked, > ie., we hadn't yet gotten to the point where we > *needed* to introduce settings like that. Now we finally are? Yes. We are now facing the problem of too much memory use by PostgreSQL, where some applications randomly access about 200,000 tables. It is estimated, based on a small experiment, that each backend will use several to ten GBs of local memory for CacheMemoryContext. The total memory use will exceed 1 TB when the expected maximum number of connections are used. I haven't looked at this patch, but does it evict all kinds of entries in CacheMemoryContext, i.e., relcache, plancache, etc.? Regards Takayuki Tsunakawa
Re: Server won't start with fallback setting by initdb.
On Wed, Mar 07, 2018 at 06:39:32PM -0500, Tom Lane wrote: > OK, seems like I'm on the short end of that vote. I propose to push the > GUC-crosschecking patch I posted yesterday, but not the default-value > change, and instead push Kyotaro-san's initdb change. Should we back-patch > these things to v10 where the problem appeared? I would vote for a backpatch. If anybody happens to run initdb on v10 and gets max_connections set to 10, that would immediately cause a failure. We could also wait for somebody to actually complain about that, but a bit of prevention does not hurt to ease future user experience on this released version. -- Michael
Re: Server won't start with fallback setting by initdb.
Robert Haas writes: > On Tue, Mar 6, 2018 at 10:51 PM, Stephen Frost wrote: >> Changing the defaults to go back down strikes me as an entirely wrong >> approach after we've had a release with the higher defaults without >> seriously compelling arguments against, and I don't agree that we've had >> such a case made here. > +1. I don't see any real downside of increasing the minimum value of > max_connections to 20. I wasn't particularly a fan of raising > max_wal_senders to 10, but a lot of other people were, and so far > nobody's reported any problems related to that setting (that I know > about). OK, seems like I'm on the short end of that vote. I propose to push the GUC-crosschecking patch I posted yesterday, but not the default-value change, and instead push Kyotaro-san's initdb change. Should we back-patch these things to v10 where the problem appeared? regards, tom lane
Re: Two-phase update of restart_lsn in LogicalConfirmReceivedLocation
Robert Haas writes: > On Thu, Mar 1, 2018 at 2:03 AM, Craig Ringer wrote: >> So I can't say it's definitely impossible. It seems astonishingly unlikely, >> but that's not always good enough. > Race conditions tend to happen a lot more often than one might think. Just to back that up --- we've seen cases where people could repeatably hit race-condition windows that are just an instruction or two wide. The first one I came to in an idle archive search is https://www.postgresql.org/message-id/15543.1130714273%40sss.pgh.pa.us I vaguely recall others but don't feel like digging harder right now. regards, tom lane
Re: faster testing with symlink installs
Robert Haas writes: > On Wed, Feb 28, 2018 at 9:34 PM, Peter Eisentraut > wrote: >> Except ... this doesn't actually work. find_my_exec() resolves symlinks >> to find the actual program installation, and so for example the >> installed initdb will look for postgres in src/bin/initdb/. I would >> like to revert this behavior, because it seems to do more harm than >> good. The original commit message indicates that this would allow >> symlinking a program to somewhere outside of the installation tree and >> still allow it to work and find its friends. But that could just as >> well be done with a shell script. > My initial gut feeling is that changing this would hurt more people > than it would help. I agree. My recollection is that we expended substantial sweat to make that type of setup work, and I do not think it was for idle reasons. The fact that the behavior is very old doesn't mean it was a bad choice. (Also, the fact that the commit message didn't explain the reasoning in detail is not much of an argument; we didn't go in for that back then.) regards, tom lane
Re: Two-phase update of restart_lsn in LogicalConfirmReceivedLocation
On Thu, Mar 1, 2018 at 2:03 AM, Craig Ringer wrote: > So I can't say it's definitely impossible. It seems astonishingly unlikely, > but that's not always good enough. Race conditions tend to happen a lot more often than one might think. If there's a theoretical opportunity for this to go wrong, it would probably be a good idea to fix it. Not that I'm volunteering. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: unused includes in test_decoding
On Wed, Feb 28, 2018 at 6:59 PM, Euler Taveira wrote: > This is a cosmetic patch that removes some unused includes in > test_decoding. It seems to be like this since day 1 (9.4). Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: public schema default ACL
On Wed, Mar 7, 2018 at 2:48 PM, Peter Eisentraut wrote: > On 3/7/18 10:05, Stephen Frost wrote: > > I liken this to a well-known and well-trodden feature for auto creating > > user home directories on Unix. > > I don't think likening schemas to home directories is really addressing > the most typical use cases. Database contents are for the most part > carefully constructed in a collaborative way. Databases intended to be deployed to production (hopefully) are, but not necessarily those intended to evaluate PostgreSQL's capabilities. > The fix is probably to not let them do that. What is > being discussed here instead is to let them do whatever they want in > their own non-shared spaces. That addresses the security concern, but > it doesn't support the way people actually work right now. > Maybe not the majority of users, but the way DBAs work today is already inherently secure (i.e., not using public) and requires a non-trivial amount of DBA work (i.e., creating groups and users) to make happen. They are not the target audience. The target user profile for this discussion is one who does:

sudo apt install postgresql-10
sudo -u postgres createuser myosusername
psql myosusername postgres
> CREATE TABLE test_table (id serial primary key);
> insert into test_table default values;
> select * from test_table;

We want to avoid having the create table fail now whereas it worked before we removed create permissions on public from PUBLIC. Now, I'd argue that people aren't bothering to "createuser" in the above but simply skipping to "psql" and then to "sudo -u postgres psql" when they get the error that "user myosusername" doesn't exist... once they start creating new users I'd agree that they likely benefit more from us being conservative and "do only what I say" as opposed to being helpful and doing more stuff in the name of usability. 
I still feel like I want to mull this over more but auto-creating schemas strikes me as being "spooky action at a distance". David J.
Re: faster testing with symlink installs
On Wed, Feb 28, 2018 at 9:34 PM, Peter Eisentraut wrote: > Except ... this doesn't actually work. find_my_exec() resolves symlinks > to find the actual program installation, and so for example the > installed initdb will look for postgres in src/bin/initdb/. I would > like to revert this behavior, because it seems to do more harm than > good. The original commit message indicates that this would allow > symlinking a program to somewhere outside of the installation tree and > still allow it to work and find its friends. But that could just as > well be done with a shell script. > > Reverting this behavior would also allow Homebrew-like installations to > work better. The actual installation is in a versioned directory like > /usr/local/Cellar/postgresql/10.1/... but symlinked to > /usr/local/opt/postgresql/. But because of the symlink resolution, > calling /usr/local/opt/postgresql/bin/pg_config returns paths under > Cellar, which then causes breakage of software built against that path > on postgresql upgrades that change the version number. My initial gut feeling is that changing this would hurt more people than it would help. It seems entirely reasonable to install PostgreSQL in, say, /opt/local/postgresql, and then symlink psql and pg_ctl into /opt/local/bin or /usr/bin or wherever you like to find your binaries. I don't think we want to break that. It's true that it will work if you symlink everything, as in your Homebrew example, but you might not. TBH I find that Homebrew example pretty odd. I would understand installing each major release in a versioned directory, but putting every point release in a different versioned directory seems like a bad plan. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: public schema default ACL
On 3/7/18 10:05, Stephen Frost wrote: > I liken this to a well-known and well-trodden feature for auto creating > user home directories on Unix. I don't think likening schemas to home directories is really addressing the most typical use cases. Database contents are for the most part carefully constructed in a collaborative way. If your organization has three DBAs foo, bar, and baz, it's quite unlikely that they will want to create or look at objects in schemas named foo, bar, or baz. More likely, they will be interested in the schemas myapp or myotherapp. Or they don't care about schemas and will want the database to behave as if there wasn't a schema layer between the database and the tables. The existing structures are not bad. They work for a lot of users. The problem is just that by default everyone can do whatever they want in a shared space. The fix is probably to not let them do that. What is being discussed here instead is to let them do whatever they want in their own non-shared spaces. That addresses the security concern, but it doesn't support the way people actually work right now. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] SERIALIZABLE with parallel query
On Wed, Feb 28, 2018 at 11:35 PM, Thomas Munro wrote: > On Mon, Feb 26, 2018 at 6:37 PM, Thomas Munro > wrote: >> I've now broken it into two patches. > > Rebased. +SerializableXactHandle +ShareSerializableXact(void) +{ +Assert(!IsParallelWorker()); + +return MySerializableXact; +} Uh, how's that OK? There's no rule that you can't create a ParallelContext in a worker. Parallel query currently doesn't, so it probably won't happen, but burying an assertion to that effect in the predicate locking code doesn't seem nice. Is "sxact" really the best (i.e. clearest) name we can come up with for the lock tranche? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: csv format for psql
On Wed, Mar 07, 2018 at 09:37:26PM +0100, Pavel Stehule wrote: > 2018-03-07 21:31 GMT+01:00 Daniel Verite : > > > David Fetter wrote: > > > > > We have some inconsistency here in that fewer table formats are > > > supported, but I think asciidoc, etc., do this correctly via > > > invocations like: > > > > > >psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo' > > > > -A is equivalent to -P format=unaligned, so in the above > > invocation, it cancels the effect of -P format=asciidoc. > > Anyway -P format=name on the command line > > is the same as "\pset format name" as a > > metacommand, so it works for all formats. > > > > Some formats (unaligned, html) have corresponding > > command-line options (-A, -H), and others don't. > > In this patch, -C is used so that csv would be in the > > category of formats that can be switched on with the simpler > > invocation on the command line. > > If we don't like that, we can leave out -C for future use > > and let users write -P format=csv. > > That's not the best choice from my POV though, as csv > > is a primary choice to export tabular data. > > > > -C can be used for certificates or some similar. I like csv, but I am not > sure, so it is too important to get short option (the list of free chars > will be only shorter) +1 for not using up a single-letter option for this. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: RFC: Add 'taint' field to pg_control.
On Wed, Feb 28, 2018 at 8:03 PM, Craig Ringer wrote: > A huge +1 from me for the idea. I can't even count the number of black box > "WTF did you DO?!?" servers I've looked at, where bizarre behaviour has > turned out to be down to the user doing something very silly and not saying > anything about it. +1 from me, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Server won't start with fallback setting by initdb.
On Tue, Mar 6, 2018 at 10:51 PM, Stephen Frost wrote: > Changing the defaults to go back down strikes me as an entirely wrong > approach after we've had a release with the higher defaults without > seriously compelling arguments against, and I don't agree that we've had > such a case made here. If this discussion had happened before v10 was > released, I'd be much more open to going with the suggestion of '5', but > forcing users to update their configs for working environments because > we've decided that the default of 10 was too high is just pedantry, in > my opinion. +1. I don't see any real downside of increasing the minimum value of max_connections to 20. I wasn't particularly a fan of raising max_wal_senders to 10, but a lot of other people were, and so far nobody's reported any problems related to that setting (that I know about). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: csv format for psql
2018-03-07 21:31 GMT+01:00 Daniel Verite :

> David Fetter wrote:
>
> > We have some inconsistency here in that fewer table formats are
> > supported, but I think asciidoc, etc., do this correctly via
> > invocations like:
> >
> > psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
>
> -A is equivalent to -P format=unaligned, so in the above
> invocation, it cancels the effect of -P format=asciidoc.
> Anyway -P format=name on the command line
> is the same as "\pset format name" as a
> metacommand, so it works for all formats.
>
> Some formats (unaligned, html) have corresponding
> command-line options (-A, -H), and others don't.
> In this patch, -C is used so that csv would be in the
> category of formats that can be switched on with the simpler
> invocation on the command line.
> If we don't like that, we can leave out -C for future use
> and let users write -P format=csv.
> That's not the best choice from my POV though, as csv
> is a primary choice to export tabular data.

-C could be used for certificates or something similar. I like csv, but I am not sure it is important enough to get a short option (the list of free chars will only get shorter).

Regards

Pavel

> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
Re: FOR EACH ROW triggers on partitioned tables
On Thu, Mar 8, 2018 at 6:17 AM, Alvaro Herrera wrote: > Here's another version of this patch. It is virtually identical to the > previous one, except for a small doc update and whitespace changes. What is this test for? +create trigger failed after update on parted_trig + referencing old table as old_table + for each statement execute procedure trigger_nothing(); It doesn't fail as you apparently expected. Perhaps it was supposed to be "for each row" so you could hit your new error with errdetail("Triggers on partitioned tables cannot have transition tables.")? -- Thomas Munro http://www.enterprisedb.com
Re: csv format for psql
David Fetter wrote: > We have some inconsistency here in that fewer table formats are > supported, but I think asciidoc, etc., do this correctly via > invocations like: > >psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo' -A is equivalent to -P format=unaligned, so in the above invocation, it cancels the effect of -P format=asciidoc. Anyway -P format=name on the command line is the same as "\pset format name" as a metacommand, so it works for all formats. Some formats (unaligned, html) have corresponding command-line options (-A, -H), and others don't. In this patch, -C is used so that csv would be in the category of formats that can be switched on with the simpler invocation on the command line. If we don't like that, we can leave out -C for future use and let users write -P format=csv. That's not the best choice from my POV though, as csv is a primary choice to export tabular data. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Re: Protect syscache from bloating with negative cache entries
On Wed, Mar 7, 2018 at 6:01 AM, Alvaro Herrera wrote:
> The thing that comes to mind when reading this patch is that some time
> ago we made fun of other database software, "they are so complicated to
> configure, they have some magical settings that few people understand
> how to set". Postgres was so much better because it was simple to set
> up, no magic crap. But now it becomes apparent that that only was so
> because Postgres sucked, ie., we hadn't yet gotten to the point where we
> *needed* to introduce settings like that. Now we finally are?
>
> I have to admit being a little disappointed about that outcome.

I think your disappointment is a little excessive. I am not convinced either of the need for this to have any GUCs at all, but if it makes other people happy to have them, then I think it's worth accepting that as the price of getting the feature into the tree. These are scarcely the first GUCs we have that are hard to tune. work_mem is a terrible knob, there are probably very few people who know how to set ssl_ecdh_curve to anything other than the default, and what's geqo_selection_bias good for, anyway? I'm not sure what makes the settings we're adding here any different. Most people will ignore them, and the few people who really care can change the values.

> I wonder if this is just because we refuse to acknowledge the notion of
> a connection pooler. If we did, and the pooler told us "here, this
> session is being given back to us by the application, we'll keep it
> around until the next app comes along", could we clean the oldest
> inactive cache entries at that point? Currently they use DISCARD for
> that. Though this does nothing to fix hypothetical cache bloat for
> pg_dump in bug #14936.

We could certainly clean the oldest inactive cache entries at that point, but there's no guarantee that would be the right thing to do.
If the working set across all applications is small enough that you can keep them all in the caches all the time, then you should do that, for maximum performance. If not, DISCARD ALL should probably flush everything that the last application needed and the next application won't. But without some configuration knob, you have zero way of knowing how concerned the user is about saving memory in this place vs. improving performance by reducing catalog scans. Even with such a knob it's a little difficult to say which things actually ought to be thrown away. I think this is a related problem, but a different one. I also think we ought to have built-in connection pooling. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: planner failure with ProjectSet + aggregation + parallel query
On Mon, Mar 5, 2018 at 10:38 AM, Robert Haas wrote: > While trying to track down a bug today, I found a different bug. > > As of 6946280cded903b6f5269fcce105f8ab1d455d33: > > rhaas=# create table foo (a int); > CREATE TABLE > rhaas=# set min_parallel_table_scan_size = 0; > SET > rhaas=# set parallel_setup_cost = 0; > SET > rhaas=# set parallel_tuple_cost = 0; > SET > rhaas=# select generate_series(1, a) from foo group by a; > ERROR: ORDER/GROUP BY expression not found in targetlist > > Without the SET commands, or without the GROUP BY, or without the SRF, > it successfully constructs a plan. I am able to reproduce this on commit 69f4b9c85f168ae006929eec44fc44d569e846b9 (Move targetlist SRF handling from expression evaluation to new executor node) with the following modification for a GUC rename: create table foo (a int); --set min_parallel_table_scan_size = 0; set min_parallel_relation_size = 0; set parallel_setup_cost = 0; set parallel_tuple_cost = 0; select generate_series(1, a) from foo group by a; But on the previous commit I can't reproduce it. So it looks to me like that's when it got broken. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: csv format for psql
David Fetter wrote:

> This seems pretty specialized. If we're adding something new, how about
>
> psql --format=csv -o foo.csv -c 'TABLE foo'

It's a bit easier to memorize than -P format=csv, but psql doesn't have any long option that does not have a short single-letter form, and both -F and -f are already taken. -C, by contrast, isn't used so far.

> Or we could stick with:
>
> psql -P format=csv -o foo.csv -c 'TABLE foo'

That already works as of the current patch, just like with the other formats.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
Re: csv format for psql
I wrote: > a better idea would to have a new \pset fieldsep_csv PFA a v3 patch that implements that, along with regression tests this time. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index bfdf859..8a0e7a1 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -2557,6 +2557,19 @@ lo_import 152801 + fieldsep_csv + + + Specifies the field separator to be used in the csv format. + When the separator appears in a field value, that field + is output inside double quotes according to the csv rules. + To set a tab as field separator, type \pset fieldsep_csv + '\t'. The default is a comma. + + + + + fieldsep_zero @@ -2585,7 +2598,7 @@ lo_import 152801 Sets the output format to one of unaligned, - aligned, wrapped, + aligned, csv, wrapped, html, asciidoc, latex (uses tabular), latex-longtable, or @@ -2597,14 +2610,22 @@ lo_import 152801 unaligned format writes all columns of a row on one line, separated by the currently active field separator. This is useful for creating output that might be intended to be read - in by other programs (for example, tab-separated or comma-separated - format). + in by other programs. aligned format is the standard, human-readable, nicely formatted text output; this is the default. + csv format writes columns separated by + commas, applying the quoting rules described in RFC-4180. + Alternative separators can be selected with \pset fieldsep_csv. + The output is compatible with the CSV format of the COPY + command. The header with column names is output unless the + tuples_only parameter is on. + Title and footers are not printed. + + wrapped format is like aligned but wraps wide data values across lines to make the output fit in the target column width. 
The target width is determined as described under diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 3560318..c543b1f 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1960,8 +1960,8 @@ exec_command_pset(PsqlScanState scan_state, bool active_branch) int i; static const char *const my_list[] = { - "border", "columns", "expanded", "fieldsep", "fieldsep_zero", - "footer", "format", "linestyle", "null", + "border", "columns", "expanded", "fieldsep", "fieldsep_csv", + "fieldsep_zero", "footer", "format", "linestyle", "null", "numericlocale", "pager", "pager_min_lines", "recordsep", "recordsep_zero", "tableattr", "title", "tuples_only", @@ -3603,6 +3603,9 @@ _align2string(enum printFormat in) case PRINT_TROFF_MS: return "troff-ms"; break; + case PRINT_CSV: + return "csv"; + break; } return "unknown"; } @@ -3674,9 +3677,11 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet) popt->topt.format = PRINT_LATEX_LONGTABLE; else if (pg_strncasecmp("troff-ms", value, vallen) == 0) popt->topt.format = PRINT_TROFF_MS; + else if (pg_strncasecmp("csv", value, vallen) == 0) + popt->topt.format = PRINT_CSV; else { - psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, latex, latex-longtable, troff-ms\n"); + psql_error("\\pset: allowed formats are unaligned, aligned, csv, wrapped, html, asciidoc, latex, latex-longtable, troff-ms\n"); return false; } } @@ -3804,6 +3809,15 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet) } } + else if (strcmp(param, "fieldsep_csv") == 0) + { + if (value) + { + free(popt->topt.fieldSepCsv); + popt->topt.fieldSepCsv = pg_strdup(value); + } + } + else if (strcmp(param, "fieldsep_zero") == 0) { free(popt->topt.fieldSep.separator); @@ -3959,6 +3973,13 @@ printP
Re: csv format for psql
2018-03-07 20:25 GMT+01:00 David Fetter : > On Wed, Mar 07, 2018 at 08:04:05PM +0100, Fabien COELHO wrote: > > > > >> psql --csv -c 'TABLE foo' > foo.csv > > >> > > >>With a -c to introduce the command. > > > > > >This seems pretty specialized. If we're adding something new, how about > > > > > > psql --format=csv -o foo.csv -c 'TABLE foo' > > > > > >Or we could stick with: > > > > > > psql -P format=csv -o foo.csv -c 'TABLE foo' > > > > Currently "-P format=csv" uses the unaligned formating separators, i.e. > '|' > > is used. I was suggesting that a special long option could switch several > > variables to some specific values, i.e. > > > > --csv > > > > Would be equivalent to something like: > > > > -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ... > > We have some inconsistency here in that fewer table formats are > supported, but I think asciidoc, etc., do this correctly via > invocations like: > > psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo' > > > I.e. really generate some csv from the data in just one option, not many. > > > > But this is obviously debatable. > > I suspect we'll get requests for an all-JSON option, HTML tables, > etc., assuming we don't have them already. > > I'm hoping we can have that all in one framework. I get that setting > each of tuples_only, fieldsep, recordsep, etc. might be a bit of a > lift for some users, but it's not clear how we'd make a sane default > that made choices among those correct for enough users. > > For example, do we know that we want tuples_only behavior by default? > A lot of people's CSV tools assume a header row. > I am for default header - it can be disabled by -t option Pavel > > Best, > David. > -- > David Fetter http://fetter.org/ > Phone: +1 415 235 3778 > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate >
Re: csv format for psql
On Wed, Mar 07, 2018 at 08:04:05PM +0100, Fabien COELHO wrote: > > >> psql --csv -c 'TABLE foo' > foo.csv > >> > >>With a -c to introduce the command. > > > >This seems pretty specialized. If we're adding something new, how about > > > > psql --format=csv -o foo.csv -c 'TABLE foo' > > > >Or we could stick with: > > > > psql -P format=csv -o foo.csv -c 'TABLE foo' > > Currently "-P format=csv" uses the unaligned formating separators, i.e. '|' > is used. I was suggesting that a special long option could switch several > variables to some specific values, i.e. > > --csv > > Would be equivalent to something like: > > -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ... We have some inconsistency here in that fewer table formats are supported, but I think asciidoc, etc., do this correctly via invocations like: psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo' > I.e. really generate some csv from the data in just one option, not many. > > But this is obviously debatable. I suspect we'll get requests for an all-JSON option, HTML tables, etc., assuming we don't have them already. I'm hoping we can have that all in one framework. I get that setting each of tuples_only, fieldsep, recordsep, etc. might be a bit of a lift for some users, but it's not clear how we'd make a sane default that made choices among those correct for enough users. For example, do we know that we want tuples_only behavior by default? A lot of people's CSV tools assume a header row. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Fix missing spaces in docs
Hi all, The attached patch just fixes missing spaces in the documentation of CREATE SERVER and CREATE USER MAPPING. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog: http://fabriziomello.github.io >> Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello >> Github: http://github.com/fabriziomello
diff --git a/doc/src/sgml/ref/create_server.sgml b/doc/src/sgml/ref/create_server.sgml
index eb4ca89..af0a7a0 100644
--- a/doc/src/sgml/ref/create_server.sgml
+++ b/doc/src/sgml/ref/create_server.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
-CREATE SERVER [IF NOT EXISTS] server_name [ TYPE 'server_type' ] [ VERSION 'server_version' ]
+CREATE SERVER [ IF NOT EXISTS ] server_name [ TYPE 'server_type' ] [ VERSION 'server_version' ]
 FOREIGN DATA WRAPPER fdw_name [ OPTIONS ( option 'value' [, ... ] ) ]
diff --git a/doc/src/sgml/ref/create_user_mapping.sgml b/doc/src/sgml/ref/create_user_mapping.sgml
index c2f5278..9719a4f 100644
--- a/doc/src/sgml/ref/create_user_mapping.sgml
+++ b/doc/src/sgml/ref/create_user_mapping.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
-CREATE USER MAPPING [IF NOT EXISTS] FOR { user_name | USER | CURRENT_USER | PUBLIC }
+CREATE USER MAPPING [ IF NOT EXISTS ] FOR { user_name | USER | CURRENT_USER | PUBLIC }
 SERVER server_name [ OPTIONS ( option 'value' [ , ... ] ) ]
Re: csv format for psql
>> psql --csv -c 'TABLE foo' > foo.csv
>>
>> With a -c to introduce the command.
>
> This seems pretty specialized. If we're adding something new, how about
>
> psql --format=csv -o foo.csv -c 'TABLE foo'
>
> Or we could stick with:
>
> psql -P format=csv -o foo.csv -c 'TABLE foo'

Currently "-P format=csv" uses the unaligned formatting separators, i.e. '|' is used. I was suggesting that a special long option could switch several variables to specific values at once, i.e.

--csv

would be equivalent to something like:

-P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...

I.e. really generate some csv from the data with just one option, not many. But this is obviously debatable.

-- 
Fabien.
Re: parallel append vs. simple UNION ALL
On Wed, Mar 7, 2018 at 5:36 AM, Rajkumar Raghuwanshi wrote: > With 0001 applied on PG-head, I got reference leak warning and later a > server crash. > this crash is reproducible with enable_parallel_append=off also. > below is the test case to reproduce this. New patches attached, fixing all 3 of the issues you reported: 0001 is a new patch to fix the incorrect parallel safety marks on upper relations. I don't know of a visible effect of this patch by itself, but there might be one. 0002 is the same as the old 0001, but I made a fix in SS_charge_for_initplans() which fixed your most recent crash report. Either this or the previous change also fixed the crash you saw when using tab-completion. Also, I added some test cases based on your failing examples. 0003-0005 are the same as the old 0002-0004. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company 0005-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patch Description: Binary data 0004-Generate-a-separate-upper-relation-for-each-stage-of.patch Description: Binary data 0003-Rewrite-recurse_union_children-to-iterate-rather-tha.patch Description: Binary data 0002-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patch Description: Binary data 0001-Correctly-assess-parallel-safety-of-tlists-when-SRFs.patch Description: Binary data
Re: pgbench's expression parsing & negative numbers
Hello Andres,

> working on overflow correctness in pg I noticed that pgbench isn't
> quite there.

Indeed.

> I assume it won't matter terribly often, but the way it parses integers
> makes it incorrect for, at least, the negativemost number. [...] but
> that unfortunately means that the sign is not included in the call to
> strtoint64.

The strtoint64 function is indeed a mess. It does not really report errors (more precisely, an error message is printed out, but execution goes on nevertheless...).

> Which in turn means you can't correctly parse PG_INT64_MIN, because
> that's not representable as a positive number on a two's complement
> machine (i.e. everywhere). Preliminary testing shows that that can
> relatively easily be fixed by just prefixing [-+]? to the relevant
> exprscan.l rules.

I'm not sure I get it, because then "1 -2" would be scanned as "int signed_int", which requires adding some fine-grained rules to the parser to handle that as an addition, and I would be unclear about the priority handling, e.g. "1 -2 * 3". But I agree that it can be made to work by transferring the conversion to the parser and maybe some careful tweaking there.

> But it might also not be a bad idea to simply defer parsing of the
> number until exprparse.y has its hand on the number? There's plenty of
> other things wrong with overflow handling too, especially evalFunc()
> basically just doesn't care.

Indeed. Some overflow issues are easy to fix with the existing pg_*_s64_overflow macros that you provided. It should also handle double2int64 overflows. I have tried to trigger errors with a -fsanitize=signed-integer-overflow compilation, without much success, but ISTM that you suggested in another thread that pgbench does not work with that. Could you be more precise?

> I'll come up with a patch for that sometime soon.

ISTM that you have not sent any patch on the subject, otherwise I would have reviewed it. Maybe I could do one some time later, unless you think that I should not.

-- 
Fabien.
Re: csv format for psql
On Wed, Mar 07, 2018 at 07:40:49PM +0100, Fabien COELHO wrote: > > Hello Pavel, > > >>psql --csv 'TABLE Stuff;' > stuff.csv > > > >There is commad -c and it should be used. The --csv options should not to > >have a parameter. I don't like a idea to have more options for query > >execution. > > Yes, I agree and that is indeed what I meant, sorry for the typo. The > cleaner example would be something like: > > psql --csv -c 'TABLE foo' > foo.csv > > With a -c to introduce the command. This seems pretty specialized. If we're adding something new, how about psql --format=csv -o foo.csv -c 'TABLE foo' Or we could stick with: psql -P format=csv -o foo.csv -c 'TABLE foo' Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: csv format for psql
2018-03-07 19:40 GMT+01:00 Fabien COELHO :

> Hello Pavel,
>
>>> psql --csv 'TABLE Stuff;' > stuff.csv
>>
>> There is command -c and it should be used. The --csv option should not
>> have a parameter. I don't like the idea of having more options for query
>> execution.
>
> Yes, I agree and that is indeed what I meant, sorry for the typo. The
> cleaner example would be something like:
>
> psql --csv -c 'TABLE foo' > foo.csv
>
> With a -c to introduce the command.

ok :) it makes sense now

Regards

Pavel
Re: csv format for psql
Hello Pavel,

>> psql --csv 'TABLE Stuff;' > stuff.csv
>
> There is command -c and it should be used. The --csv option should not
> have a parameter. I don't like the idea of having more options for query
> execution.

Yes, I agree and that is indeed what I meant, sorry for the typo. The cleaner example would be something like:

psql --csv -c 'TABLE foo' > foo.csv

With a -c to introduce the command.

-- 
Fabien.
Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT
On Mar 3, 2018 19:32, "Peter Eisentraut" < peter.eisentr...@2ndquadrant.com> wrote:

> On 2/20/18 10:10, Matheus de Oliveira wrote:
> > Besides that, there is another change in this patch on the current ALTER
> > CONSTRAINT regarding deferrability options. Previously, if the user did
> > ALTER CONSTRAINT without specifying an option on deferrable or
> > initdeferred, the default options were implied, so this:
> >
> > ALTER TABLE tbl
> > ALTER CONSTRAINT con_name;
> >
> > was equivalent to:
> >
> > ALTER TABLE tbl
> > ALTER CONSTRAINT con_name NOT DEFERRABLE INITIALLY IMMEDIATE;
>
> Oh, that seems wrong. Probably, it shouldn't even accept that syntax
> with an empty options list, let alone reset options that are not
> mentioned.

Yeah, it felt really weird when I noticed it. And I only noticed it while reading the source.

> Can you prepare a separate patch for this issue?

I can do that, no problem. It'll take a while though; I'm on a trip and will be home around March 20th. You think this should be applied to all versions that support ALTER CONSTRAINT, right?

Thanks.

Best regards,
Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT
On Mar 2, 2018 08:15, "Andres Freund" wrote:

> Hi,
>
> On 2018-02-20 12:10:22 -0300, Matheus de Oliveira wrote:
> > I attached a patch to add support for changing ON UPDATE/DELETE actions
> > of a constraint using ALTER TABLE ... ALTER CONSTRAINT.
>
> This patch has been submitted to the last commitfest for v11 and is not
> a trivial patch. As we don't accept such patches this late, it should be
> moved to the next fest. Any arguments against?

Sorry. My bad. I'm OK with sending this to the next one.

Best regards,
Re: Protect syscache from bloating with negative cache entries
On 2018-03-07 14:48:48 -0300, Alvaro Herrera wrote:
> Oh, I wasn't suggesting to throw away the whole cache at that point;
> only that that is a convenient point to do whatever cleanup we want to do.

But why is that better than doing so continuously?

> What I'm not clear about is exactly what is the cleanup that we want to
> do at that point. You say it should be based on some configured size;
> Robert says any predefined size breaks [performance for] the case where
> the workload uses size+1, so let's use time instead (evict anything not
> used in more than X seconds?), but keeping in mind that a workload that
> requires X+1 would also break.

We mostly seem to have found that adding a *minimum* size before starting to evict based on time solves both of our concerns?

> So it seems we've arrived at the
> conclusion that the only possible solution is to let the user tell us
> what time/size to use. But that sucks, because the user doesn't know
> either (maybe they can measure, but how?), and they don't even know that
> this setting is there to be tweaked; and if there is a performance
> problem, how do they figure whether or not it can be fixed by fooling
> with this parameter? I mean, maybe it's set to 10 and we suggest "maybe
> 11 works better" but it turns out not to, so "maybe 12 works better"?
> How do you know when to stop increasing it?

I don't think it's that complicated, for the size figure. Having a knob that controls how much memory a backend uses isn't a new concept, and can definitely depend on the use case.

> This seems a bit like max_fsm_pages, that is to say, a disaster that was
> only fixed by removing it.

I don't think that's a meaningful comparison. max_fsm_pages had a persistent effect, couldn't be tuned without restarts, and the performance dropoffs were much more cliff-like.

Greetings, Andres Freund
Re: Protect syscache from bloating with negative cache entries
Hello,

Andres Freund wrote:
> On 2018-03-07 08:01:38 -0300, Alvaro Herrera wrote:
> > I wonder if this is just because we refuse to acknowledge the notion of
> > a connection pooler. If we did, and the pooler told us "here, this
> > session is being given back to us by the application, we'll keep it
> > around until the next app comes along", could we clean the oldest
> > inactive cache entries at that point? Currently they use DISCARD for
> > that. Though this does nothing to fix hypothetical cache bloat for
> > pg_dump in bug #14936.
>
> I'm not seeing how this solves anything? You don't want to throw all
> caches away, therefore you need a target size. Then there's also the
> case of the cache being too large in a single "session".

Oh, I wasn't suggesting to throw away the whole cache at that point; only that that is a convenient point to do whatever cleanup we want to do. What I'm not clear about is exactly what cleanup we want to do at that point. You say it should be based on some configured size; Robert says any predefined size breaks [performance for] the case where the workload uses size+1, so let's use time instead (evict anything not used in more than X seconds?), but keeping in mind that a workload that requires X+1 would also break. So it seems we've arrived at the conclusion that the only possible solution is to let the user tell us what time/size to use. But that sucks, because the user doesn't know either (maybe they can measure, but how?), and they don't even know that this setting is there to be tweaked; and if there is a performance problem, how do they figure out whether or not it can be fixed by fooling with this parameter? I mean, maybe it's set to 10 and we suggest "maybe 11 works better" but it turns out not to, so "maybe 12 works better"? How do you know when to stop increasing it? This seems a bit like max_fsm_pages, that is to say, a disaster that was only fixed by removing it.
Thanks, -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
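The time-based eviction idea batted around above (evict anything not used in more than X seconds, with all the workload-fit caveats Álvaro raises) can be sketched abstractly. The toy model below is purely illustrative and has nothing to do with the actual syscache code; the class and method names are invented:

```python
import time

class TimedCache:
    """Toy cache that evicts entries idle longer than max_idle seconds."""

    def __init__(self, max_idle):
        self.max_idle = max_idle
        self.entries = {}  # key -> (value, last_access_time)

    def get(self, key):
        hit = self.entries.get(key)
        if hit is None:
            return None
        value, _ = hit
        # Refresh the access time on every hit.
        self.entries[key] = (value, time.monotonic())
        return value

    def put(self, key, value):
        self.entries[key] = (value, time.monotonic())

    def sweep(self):
        """Drop entries not touched within max_idle seconds; return count."""
        now = time.monotonic()
        stale = [k for k, (_, t) in self.entries.items()
                 if now - t > self.max_idle]
        for k in stale:
            del self.entries[k]
        return len(stale)
```

The tuning problem discussed in the thread shows up directly here: a workload that re-touches an entry every `max_idle + epsilon` seconds pays a miss on every access, while `max_idle - epsilon` never evicts anything.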
Re: Protect syscache from bloating with negative cache entries
Hi, On 2018-03-07 08:01:38 -0300, Alvaro Herrera wrote: > I wonder if this is just because we refuse to acknowledge the notion of > a connection pooler. If we did, and the pooler told us "here, this > session is being given back to us by the application, we'll keep it > around until the next app comes along", could we clean the oldest > inactive cache entries at that point? Currently they use DISCARD for > that. Though this does nothing to fix hypothetical cache bloat for > pg_dump in bug #14936. I'm not seeing how this solves anything? You don't want to throw all caches away, therefore you need a target size. Then there's also the case of the cache being too large in a single "session". Greetings, Andres Freund
Re: GSoC 2017: weekly progress reports (week 6)
Hi, On 2018-03-07 11:58:51 -0300, Alvaro Herrera wrote: > > This appears to be a duplicate of > > https://commitfest.postgresql.org/17/1466/ - as the other one is older, I'm > > closing this one. > > This comment makes no sense from the POV of the mail archives. I had to > look at the User-Agent in your email to realize that you wrote it in the > commitfest app. Yea, I stopped doing so afterwards... > 1. impersonating the "From:" header is a bad idea; needs fixed much as >we did with the bugs and doc comments submission forms > 2. it should have had a header indicating it comes from CF app > 3. it would be great to include in said header a link to the CF entry >to which the comment was attached. Sounds reasonable. Greetings, Andres Freund
Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)
I suggest creating a new function GinPredicateLockPage() that checks whether fast update is enabled for the index. The current arrangement looks too repetitive, and it seems easy to make a mistake. Stylistically, please keep #include lines ordered alphabetically, and cut long lines to below 80 chars. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
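The refactoring being suggested — centralize the repeated "check fast update, then take the lock" pattern in a single GinPredicateLockPage() helper — would look roughly like this. The sketch is in Python rather than the backend's C, and the callback names are invented for illustration only:

```python
def gin_predicate_lock_page(fast_update_enabled, predicate_lock_page,
                            index, blkno):
    """Sketch of the proposed helper: take the page-level predicate lock
    only when fast update is disabled for the index, so each call site no
    longer repeats the check (and can no longer forget it)."""
    if not fast_update_enabled(index):
        predicate_lock_page(index, blkno)
```

The point of the wrapper is purely defensive: the condition lives in one place instead of being copy-pasted at every lock site.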
Re: FOR EACH ROW triggers on partitioned tables
Here's another version of this patch. It is virtually identical to the previous one, except for a small doc update and whitespace changes. To recap: when a row-level trigger is created on a partitioned table, it is marked tginherits; partitions all have their pg_class row modified with relhastriggers=true. No clone of the pg_trigger row is created for the partitions. Instead, when the relcache entry for the partition is created, pg_trigger is scanned to look for entries for its ancestors. So the trigger list for a partition is created by repeatedly scanning pg_trigger and pg_inherits, until only entries with relhastriggers=f are found. I reserve the right to revise this further, as I'm going to spend a couple of hours looking at it this afternoon, particularly to see how concurrent DDL behaves, but I don't see anything obviously wrong with it. Robert Haas wrote: > Elsewhere, we've put a lot of blood, sweat, and tears into making sure > that we only traverse the inheritance hierarchy from top to bottom. > Otherwise, we're adding deadlock hazards. I think it's categorically > unacceptable to do traversals in the opposite order -- if you do, then > an UPDATE on a child could deadlock with a LOCK TABLE on the parent. > That will not win us any awards. We don't actually open relations or acquire locks in the traversal I was talking about, though; the only thing we do is scan pg_trigger using first the partition relid, then seek the ancestor(s) by scanning pg_inherits and recurse. We don't acquire locks on the involved relations, so there should be no danger of deadlocks. Changes in the definitions ought to be handled by the cache invalidations that are sent, although I admit to not having tested this specifically. I'll do that later today. 
-- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index a0e6d7062b..4887878eec 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1873,7 +1873,9 @@ SCRAM-SHA-256$:&l True if table has (or once had) triggers; see - pg_trigger catalog + pg_trigger catalog. + If this is a partition, triggers on its partitioned ancestors are also + considered @@ -6988,6 +6990,13 @@ SCRAM-SHA-256$ :&l + tginherits + bool + + True if trigger applies to children relations too + + + tgnargs int2 diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml index 3d6b9f033c..901264c6d2 100644 --- a/doc/src/sgml/ref/create_trigger.sgml +++ b/doc/src/sgml/ref/create_trigger.sgml @@ -504,7 +504,9 @@ UPDATE OF column_name1 [, column_name2REFERENCING clause, then before and after images of rows are visible from all affected partitions or child tables. diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y index ed7a55596f..7ad0126df5 100644 --- a/src/backend/bootstrap/bootparse.y +++ b/src/backend/bootstrap/bootparse.y @@ -230,6 +230,7 @@ Boot_CreateStmt: RELPERSISTENCE_PERMANENT, shared_relation, mapped_relation, + false, true); elog(DEBUG4, "bootstrap relation created"); } @@ -252,6 +253,7 @@ Boot_CreateStmt: mapped_relation, true, 0, + false, ONCOMMIT_NOOP, (Datum) 0, false, diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index cf36ce4add..815f371ac2 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -257,6 +257,7 @@ heap_create(const c
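The ancestor walk described in the message above (scan pg_trigger for the partition, then follow pg_inherits upward while relhastriggers is still true) can be modeled like this. The dict-based stand-ins for the catalogs are invented for illustration and do not reflect the actual relcache code:

```python
def collect_triggers(relid, parent_of, triggers, has_triggers):
    """Gather the trigger list for a partition by walking child -> parent.

    parent_of:    relid -> parent relid, or None at the root (pg_inherits)
    triggers:     relid -> list of trigger names defined on it (pg_trigger)
    has_triggers: relids whose pg_class row says relhastriggers = true
    """
    found = []
    current = relid
    # Stop as soon as a relation with relhastriggers = false is reached.
    while current is not None and current in has_triggers:
        found.extend(triggers.get(current, []))
        current = parent_of.get(current)
    return found
```

Note that the walk touches only catalog stand-ins, mirroring the claim in the thread that no relation locks are taken during the traversal.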
Re: public schema default ACL
Greetings Petr, * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > On 07/03/18 17:55, Stephen Frost wrote: > > Greetings Petr, all, > > > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > >> On 07/03/18 13:14, Stephen Frost wrote: > >>> * Noah Misch (n...@leadboat.com) wrote: > On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote: > > * Tom Lane (t...@sss.pgh.pa.us) wrote: > >> I wonder whether it'd be sensible for CREATE USER --- or at least the > >> createuser script --- to automatically make a matching schema. Or we > >> could just recommend that DBAs do so. Either way, we'd be pushing > >> people > >> towards the design where "$user" does exist for most/all users. Our > >> docs > >> comment (section 5.8.7) that "the concepts of schema and user are > >> nearly > >> equivalent in a database system that implements only the basic schema > >> support specified in the standard", so the idea of automatically making > >> a schema per user doesn't seem ridiculous on its face. (Now, where'd I > >> put my flameproof long johns ...) > > > > You are not the first to think of this in recent days, and I'm hopeful > > to see others comment in support of this idea. For my 2c, I'd suggest > > that what we actually do is have a new role attribute which is "when > > this user connects to a database, if they don't have a schema named > > after their role, then create one." Creating the role at CREATE ROLE > > time would only work for the current database, after all (barring some > > other magic that allows us to create schemas in all current and future > > databases...). > > I like the idea of getting more SQL-compatible, if this presents a > distinct > opportunity to do so. I do think it would be too weird to create the > schema > in one database only. Creating it on demand might work. What would be > the > procedure, if any, for database owners who want to deny object creation > in > their databases? 
> >>> > >>> My suggestion was that this would be a role attribute. If an > >>> administrator doesn't wish for that role to have a schema created > >>> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever > >>> we name it) role attribute to false. > >>> > >> Yeah I think role attribute makes sense, it's why I suggested something > >> like DEFAULT_SCHEMA, that seems to address both schema creation (dba can > >> point the schema to public for example) and also the fact that $user > >> schema which is first in search_path might or might not exist. > > > > What I dislike about this proposal is that it seems to conflate two > > things- if the schema will be created for the user automatically or not, > > and what the search_path setting is. > > Well, what $user in search_path resolves to rather than what search_path is. Alright, that makes a bit more sense to me. I had thought you were suggesting we would just combine these two pieces to make up the "real" search path, which I didn't like. Having it replace what $user is in the search_path would be a bit confusing, I think. Perhaps having a new '$default' would be alright, though I'm still having a bit of trouble imagining the use-case, and it seems like we'd probably still keep the "will this schema be created automatically or not" separate from this new search path variable. > > Those are two different things and > > I don't think we should mix them. > > I guess I am missing the point of the schema creation for user then if > it's not also automatically the default schema for that user. With our default search path being $user, public, it would be... Thanks! Stephen
Re: public schema default ACL
On 07/03/18 17:55, Stephen Frost wrote: > Greetings Petr, all, > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> On 07/03/18 13:14, Stephen Frost wrote: >>> * Noah Misch (n...@leadboat.com) wrote: On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote: > * Tom Lane (t...@sss.pgh.pa.us) wrote: >> I wonder whether it'd be sensible for CREATE USER --- or at least the >> createuser script --- to automatically make a matching schema. Or we >> could just recommend that DBAs do so. Either way, we'd be pushing people >> towards the design where "$user" does exist for most/all users. Our docs >> comment (section 5.8.7) that "the concepts of schema and user are nearly >> equivalent in a database system that implements only the basic schema >> support specified in the standard", so the idea of automatically making >> a schema per user doesn't seem ridiculous on its face. (Now, where'd I >> put my flameproof long johns ...) > > You are not the first to think of this in recent days, and I'm hopeful > to see others comment in support of this idea. For my 2c, I'd suggest > that what we actually do is have a new role attribute which is "when > this user connects to a database, if they don't have a schema named > after their role, then create one." Creating the role at CREATE ROLE > time would only work for the current database, after all (barring some > other magic that allows us to create schemas in all current and future > databases...). I like the idea of getting more SQL-compatible, if this presents a distinct opportunity to do so. I do think it would be too weird to create the schema in one database only. Creating it on demand might work. What would be the procedure, if any, for database owners who want to deny object creation in their databases? >>> >>> My suggestion was that this would be a role attribute. 
If an >>> administrator doesn't wish for that role to have a schema created >>> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever >>> we name it) role attribute to false. >>> >> Yeah I think role attribute makes sense, it's why I suggested something >> like DEFAULT_SCHEMA, that seems to address both schema creation (dba can >> point the schema to public for example) and also the fact that $user >> schema which is first in search_path might or might not exist. > > What I dislike about this proposal is that it seems to conflate two > things- if the schema will be created for the user automatically or not, > and what the search_path setting is. Well, what $user in search_path resolves to rather than what search_path is. > Those are two different things and > I don't think we should mix them. I guess I am missing the point of the schema creation for user then if it's not also automatically the default schema for that user. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: public schema default ACL
Greetings Petr, all, * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > On 07/03/18 13:14, Stephen Frost wrote: > > * Noah Misch (n...@leadboat.com) wrote: > >> On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote: > >>> * Tom Lane (t...@sss.pgh.pa.us) wrote: > I wonder whether it'd be sensible for CREATE USER --- or at least the > createuser script --- to automatically make a matching schema. Or we > could just recommend that DBAs do so. Either way, we'd be pushing people > towards the design where "$user" does exist for most/all users. Our docs > comment (section 5.8.7) that "the concepts of schema and user are nearly > equivalent in a database system that implements only the basic schema > support specified in the standard", so the idea of automatically making > a schema per user doesn't seem ridiculous on its face. (Now, where'd I > put my flameproof long johns ...) > >>> > >>> You are not the first to think of this in recent days, and I'm hopeful > >>> to see others comment in support of this idea. For my 2c, I'd suggest > >>> that what we actually do is have a new role attribute which is "when > >>> this user connects to a database, if they don't have a schema named > >>> after their role, then create one." Creating the role at CREATE ROLE > >>> time would only work for the current database, after all (barring some > >>> other magic that allows us to create schemas in all current and future > >>> databases...). > >> > >> I like the idea of getting more SQL-compatible, if this presents a distinct > >> opportunity to do so. I do think it would be too weird to create the > >> schema > >> in one database only. Creating it on demand might work. What would be the > >> procedure, if any, for database owners who want to deny object creation in > >> their databases? > > > > My suggestion was that this would be a role attribute. 
If an > > administrator doesn't wish for that role to have a schema created > > on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever > > we name it) role attribute to false. > > > Yeah I think role attribute makes sense, it's why I suggested something > like DEFAULT_SCHEMA, that seems to address both schema creation (dba can > point the schema to public for example) and also the fact that $user > schema which is first in search_path might or might not exist. What I dislike about this proposal is that it seems to conflate two things- if the schema will be created for the user automatically or not, and what the search_path setting is. Those are two different things and I don't think we should mix them. > Question would be what happens if schema is then explicitly dropped (in > either case). I'm not sure that I see an issue with that- if it's dropped then it gets recreated when that user logs back in. The systems I'm aware of, as best as I can recall, didn't have any particular check or explicit additional behavior for such a case. Thanks! Stephen
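The proposed behavior — a role attribute ('SCHEMA_CREATE' or whatever it ends up being named) that makes the server create a schema named after the role at login if one doesn't exist, including recreating it after a DROP — might reduce to something like the sketch below. None of this is an existing PostgreSQL feature, and every name here is a placeholder:

```python
def maybe_create_login_schema(role, role_has_schema_create,
                              existing_schemas, create_schema):
    """At connection time, create a schema named after the role if the
    role's schema_create attribute is set and no such schema exists yet.

    role_has_schema_create: role -> bool (the proposed role attribute)
    existing_schemas:       set of schema names in the current database
    create_schema:          callback standing in for
                            CREATE SCHEMA AUTHORIZATION role
    Returns True if a schema was created on this login.
    """
    if not role_has_schema_create.get(role, False):
        return False
    if role in existing_schemas:
        return False
    create_schema(role)
    existing_schemas.add(role)
    return True
```

Because the check runs on every login, an explicitly dropped schema simply reappears the next time the user connects, matching Stephen's description of the systems he recalls.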
Re: public schema default ACL
Greetings Petr, all, * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > On 07/03/18 16:26, Stephen Frost wrote: > > Greeting Petr, all, > > > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > >> On 07/03/18 13:18, Stephen Frost wrote: > >>> Greetings, > >>> > >>> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > Certain "market leader" database behaves this way as well. I just hope > we won't go as far as them and also create users for schemas (so that > the analogy of user=schema would be complete and working both ways). > Because that's one of the main reasons their users depend on packages so > much, there is no other way to create a namespace without having to deal > with another user which needs to be secured. > >>> > >>> I agree that we do *not* want to force role creation on schema creation. > >>> > One thing we could do to limit impact of any of this is having > DEFAULT_SCHEMA option for roles which would then be the first one in the > search_path (it could default to the role name), that way making public > schema work again for everybody would be just about tweaking the roles a > bit which can be easily scripted. > >>> > >>> I don't entirely get what you're suggesting here considering we already > >>> have $user, and it is the first in the search_path..? > >>> > >> > >> What I am suggesting is that we add option to set user's default schema > >> to something other than user name so that if people don't want the > >> schema with the name of the user auto-created, it won't be. > > > > We have ALTER USER joe SET search_path already though..? And ALTER > > DATABASE, and in postgresql.conf? What are we missing? > > That will not change the fact that we have created schema joe for that > user though. While something like CREATE USER joe DEFAULT_SCHEMA foobar > would. > > My point is that I don't mind if we create schemas for users by default, > but I want simple way to opt out. Oh, yes, we would definitely need an opt-out mechanism. 
It's unclear to me what adding a 'default schema' role option would do though that's different from setting the search_path for a user. I certainly wouldn't expect it to create a new schema > > opportunity to do so. I do think it would be too weird to create the > > schema > > in one database only. Creating it on demand might work. What would be > > the > > procedure, if any, for database owners who want to deny object creation > > in > > their databases? > > Well, REVOKE CREATE ON DATABASE already exists. > >>> > >>> That really isn't the same.. In this approach, regular roles are *not* > >>> given the CREATE right on the database, the system would just create the > >>> schema for them on login automatically if the role attribute says to do > >>> so. > >> > >> What's the point of creating schema for them if they don't have CREATE > >> privilege? > > > > They would own the schema and therefore have CREATE and USAGE rights on > > the schema itself. Creating objects checks for schema rights, it > > doesn't check for database rights- that's only if you're creating > > schemas. > > > > Yes, but should the schema for them be created at all if they don't have > CREATE privilege on the database? If yes then I have same question as > Noah, how does dba prevent object creation in their databases? Yes, the schema would be created regardless of the rights of the user on the database, because the admin set the flag on the role saying 'create a schema for this user when they log in.' If we think there is a use-case for saying "this user should only have schemas in these databases, not all databases" then I could see having the role attribute be a list of databases or "all", instead. In the end, I do think this is something which is controlled at the role level and not something an individual database owner could override or prevent, though perhaps there is some room for discussion there. 
What I don't want is for this feature to *depend* on the users having CREATE rights on the database, as that would allow them to create other schemas (perhaps even one which is named the same as a likely new user whose account hasn't been created yet or they haven't logged in yet...). Thanks! Stephen
Re: Change RangeVarGetRelidExtended() to take flags argument?
On 3/5/18, 7:08 PM, "Andres Freund" wrote: > On 2018-03-05 19:57:44 -0500, Tom Lane wrote: >> Andres Freund writes: >>> One wrinkle in that plan is that it'd not be trivial to discern whether >>> a lock couldn't be acquired or whether the object vanished. I don't >>> really have good idea how to tackle that yet. >> Do we really care which case applies? > > I think there might be cases where we do. As expand_vacuum_rel() > wouldn't use missing_ok = true, it'd not be an issue for this specific > callsite though. I think it might be enough to simply note the ambiguity of returning InvalidOid when skip-locked and missing-ok are both specified. Even if RangeVarGetRelidExtended() did return whether skip-locked or missing-ok applied, such a caller would likely not be able to trust it anyway, as no lock would be held. Nathan
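The ambiguity Nathan describes can be shown with a toy model: when both flags are passed, the caller gets the same InvalidOid back whether the relation vanished or the lock couldn't be taken. The flag names mirror the thread, but the function below is an invented stand-in, not the real RangeVarGetRelidExtended():

```python
RVR_MISSING_OK = 1 << 0
RVR_SKIP_LOCKED = 1 << 1
INVALID_OID = 0

def lookup_relid(name, catalog, lockable, flags):
    """Toy lookup: return the relation's OID, or INVALID_OID if the
    relation is missing (RVR_MISSING_OK) or its lock couldn't be taken
    (RVR_SKIP_LOCKED). With both flags set, the caller cannot tell
    those two cases apart from the return value alone."""
    oid = catalog.get(name)
    if oid is None:
        if flags & RVR_MISSING_OK:
            return INVALID_OID
        raise LookupError(name)
    if not lockable(oid):
        if flags & RVR_SKIP_LOCKED:
            return INVALID_OID
        raise TimeoutError(name)
    return oid
```

As Nathan notes, even if the function reported which case applied, a caller holding no lock couldn't trust the answer anyway, so documenting the ambiguity may be enough.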
Re: public schema default ACL
On 07/03/18 13:14, Stephen Frost wrote: > Greetings, > > * Noah Misch (n...@leadboat.com) wrote: >> On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote: >>> * Tom Lane (t...@sss.pgh.pa.us) wrote: I wonder whether it'd be sensible for CREATE USER --- or at least the createuser script --- to automatically make a matching schema. Or we could just recommend that DBAs do so. Either way, we'd be pushing people towards the design where "$user" does exist for most/all users. Our docs comment (section 5.8.7) that "the concepts of schema and user are nearly equivalent in a database system that implements only the basic schema support specified in the standard", so the idea of automatically making a schema per user doesn't seem ridiculous on its face. (Now, where'd I put my flameproof long johns ...) >>> >>> You are not the first to think of this in recent days, and I'm hopeful >>> to see others comment in support of this idea. For my 2c, I'd suggest >>> that what we actually do is have a new role attribute which is "when >>> this user connects to a database, if they don't have a schema named >>> after their role, then create one." Creating the role at CREATE ROLE >>> time would only work for the current database, after all (barring some >>> other magic that allows us to create schemas in all current and future >>> databases...). >> >> I like the idea of getting more SQL-compatible, if this presents a distinct >> opportunity to do so. I do think it would be too weird to create the schema >> in one database only. Creating it on demand might work. What would be the >> procedure, if any, for database owners who want to deny object creation in >> their databases? > > My suggestion was that this would be a role attribute. If an > administrator doesn't wish for that role to have a schema created > on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever > we name it) role attribute to false. 
Yeah I think role attribute makes sense, it's why I suggested something like DEFAULT_SCHEMA, that seems to address both schema creation (dba can point the schema to public for example) and also the fact that $user schema which is first in search_path might or might not exist. Question would be what happens if schema is then explicitly dropped (in either case). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: public schema default ACL
On 07/03/18 16:26, Stephen Frost wrote: > Greeting Petr, all, > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> On 07/03/18 13:18, Stephen Frost wrote: >>> Greetings, >>> >>> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: Certain "market leader" database behaves this way as well. I just hope we won't go as far as them and also create users for schemas (so that the analogy of user=schema would be complete and working both ways). Because that's one of the main reasons their users depend on packages so much, there is no other way to create a namespace without having to deal with another user which needs to be secured. >>> >>> I agree that we do *not* want to force role creation on schema creation. >>> One thing we could do to limit impact of any of this is having DEFAULT_SCHEMA option for roles which would then be the first one in the search_path (it could default to the role name), that way making public schema work again for everybody would be just about tweaking the roles a bit which can be easily scripted. >>> >>> I don't entirely get what you're suggesting here considering we already >>> have $user, and it is the first in the search_path..? >>> >> >> What I am suggesting is that we add option to set user's default schema >> to something other than user name so that if people don't want the >> schema with the name of the user auto-created, it won't be. > > We have ALTER USER joe SET search_path already though..? And ALTER > DATABASE, and in postgresql.conf? What are we missing? That will not change the fact that we have created schema joe for that user though. While something like CREATE USER joe DEFAULT_SCHEMA foobar would. My point is that I don't mind if we create schemas for users by default, but I want simple way to opt out. > > opportunity to do so. I do think it would be too weird to create the > schema > in one database only. Creating it on demand might work. 
What would be > the > procedure, if any, for database owners who want to deny object creation in > their databases? Well, REVOKE CREATE ON DATABASE already exists. >>> >>> That really isn't the same.. In this approach, regular roles are *not* >>> given the CREATE right on the database, the system would just create the >>> schema for them on login automatically if the role attribute says to do >>> so. >> >> What's the point of creating schema for them if they don't have CREATE >> privilege? > > They would own the schema and therefore have CREATE and USAGE rights on > the schema itself. Creating objects checks for schema rights, it > doesn't check for database rights- that's only if you're creating > schemas. > Yes, but should the schema for them be created at all if they don't have CREATE privilege on the database? If yes, then I have the same question as Noah: how does a dba prevent object creation in their databases? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [GSOC 18] Discussion on the datatable
Hi Hongyuan, On Tue, Mar 06, 2018 at 01:36:23AM +0800, Hongyuan Ma wrote: > Hi Mark, > In the past few days I have read some code in the pgperffarm.git repository. I > look forward to discussing the project in detail with you and gradually > defining the datasheet structure and refining the requirements. Here are some > of my ideas; if there are any errors or deficiencies, please feel free to > correct me. > > > To create a datasheet: pg_perf_cate > Overview: > The pg_perf_cate table is used to store performance test project categories and > supports multi-level categories. > > > Description: > In line 12 of the "pgperffarm\client\benchmarks\runner.py" file there > is a line like this: > > > '' > 'manages runs of all the benchmarks, including cluster restarts etc.' > '' > > > In my imagination, there may be more items of performance tests than build > tests. Based on the above comment, I guess, for example, maybe there are > three major categories: "single instance performance tests", "cluster performance tests", and "other > performance tests". Each major category also contains > its own test sub-categories, such as addition tests and deletion tests and > so on. In the pg_perf_cate table, the cate_pid field indicates the parent > category of the current test category. If the pid field of a row of data has > a value of 0, the row represents a top-level category. > > > Related Functions: > - Maybe in the navigation bar we can create a category menu to help users > quickly find the test items that interest them (similar to the Amazon Mall > product categories). The cate_order field is used to manage the order of the > categories in the current level for easy front-end menu display. > - The admin back-end needs a page which can add or modify categories.
> - > To create a datasheet: pg_perf_test > Overview: > The pg_perf_test table is used to store specific test items, including the > test item number (test_id), the name of the test item (test_name), the ID of > the sub-category (cate_id), the description of the test item (test_desc, to be > discussed), and the person ID (user_id). > > > Description: > In line 15 of the "pgperffarm\client\benchmarks\pgbench.py" file, I see > a note like this: > '' > # TODO allow running custom scripts, not just the default > '' > Now that we want to allow users to run custom test scripts and upload them, I > think it is necessary to tell others how to run the test again. So I want to > add a test_desc field that will store the details about this test and tell > the user how to run it. But I am not too sure about the storage format > for the details of the test; perhaps rich text or markdown > is a suitable choice. > When a test item is added by the administrator, the user_id field has a > value of 0. Otherwise, this field corresponds to the user_id field in the > user table. For this field, I prefer not to use foreign keys. > > > Related Functions: > - At the front end, each test has its own detail page, on which the test-related > content is presented and the test results are listed. > - The admin back-end needs a page to manage test items. > - > To create a datasheet: pg_perf_test_result > > > Overview: > The pg_perf_test_result table is used to store test results, including at > least the result ID (result_id), user ID (user_id, I prefer not to create a > user-test result association table), test item ID (test_id), test branch > number (branch_id), system configuration (os_config), pg > configuration (pg_config), test result details (result_desc), and test > time (add_time), among other fields. > Confusion: > I think that compared to the other tables, the pg_perf_test_result table may be a > relatively complex one.
> This piece of code can be seen around line 110 of the "pgperffarm\client\benchmarks\runner.py" file:
>
>     r['meta'] = {
>         'benchmark': config['benchmark'],
>         'date': strftime("%Y-%m-%d %H:%M:%S.00+00", gmtime()),
>         'name': config_name,
>         'uname': uname,
>     }
>
>     with open('%s/results.json' % self._output, 'w') as f:
>         f.write(json.dumps(r, indent=4))
>
> Could you please provide a sample results.json file so that I can better understand what information is contained in the uploaded data and design the datasheet based on it.

Don't let this distract you too much from finishing your current term. There will be plenty of time to hammer out the schema. Here's a brief description of the data that is summarized in a json object: The idea is that the json do
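For reference, working backwards from the runner.py fragment quoted above, the uploaded results.json would presumably contain at least a "meta" object along these lines. The field values here are invented examples, not output from an actual farm run:

```python
import json
from time import gmtime, strftime

# Reconstruct the structure runner.py writes; every value below is a
# made-up example standing in for config['benchmark'], config_name, etc.
r = {
    "meta": {
        "benchmark": "pgbench",                                 # config['benchmark']
        "date": strftime("%Y-%m-%d %H:%M:%S.00+00", gmtime()),  # upload timestamp
        "name": "default-config",                               # config_name
        "uname": "Linux build-host 4.15.0 x86_64",              # platform info
    }
}

payload = json.dumps(r, indent=4)
```

A real results.json would also carry the per-benchmark result data alongside "meta"; only the metadata portion is visible in the quoted code.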
Re: PATCH: Configurable file mode mask
On 3/6/18 10:04 PM, Michael Paquier wrote: > On Tue, Mar 06, 2018 at 01:32:49PM -0500, David Steele wrote: >> On 3/5/18 10:46 PM, Michael Paquier wrote: > >>> Those two are separate issues. Could you begin a new thread on the >>> matter? This will attract more attention. >> >> OK, I'll move it back for now, and make a note to raise the position as >> a separate issue. I'd rather not do it in this fest, though. > > Seems like you forgot to re-add the chmod calls in initdb.c. Hmmm, I thought we were talking about moving the position of umask(). I don't think the chmod() calls are needed because umask() is being set. The tests show that the config files have the proper permissions after initdb, so this just looks like redundant code to me. But, I'm not going to argue if you think they should go back. It would make the patch less noisy. >>> - if (chmod(location, S_IRWXU) != 0) >>> + current_umask = umask(0); >>> + umask(current_umask); >>> [...] >>> - if (chmod(pg_data, S_IRWXU) != 0) >>> + if (chmod(pg_data, PG_DIR_MODE_DEFAULT & ~file_mode_mask) != 0) >>> Such modifications should be part of patch 3, not 2, where the group >>> features really apply. >> >> The goal of patch 2 is to refactor the way permissions are set by >> replacing locally hard-coded values with centralized values, so I think >> this makes sense here. > > Hm. I would have left that out in this first version, now I am fine to > defer that to a committer's opinion as well. OK - I really do think it belongs here. A committer may not agree, of course. >>> my $test_mode = shift; >>> >>> + umask(0077); >> >> Added. (Ensure that all files are created with default data dir >> permissions). > > + # Ensure that all files are created with default data dir permissions > + umask(0077); > I have stared at this comment two minutes to finally understand that > this refers to the extra files which are part of this test. It may be > better to be a bit more precise here.
I thought first that this > referred as well to setup_cluster calls... Updated to, "Ensure that all files created in this module for testing are set with default data dir permissions." > +# Ensure all permissions in the pg_data directory are > correct. Permissions > +# should be dir = 0700, file = 0600. > +sub check_pg_data_perm > +{ > check_permission_recursive() would be a more adapted name. Stuff in > TestLib.pm is not necessarily linked to data folders. When we add group permissions we need to have special logic around postmaster.pid. This should be 0600 even if the rest of the files are 0640. I can certainly remove that special logic in 02 and make this function more generic, but then I think we would still have to add check_pg_data_perm() in patch 03. Another way to do this would be to make the function generic and stipulate that the postmaster must be shut down before running the function. We could check postmaster.pid permissions as a separate test. What do you think? > sub clean_rewind_test > { > + ok (check_pg_data_perm($node_master->data_dir(), 0)); > + > $node_master->teardown_node if defined $node_master; > I have also a pending patch for pg_rewind which adds read-only files in > the data folders of a new test, so this would cause this part to blow > up. Testing that for all the test sets does not bring additional value > as well, and doing it at cleanup phase is also confusing. So could you > move that check into only one test's script? You could remove the umask > call in 003_extrafiles.pl as well this way, and reduce the patch diffs. I think I would rather have a way to skip the permission test rather than disable it for most of the tests. pg_rewind does more writes into PGDATA than anything other than a backend. Good coverage seems like a plus. 
> + if ($file_mode != 0600) > + { > + print(*STDERR, "$File::Find::name mode must be 0600\n"); > + > + $result = 0; > + return > + } > 0600 should be replaced by $expected_file_perm, and isn't a ';' missing > for each return "clause" (how does this even work..)? Well, 0600 is a special case -- see above. As for the missing semi-colon, well that's Perl for you. Fixed. > Patch 2 is getting close for a committer lookup I think, still need to > look at patch 3 in details.. And from patch 3: > # Expected permission > - my $expected_file_perm = 0600; > - my $expected_dir_perm = 0700; > + my $expected_file_perm = $allow_group ? 0640 : 0600; > + my $expected_dir_perm = $allow_group ? 0750 : 0700; > Or $expected_file_perm and $expected_dir_perm could just be used as > arguments. This gets back to the check_pg_data_perm() discussion above. I'll hold off on another set of patches until I hear back from you. There were only trivial changes as noted above. Thanks, -- -David da...@pgmasters.net signature.asc Description: OpenPGP digital signature
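The recursive check being discussed (the patch's check_pg_data_perm() is Perl) boils down to: walk the data directory, expect 0700/0600 (or 0750/0640 with group access), and treat postmaster.pid as a special case that must stay 0600. A sketch of that logic, in Python rather than the patch's Perl, with the function name invented for illustration:

```python
import os
import stat

def check_data_dir_perms(pgdata, allow_group=False):
    """Recursively verify permissions the way the discussed
    check_pg_data_perm() does: directories 0700 (0750 with group
    access), files 0600 (0640), and postmaster.pid always 0600.
    Returns the list of offending paths; empty means all good."""
    dir_mode = 0o750 if allow_group else 0o700
    file_mode = 0o640 if allow_group else 0o600
    bad = []
    for root, dirs, files in os.walk(pgdata):
        for d in dirs:
            path = os.path.join(root, d)
            if stat.S_IMODE(os.stat(path).st_mode) != dir_mode:
                bad.append(path)
        for f in files:
            path = os.path.join(root, f)
            # postmaster.pid is special: 0600 even when group access is on
            expected = 0o600 if f == 'postmaster.pid' else file_mode
            if stat.S_IMODE(os.stat(path).st_mode) != expected:
                bad.append(path)
    return bad
```

Making the special case explicit in one place is what makes the "generic function plus separate postmaster.pid test" alternative attractive: dropping the `== 'postmaster.pid'` branch yields the fully generic checker.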
Re: [PATCH][PROPOSAL] Add enum relation option type
Nikolay Shaplov wrote: > Actually it was me who lost it. Yeah, I realized today when I saw your reply to Nikita. I didn't realize it was him submitting a new version of the patch. > The code with the Oxford comma would be a > bit more complicated. We should put such a comma when we have 3+ items and > not put it when we have 2. > > Is it worth it? > > As I've read, use of the Oxford comma is not mandatory and serves to avoid > ambiguity. > "XXX, YYY and ZZZ" can be read as "XXX, YYY, ZZZ" or as "XXX, (YYY and ZZZ)". > The Oxford comma is used to make sure that YYY and ZZZ are separate items of the > list, not an expression inside one item. > > But here we hardly have such ambiguity. Gracious goodness -- the stuff these Brits come up with! > So I'll ask again, do you really think it's worth it? I'm not qualified to answer this question. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead
In a message dated March 2, 2018 11:27:49, Andres Freund wrote: > > Since I get a really big patch as a result, it was decided to commit it in > > parts. > > I get that, but I strongly suggest not creating 10 loosely related > threads, but keeping it as a patch series in one thread. It's really > hard to follow for people not continually paying attention otherwise. Oops. Sorry, I thought I should do it the other way round and create a new thread for each new patch, and I tried to post links to the other threads for connectivity wherever I could. I will do it as you say from now on. But I am still confused about what I should do if I am committing two patches from the patch series simultaneously... -- Do code for fun. signature.asc Description: This is a digitally signed message part.
Re: Limit global default function execution privileges
Greetings, * David G. Johnston (david.g.johns...@gmail.com) wrote: > Since we are discussing locking down our defaults, is revoking the global > function execution privilege granted to PUBLIC - instead limiting it to > just the pg_catalog schema - on the table? > > I'm not sure how strongly I feel toward the proposal but it does come up on > these lists; and the fact that it doesn't distinguish between security > definer and security invoker is a trap for the unaware. I wouldn't limit it to the pg_catalog schema, I'd just explicitly mark the functions in pg_catalog which should have EXECUTE rights available to PUBLIC. I'm afraid this would cause a lot of work for people who use a lot of pl/pgsql, but it might be a good thing in the end. Environments could configure ALTER DEFAULT PRIVILEGES to automatically install the GRANT back if they wanted it, and pg_dump would just pull through whatever the privileges actually were on old systems into the new systems. This definitely comes up regularly when introducing new people to PostgreSQL. Thanks! Stephen
Re: BUG #14941: Vacuum crashes
On 3/6/18, 11:04 PM, "Michael Paquier" wrote: > + if (!(options & VACOPT_SKIP_LOCKED)) > + relid = RangeVarGetRelid(vrel->relation, AccessShareLock, > false); > + else > + { > + relid = RangeVarGetRelid(vrel->relation, NoLock, false); > Yeah, I agree with Andres that getting all this logic done in > RangeVarGetRelidExtended would be cleaner. Let's see where the other > thread leads us to: > https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de I had missed that thread. Thanks for pointing it out. > + /* > +* We must downcase the statement type for log message > consistency > +* between expand_vacuum_rel(), vacuum_rel(), and analyze_rel(). > +*/ > + stmttype_lower = asc_tolower(stmttype, strlen(stmttype)); > This blows up on multi-byte characters and may report incorrect relation > name if double quotes are used, no? At the moment, stmttype is either "VACUUM" or "ANALYZE", but I suppose there still might be multi-byte risk in translations. > + /* > +* Since autovacuum workers supply OIDs when calling vacuum(), no > +* autovacuum worker should reach this code, and we can make > +* assumptions about the logging levels we should use in the > checks > +* below. > +*/ > + Assert(!IsAutoVacuumWorkerProcess()); > This is a good idea, should be a separate patch as other folks tend to > be confused with relid handling in expand_vacuum_rel(). Will do. > + Specifies that VACUUM should not wait for any > + conflicting locks to be released: if a relation cannot be locked > + immediately without waiting, the relation is skipped > Should this mention as well that no errors are raised, allowing the > process to move on with the next relations? IMO that is already covered by saying the relation is skipped, although I'm not against explicitly stating it, too. Perhaps we could outline the logging behavior as well. Nathan
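The semantics under discussion, try to lock each relation and silently move on (with a log entry) when the lock is unavailable, can be modeled outside the server with any try-lock primitive. A toy sketch in Python (this is not PostgreSQL code; all names are invented to illustrate the proposed VACOPT_SKIP_LOCKED behavior):

```python
import threading

def vacuum_relations(relations, locks, skip_locked=False):
    """Process each relation after taking its lock. With skip_locked
    set, a relation whose lock cannot be acquired immediately is
    skipped instead of blocking, mirroring the proposed VACUUM option;
    a real server would emit a log message for each skipped relation."""
    processed, skipped = [], []
    for rel in relations:
        lock = locks[rel]
        if skip_locked:
            if not lock.acquire(blocking=False):
                skipped.append(rel)   # skipped, not an error
                continue
        else:
            lock.acquire()            # plain VACUUM: wait for the lock
        try:
            processed.append(rel)     # stand-in for the actual vacuum work
        finally:
            lock.release()
    return processed, skipped
```

The point raised in the review maps directly onto the `skipped.append` branch: that is where the logging level question (and the documentation wording about "no error is raised") lives.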
Limit global default function execution privileges
Since we are discussing locking down our defaults, is revoking the global function execution privilege granted to PUBLIC - instead limiting it to just the pg_catalog schema - on the table? I'm not sure how strongly I feel toward the proposal but it does come up on these lists; and the fact that it doesn't distinguish between security definer and security invoker is a trap for the unaware. David J.
Re: public schema default ACL
Greetings Petr, all, * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > On 07/03/18 13:18, Stephen Frost wrote: > > Greetings, > > > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > >> Certain "market leader" database behaves this way as well. I just hope > >> we won't go as far as them and also create users for schemas (so that > >> the analogy of user=schema would be complete and working both ways). > >> Because that's one of the main reasons their users depend on packages so > >> much, there is no other way to create a namespace without having to deal > >> with another user which needs to be secured. > > > > I agree that we do *not* want to force role creation on schema creation. > > > >> One thing we could do to limit impact of any of this is having > >> DEFAULT_SCHEMA option for roles which would then be the first one in the > >> search_path (it could default to the role name), that way making public > >> schema work again for everybody would be just about tweaking the roles a > >> bit which can be easily scripted. > > > > I don't entirely get what you're suggesting here considering we already > > have $user, and it is the first in the search_path..? > > > > What I am suggesting is that we add option to set user's default schema > to something other than user name so that if people don't want the > schema with the name of the user auto-created, it won't be. We have ALTER USER joe SET search_path already though..? And ALTER DATABASE, and in postgresql.conf? What are we missing? > >>> opportunity to do so. I do think it would be too weird to create the > >>> schema > >>> in one database only. Creating it on demand might work. What would be > >>> the > >>> procedure, if any, for database owners who want to deny object creation in > >>> their databases? > >> > >> Well, REVOKE CREATE ON DATABASE already exists. > > > > That really isn't the same.. 
In this approach, regular roles are *not* > > given the CREATE right on the database, the system would just create the > > schema for them on login automatically if the role attribute says to do > > so. > > What's the point of creating schema for them if they don't have CREATE > privilege? They would own the schema and therefore have CREATE and USAGE rights on the schema itself. Creating objects checks for schema rights, it doesn't check for database rights- that's only if you're creating schemas. Thanks! Stephen
Re: public schema default ACL
On 07/03/18 13:18, Stephen Frost wrote: > Greetings, > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> Certain "market leader" database behaves this way as well. I just hope >> we won't go as far as them and also create users for schemas (so that >> the analogy of user=schema would be complete and working both ways). >> Because that's one of the main reasons their users depend on packages so >> much, there is no other way to create a namespace without having to deal >> with another user which needs to be secured. > > I agree that we do *not* want to force role creation on schema creation. > >> One thing we could do to limit impact of any of this is having >> DEFAULT_SCHEMA option for roles which would then be the first one in the >> search_path (it could default to the role name), that way making public >> schema work again for everybody would be just about tweaking the roles a >> bit which can be easily scripted. > > I don't entirely get what you're suggesting here considering we already > have $user, and it is the first in the search_path..? > What I am suggesting is that we add option to set user's default schema to something other than user name so that if people don't want the schema with the name of the user auto-created, it won't be. > >>> opportunity to do so. I do think it would be too weird to create the schema >>> in one database only. Creating it on demand might work. What would be the >>> procedure, if any, for database owners who want to deny object creation in >>> their databases? >> >> Well, REVOKE CREATE ON DATABASE already exists. > > That really isn't the same.. In this approach, regular roles are *not* > given the CREATE right on the database, the system would just create the > schema for them on login automatically if the role attribute says to do > so. What's the point of creating schema for them if they don't have CREATE privilege? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [PATCH][PROPOSAL] Add enum relation option type
In a message dated March 1, 2018 14:47:35, Alvaro Herrera wrote: > I see you lost the Oxford comma: > > -DETAIL: Valid values are "on", "off", and "auto". > +DETAIL: Valid values are "auto", "on" and "off". > > Please put these back. Actually it was me who lost it. The code with the Oxford comma would be a bit more complicated. We should put such a comma when we have 3+ items and not put it when we have 2. Is it worth it? As I've read, use of the Oxford comma is not mandatory and serves to avoid ambiguity. "XXX, YYY and ZZZ" can be read as "XXX, YYY, ZZZ" or as "XXX, (YYY and ZZZ)". The Oxford comma is used to make sure that YYY and ZZZ are separate items of the list, not an expression inside one item. But here we hardly have such ambiguity. So I'll ask again, do you really think it's worth it? -- Do code for fun. signature.asc Description: This is a digitally signed message part.
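The rule being debated, a serial comma only when the list has three or more items, is simple to state in code. A sketch of how an error-message formatter might apply it (illustrative only; the actual patch builds this string in C):

```python
def format_valid_values(values):
    """Join quoted option names English-style, adding the Oxford comma
    only for lists of three or more items, so the DETAIL line reads
    'Valid values are "on", "off", and "auto".' but a two-item list
    reads '"on" and "off"' with no comma."""
    quoted = ['"%s"' % v for v in values]
    if len(quoted) == 1:
        return quoted[0]
    if len(quoted) == 2:
        return '%s and %s' % (quoted[0], quoted[1])
    return '%s, and %s' % (', '.join(quoted[:-1]), quoted[-1])
```

This is the "bit more complicated" code in question: one extra branch on the list length.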
Re: Implementing SQL ASSERTION
On Mon, Jan 15, 2018 at 09:14:02PM +, Joe Wildish wrote: > Hi David, > > > On 15 Jan 2018, at 16:35, David Fetter wrote: > > > > It sounds reasonable enough that I'd like to make a couple of Modest > > Proposals™, to wit: > > > > - We follow the SQL standard and make SERIALIZABLE the default > > transaction isolation level, and > > > > - We disallow writes at isolation levels other than SERIALIZABLE when > > any ASSERTION could be in play. > > Certainly it would be easy to put a test into the assertion check > function to require the isolation level be serialisable. I didn’t > realise that that was also the default level as per the standard. > That need not necessarily be changed, of course; it would be obvious > to the user that it was a requirement as the creation of an > assertion would fail without it, as would any subsequent attempts to > modify the involved tables. This patch no longer applies. Any chance of a rebase? Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [PATCH][PROPOSAL] Add enum relation option type
In a message dated March 1, 2018 19:11:05, Nikita Glukhov wrote: > Hi. > > I have refactored the patch by introducing a new struct relop_enum_element to make > it possible to use existing C-enum values in an option's definition. So the > additional enum GIST_OPTION_BUFFERING_XXX was removed. Hi! I've imported your relopt_enum_element solution, since Alvaro liked it, and I like it too. But I called it relopt_enum_element_definition, as it is not an element itself, but a description of the possibilities. However, I do not want to import the rest of it. > #define RELOPT_ENUM_DEFAULT ((const char *) -1) /* pseudo-name for default value */ I've discussed this solution with my C-experienced friends, and each of them said that it will work, but that it is better not to use ((const char *) -1): it could stop working without any warning, because it is not a standard C construct and a newer C compiler could stop accepting it without further notice. I would avoid such a thing if possible. > Also the default option value should now be placed in the first element of > allowed_values[]. This helps avoid exposing default value definitions (like > GIST_BUFFERING_AUTO defined in gistbuild.c). For all other reloption types, the default value is part of the relopt_* structure. I see no reason to do otherwise for enum. As for exposing GIST_BUFFERING_AUTO: we do the same for the default fillfactor value, and I see no reason why we should do otherwise here... And if we keep the default value in relopt_enum, we will not need RELOPT_ENUM_DEFAULT, which, as I said above, I find dubious. > for (elem = opt_enum->allowed_values; elem->name; elem++) This is better than what I did. I imported that too. > if (validate && !parsed) Oh, yes... That was my mistake. I did not consider the validate == false case. I should handle it. Thank you for pointing this out. But I think that we should return the default value if the data in pg_class is broken. 
Maybe I should later write an additional patch that throws a WARNING if the reloptions from pg_class can't be parsed. The DB admin should know about it, I think... The rest I've kept as before. If you think that something else should be changed, I'd like to see not only the changes, but also some explanation. I am not sure, for example, that we should use the same enum for the reloptions and the metapage buffering mode representation. That seems logical, but it may also confuse. I want to hear all opinions before doing it. For example, maybe typedef enum gist_option_buffering_numeric_values { GIST_OPTION_BUFFERING_ON = GIST_BUFFERING_STATS, GIST_OPTION_BUFFERING_OFF = GIST_BUFFERING_DISABLED, GIST_OPTION_BUFFERING_AUTO = GIST_BUFFERING_AUTO, } gist_option_buffering_value_numbers; will do, and then we can assign one enum to another, keeping the traditional variable naming? I also would prefer to keep the whole enum definition in an .h file, not the enum part in the .h file and the array part in the .c file. -- Do code for fun. diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index 46276ce..4b775ab 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -404,7 +404,11 @@ static relopt_real realRelOpts[] {{NULL}} }; -static relopt_string stringRelOpts[] +static relopt_enum_element_definition gist_buffering_enum_def[] + GIST_OPTION_BUFFERING_ENUM_DEF; +static relopt_enum_element_definition view_check_option_enum_def[] + VIEW_OPTION_CHECK_OPTION_ENUM_DEF; +static relopt_enum enumRelOpts[] { { { @@ -413,10 +417,8 @@ static relopt_string stringRelOpts[] RELOPT_KIND_GIST, AccessExclusiveLock }, - 4, - false, - gistValidateBufferingOption, - "auto" + gist_buffering_enum_def, + GIST_OPTION_BUFFERING_AUTO }, { { @@ -425,11 +427,14 @@ static relopt_string stringRelOpts[] RELOPT_KIND_VIEW, AccessExclusiveLock }, - 0, - true, - validateWithCheckOption, - NULL + view_check_option_enum_def, + 
VIEW_OPTION_CHECK_OPTION_NOT_SET }, + {{NULL}} +}; + +static relopt_string stringRelOpts[] +{ /* list terminator */ {{NULL}} }; @@ -476,6 +481,12 @@ initialize_reloptions(void) realRelOpts[i].gen.lockmode)); j++; } + for (i = 0; enumRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(enumRelOpts[i].gen.lockmode, + enumRelOpts[i].gen.lockmode)); + j++; + } for (i = 0; stringRelOpts[i].gen.name; i++) { Assert(DoLockModesConflict(stringRelOpts[i].gen.lockmode, @@ -514,6 +525,14 @@ initialize_reloptions(void) j++; } + for (i = 0; enumRelOpts[i].gen.name; i++) + { + relOpts[j] = &enumRelOpts[i].gen; + relOpts[j]->type = RELOPT_TYPE_ENUM; + relOpts[j]->namelen = strlen(relOpts[j]->name); + j++; + } + for (i = 0; stringRelOpts[i].gen.name; i++) { relOpts[j] = &stringRelOpts[i].gen; @@ -611,6 +630,9 @@ allocate_reloption(bits32 kinds, int type, const char *name, const char *desc) case RELOPT_TYPE_REAL: size = sizeof(relo
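The enum-option lookup loop discussed in this thread (`for (elem = opt_enum->allowed_values; elem->name; elem++)`), together with the validate-vs-fall-back-to-default question, can be sketched in a few lines. This is an illustration of the semantics being debated, not the patch itself, and the numeric codes below are made up (the real ones are C enum values like GIST_BUFFERING_AUTO):

```python
def parse_enum_option(value, allowed, default, validate):
    """Resolve a string reloption against its allowed values.
    With validate=True an unknown string is an error; with
    validate=False (e.g. a damaged pg_class entry) we fall back to
    the default, which is the behavior argued for in the thread
    (ideally with a WARNING in a real server)."""
    for name, enum_value in allowed:
        if value == name:
            return enum_value
    if validate:
        raise ValueError('invalid value for enum option: "%s"' % value)
    # unparsable stored value: a real server would warn, then use default
    return default

# Made-up numeric codes standing in for the C enum members:
BUFFERING_VALUES = [('on', 1), ('off', 0), ('auto', 2)]
```

Keeping the default in the option descriptor itself (rather than the RELOPT_ENUM_DEFAULT pseudo-name or a first-element convention) is exactly the explicit `default` parameter here.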
Re: public schema default ACL
Greetings, * Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote: > Stephen Frost wrote: > > > * Noah Misch (n...@leadboat.com) wrote: > > > > I like the idea of getting more SQL-compatible, if this presents a > > > distinct > > > opportunity to do so. I do think it would be too weird to create the > > > schema > > > in one database only. Creating it on demand might work. What would be > > > the > > > procedure, if any, for database owners who want to deny object creation in > > > their databases? > > > > My suggestion was that this would be a role attribute. If an > > administrator doesn't wish for that role to have a schema created > > on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever > > we name it) role attribute to false. > > Is a single attribute enough? I think we need two: one would authorize > to create the schema $user to the user themselves (maybe > SELF_SCHEMA_CREATE); another would automatically do so when connecting > to a database that does not have it (perhaps AUTO_CREATE_SCHEMA). I don't see a use-case for this SELF_SCHEMA_CREATE attribute and it seems more likely to cause confusion than to be helpful. If the admin sets AUTO_CREATE_SCHEMA for a user then that's what we should do. > Now, maybe the idea of creating it as soon as a connection is > established is not great. What about creating it only when the first > object creation is attempted and there is no other schema to create in? > This avoids pointless proliferation of empty user schemas, as well as > avoids the overhead of checking existence of schema $user on each > connection. I don't see how creating schemas for roles which the admin has created with the AUTO_CREATE_SCHEMA option would be pointless. To not do so would be confusing, imo. 
Consider the user who logs in and doesn't realize that they're allowed to create a schema and doesn't see a schema of their own in the list- they aren't going to think "I should just try to create an object and see if a schema appears", they're going to ask the admin why they don't have a schema. * Tom Lane (t...@sss.pgh.pa.us) wrote: > Hmm. On first glance that sounds bizarre, but we do something pretty > similar for the pg_temp schemas, so it could likely be made to work. While I agree that it might not be that hard to make the code do it, since we do this for temp schemas, I still don't see real value in it and instead just a confusing system where schemas "appear" at some arbitrary point when the user happens to try to create an object without qualification. I liken this to a well-known and well-trodden feature for auto creating user home directories on Unix. Being different from that for, at best, rare use-cases which could be handled in other ways is going against POLA. If an admin is concerned about too many empty schemas or about having $user in a search_path and needing to search it, then those are entirely fixable rather easily, but those are the uncommon cases in my experience. > One issue to think about is exactly which $user we intend to make the > schema for, if we've executed SET SESSION AUTHORIZATION, or are inside > a SECURITY DEFINER function, etc etc. I'd argue that only the original > connection username should get this treatment, which may mean that object > creation can fail in those contexts. This just strengthens the "this will be confusing to our users" argument, imv. Thanks! Stephen
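The home-directory analogy can be made concrete: at connection time, check a role attribute and create the missing schema if it says to. A toy model with an in-memory catalog (every name here is invented for illustration, including the AUTO_CREATE_SCHEMA attribute spelling, which was still under discussion; none of this is server code):

```python
def maybe_create_login_schema(role, catalog):
    """On login, create a schema named after the role if the role's
    (hypothetical) AUTO_CREATE_SCHEMA attribute is set and no such
    schema exists yet -- the auto-created-home-directory behavior
    described above. Returns True if a schema was created."""
    attrs = catalog['roles'][role]
    if not attrs.get('auto_create_schema'):
        return False          # admin opted this role out
    if role in catalog['schemas']:
        return False          # schema already exists; nothing to do
    # The role owns the new schema, so it gets CREATE/USAGE on it
    # without needing CREATE on the database itself.
    catalog['schemas'][role] = {'owner': role}
    return True
```

Doing this eagerly at login, rather than lazily on first object creation, is what keeps the schema visible to the user immediately, which is the POLA argument made above.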
Re: GSoC 2017: weekly progress reports (week 6)
Andres Freund wrote: > This appears to be a duplicate of https://commitfest.postgresql.org/17/1466/ > - as the other one is older, I'm closing this one. This comment makes no sense from the POV of the mail archives. I had to look at the User-Agent in your email to realize that you wrote it in the commitfest app. I see three problems here: 1. impersonating the "From:" header is a bad idea; it needs to be fixed much as we did with the bugs and doc comments submission forms; 2. the mail should have had a header indicating that it comes from the CF app; 3. it would be great to include in said header a link to the CF entry to which the comment was attached. Thanks -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services