Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On 25 December 2014 at 17:09, Jaime Casanova <ja...@2ndquadrant.com> wrote:
> On Thu, Dec 25, 2014 at 5:42 AM, Abhijit Menon-Sen <a...@2ndquadrant.com> wrote:
>> To summarise for people who haven't followed the thread in detail,
>> the idea is that you would do:
>>
>>     grant select on foo to audit;
>>
>> ...and the server would audit-log any "select ... from foo" queries
>> (by any user).
>
> what if i want to audit different things for different users?

That is supported -- you can provide a list of roles to be audit roles.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CATUPDATE confusion?
On Mon, Nov 24, 2014 at 04:28:46PM -0500, Adam Brightwell wrote:
> So, where I find this confusing is that the documentation supports this
> functionality and the check keeps superuser separate from CATUPDATE...
> however... comments and implementation in user.c state/do the opposite
> and couple them together.
>
> Documentation:
>
> http://www.postgresql.org/docs/9.4/static/catalog-pg-authid.html
>
>     "Role can update system catalogs directly. (Even a superuser cannot
>     do this unless this column is true)"
>
> src/backend/commands/user.c
>
>     /* superuser gets catupdate right by default */
>     new_record[Anum_pg_authid_rolcatupdate - 1] = BoolGetDatum(issuper);
>
> and...
>
>     /*
>      * issuper/createrole/catupdate/etc
>      *
>      * XXX It's rather unclear how to handle catupdate.  It's probably best to
>      * keep it equal to the superuser status, otherwise you could end up with
>      * a situation where no existing superuser can alter the catalogs,
>      * including pg_authid!
>      */
>     if (issuper >= 0)
>     {
>         new_record[Anum_pg_authid_rolsuper - 1] = BoolGetDatum(issuper > 0);
>         new_record_repl[Anum_pg_authid_rolsuper - 1] = true;
>
>         new_record[Anum_pg_authid_rolcatupdate - 1] = BoolGetDatum(issuper > 0);
>         new_record_repl[Anum_pg_authid_rolcatupdate - 1] = true;
>     }

That documentation is correct as far as it goes. It does neglect to mention that, as you have discovered, any CREATE ROLE or ALTER ROLE [NO]SUPERUSER will change rolcatupdate to match.

> Perhaps this is not an issue but it seemed odd to me, especially after
> giving Peter's comment more thought. So, I suppose the question is
> whether or not a superuser check should be added to 'has_rolcatupdate'
> or not?

I would not. PostgreSQL has had this feature since day one (original name "usecatupd"). It has two use cases: (1) giving effect to non-superuser catalog grants, and (2) preventing a narrow class of superuser mistakes. We wouldn't add such a thing today, but one can safely ignore its existence. Making has_rolcatupdate() approve superusers unconditionally would exclude use case (2). Neither use case is valuable, but if I had to pick, (2) is more valuable than (1).

> I believe I understand the reasoning for coupling the two at role
> creation/alter time, however, is there really a case where a superuser
> wouldn't be allowed to bypass this check? Based on the comments, there
> seems like there is potentially enough concern to allow it. And if it
> is allowed, couldn't CATUPDATE then be treated like every other
> attribute and the coupling with superuser removed? Thoughts?

A superuser always has _some_ way to bypass the check, if nothing else by loading a C-language function to update pg_authid. The user.c code you quoted ensures that, if the DBA manages to remove rolcatupdate from every user, a simple CREATE ROLE or ALTER ROLE can fix things.
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On 25 December 2014 at 10:42, Abhijit Menon-Sen <a...@2ndquadrant.com> wrote:
> Stephen likes the idea, obviously; Simon also said he liked it, but I
> now wonder if he may have liked the part I implemented (which allows a
> hot standby to have a different auditing configuration than the
> primary) but not fully realised the remainder of the proposal.

I am happy with the proposal, I just thought you'd already done it.

> Before I go much further, how do others feel about it?
>
> To summarise for people who haven't followed the thread in detail, the
> idea is that you would do:
>
>     grant select on foo to audit;

GRANT is understood and supported by many people and tools. The idea makes auditing immediately accessible for wide usage.

> ...and the server would audit-log any "select ... from foo" queries
> (by any user).
>
> One immediate consequence is that only things you could grant
> permissions for could be audited (by this mechanism), but I guess
> that's a problem only in the short term. Another consequence is that
> you can't audit selects on foo only by role x and selects on bar only
> by role y.
>
> Also, what makes the audit role magical? I think it's because it exists
> only to receive these negative grants, there's no other magic involved.
> Stephen also said «Note that this role, from core PG's perspective,
> wouldn't be special at all».

I don't see them as negative grants. You are simply specifying the privilege and object you want logged.

Abhijit is right to point out that we can't specify all combinations of actions, but solving that is considerably more complex. At the moment we don't have strong evidence that we should wait for such additional complexity. We can improve things in the next release if it is truly needed.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
At 2014-12-27 08:08:32 +0000, si...@2ndquadrant.com wrote:
>> what if i want to audit different things for different users?
>
> That is supported - you can provide a list of roles to be audit roles.

You can do that, but maybe it isn't quite enough to do what Jaime is asking for. Not always, at least. Consider:

1. You want to audit select on foo by user jaime (only).
2. You want to audit update on bar by user simon (only).

So you do something like this:

    pgaudit.roles = 'audit1, audit2'

    grant select on foo to audit1;
    grant update on bar to audit2;

Now if Jaime does select on foo or if Simon does update on bar, it'll be logged. If user jaime does not have permission to do update on bar, and if simon doesn't have permission to select on foo, everything is fine. But there's no way to say *don't* audit select on foo by simon.

You could do something like "grant audit1 to jaime" and then check the audit-permissions of whichever audit role jaime is found at runtime to inherit from. But in Stephen's original proposal, granting any audit role to any user meant that everything they did would be logged.

-- Abhijit
Re: [HACKERS] Serialization exception : Who else was involved?
On Tue, Dec 02, 2014 at 11:17:43AM +0100, Olivier MATROT wrote:
> Serialization conflict detection is done in
> src/backend/storage/lmgr/predicate.c, where transactions that are
> doomed to fail are marked as such with the SXACT_FLAG_DOOMED flag.
>
> I simply added elog(...) calls at the NOTICE level each time the flag
> is set, compiled the code and gave it a try.
>
> I would like to see this useful and simple addition in a future version
> of PostgreSQL. Is it in the spirit of what is done to ease the work of
> the developer? Maybe the level I've chosen is not appropriate?

I would value extra logging designed to help users understand the genesis of serialization failures. A patch the community would adopt will probably have more complexity than your quick elog(NOTICE, ...) addition. I don't have a clear picture of what the final patch should be, but the following are some thoughts to outline the problem space. See [1] for an earlier discussion.

The logging done in DeadLockReport() is a good baseline; it would be best to consolidate a similar level of detail and report it all as part of the main serialization failure report. That may prove impractical. If transaction TA marks transaction TB's doom, TA can be long gone by the time TB reports its serialization failure. TA could persist the details needed for that future error report, but that may impose costs out of proportion with the benefit. If so, we can fall back on your original idea of emitting a message in the command that performs the flag flip. That would need a DEBUG error level, though. Sending a NOTICE to a client when its transaction dooms some other transaction would add noise in the wrong place.

Thanks,
nm

[1] http://www.postgresql.org/message-id/flat/AANLkTikF-CR-52nWAo2VG_348aTsK_+0i=chbpnqd...@mail.gmail.com
Re: [HACKERS] Serialization exception : Who else was involved?
On 12/02/2014 06:17 PM, Olivier MATROT wrote:
> I was wondering if there was a log level in PostgreSQL that could tell
> me which query was the trigger of a doomed transaction.

It's not necessarily possible to tell which *query* in another transaction caused the current one to fail. It might not be a single query in the local session or the other session. What can be identified is the other transaction ID.

If you set log_line_prefix to include the txid and pid, and you set log_statement = 'all', you can examine the logs to see what happened. I admit it's pretty clumsy.

It'd be very nice to provide more information on the causes of failures - but I suspect doing so *efficiently*, without making serializable a huge impact on performance, would be quite challenging.

-- 
Craig Ringer    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
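Concretely, a logging setup along these lines would make that correlation possible (a sketch only; the exact prefix format is a matter of taste):

```
# postgresql.conf sketch: put timestamp (%m), backend pid (%p), and
# transaction id (%x) at the start of every log line, and log every
# statement, so a failed transaction's xid can be matched against the
# statements of the transaction that doomed it.
log_line_prefix = '%m [%p] xid=%x '
log_statement = 'all'
```

Note that log_statement = 'all' is expensive on a busy system, so this is more of a diagnostic configuration than a permanent one.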
Re: [HACKERS] Serialization exception : Who else was involved?
On 12/02/2014 06:17 PM, Olivier MATROT wrote:
> Serialization conflict detection is done in
> *src/backend/storage/lmgr/predicate.c*, where transactions that are
> doomed to fail are marked as such with the *SXACT_FLAG_DOOMED* flag.
>
> I simply added elog(...) calls at the NOTICE level each time the flag
> is set, compiled the code and gave it a try.

I don't see how that'd necessarily correctly identify the query/queries in the other tx that're involved, though. Perhaps I'm thinking in terms of more complicated serialization failures?

What sorts of failures are you actually running into?

-- 
Craig Ringer    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] CATUPDATE confusion?
* Noah Misch (n...@leadboat.com) wrote:
> On Mon, Nov 24, 2014 at 04:28:46PM -0500, Adam Brightwell wrote:
>> Perhaps this is not an issue but it seemed odd to me, especially after
>> giving Peter's comment more thought. So, I suppose the question is
>> whether or not a superuser check should be added to 'has_rolcatupdate'
>> or not?
>
> I would not. PostgreSQL has had this feature since day one (original
> name "usecatupd"). It has two use cases: (1) giving effect to
> non-superuser catalog grants, and (2) preventing a narrow class of
> superuser mistakes. We wouldn't add such a thing today, but one can
> safely ignore its existence. Making has_rolcatupdate() approve
> superusers unconditionally would exclude use case (2). Neither use
> case is valuable, but if I had to pick, (2) is more valuable than (1).

What's confusing about this is that use-case (2) appears to have been the intent of the option, but because it's tied directly to superuser and isn't an independently controllable option, the only way to change it is to do an 'update pg_authid ...'. Even when set that way, it isn't visible that it's been set through \du; you have to query the catalog directly. I wonder if the option had originally been intended to be independent and *not* given to superusers by default, but because of the concern about dealing with corrupt systems, etc., it was never actually done.

I'm fine with the line of thinking that says we can't deny superusers this power because of corruption/debugging concerns, but if that's the case, then there is no use-case (2) and we should just remove the option entirely.

I don't buy into use-case (1) above being at all reasonable. With it, one can trivially make themselves a superuser, and in many ways the catupdate capability is *more* dangerous than superuser -- if you have catupdate but not superuser, you'd be tempted to hack at the catalog to make the changes you want instead of using the regular ALTER TABLE, etc., commands.

Thanks,

Stephen
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On 27 December 2014 at 08:47, Abhijit Menon-Sen <a...@2ndquadrant.com> wrote:
> But there's no way to say *don't* audit select on foo by simon.

We can cover what it does and does not do in some doc examples.

When submitted, pgaudit didn't have very complex auditing rules. Stephen's suggestion improves that considerably, but isn't the only conceivable logging rule. But we'll need to see what else is needed; I doubt we'll need everything, at least not in PG9.5.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
* Simon Riggs (si...@2ndquadrant.com) wrote:
> On 27 December 2014 at 08:47, Abhijit Menon-Sen <a...@2ndquadrant.com> wrote:
>> But there's no way to say *don't* audit select on foo by simon.
>
> We can cover what it does and does not do in some doc examples.
>
> When submitted, pgaudit didn't have very complex auditing rules.
> Stephen's suggestion improves that considerably, but isn't the only
> conceivable logging rule. But we'll need to see what else is needed; I
> doubt we'll need everything, at least not in PG9.5.

Agreed, it allows us much more flexibility, but it isn't a panacea. I'm hopeful that it'll be flexible enough for certain regulatory-required use-cases. In any case, it's much closer and is certainly worthwhile even if it doesn't allow for every possible configuration or ends up not meeting specific regulatory needs, because it moves us to a place where we can sensibly consider what else is needed.

Thanks,

Stephen
Re: [HACKERS] CATUPDATE confusion?
Stephen Frost <sfr...@snowman.net> writes:
> * Noah Misch (n...@leadboat.com) wrote:
>> On Mon, Nov 24, 2014 at 04:28:46PM -0500, Adam Brightwell wrote:
>>> Perhaps this is not an issue but it seemed odd to me, especially after
>>> giving Peter's comment more thought. So, I suppose the question is
>>> whether or not a superuser check should be added to 'has_rolcatupdate'
>>> or not?
>>
>> I would not. PostgreSQL has had this feature since day one (original
>> name "usecatupd"). It has two use cases: (1) giving effect to
>> non-superuser catalog grants, and (2) preventing a narrow class of
>> superuser mistakes. We wouldn't add such a thing today, but one can
>> safely ignore its existence. Making has_rolcatupdate() approve
>> superusers unconditionally would exclude use case (2). Neither use
>> case is valuable, but if I had to pick, (2) is more valuable than (1).
>
> What's confusing about this is that use-case (2) appears to have been
> the intent of the option,

Yes. Noah's claim that anybody intended (1) seems to be mere historical revisionism, especially since, as you note, CATUPDATE could be trivially parlayed into superuser if someone were to grant it to a non-superuser.

> but because it's tied directly to superuser and isn't an independently
> controllable option, the only way to change it is to do an 'update
> pg_authid ...'. Even when set that way, it isn't visible that it's been
> set through \du; you have to query the catalog directly.

This is not hard to understand either: the option dates from long before there was any concerted effort to invent bespoke SQL commands for every interesting catalog manipulation. If CATUPDATE had any significant use, we'd have extended ALTER USER (and \du, and pg_dump ...) at some point to make it more easily controllable. But since it's of such marginal usefulness, nobody ever bothered; and I doubt we should bother now.

In short, I agree with Noah's conclusion (do nothing) though not his reasoning. Plan C (remove CATUPDATE altogether) also has some merit. But adding a superuser override there would be entirely pointless.

			regards, tom lane
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 12/27/2014 12:16 AM, Alvaro Herrera wrote:
> Tom Lane wrote:
>> The argument that autovac workers need fresher stats than anything else
>> seems pretty dubious to start with. Why shouldn't we simplify that down
>> to "they use PGSTAT_STAT_INTERVAL like everybody else"?
>
> The point of wanting fresher stats than that, eons ago, was to avoid a
> worker vacuuming a table that some other worker vacuumed more recently
> than PGSTAT_STAT_INTERVAL. I realize now that the semantics we really
> want is something like "stats no older than XYZ", where the given value
> is the timestamp at which we start checking; if we get anything newer
> than that it would be okay, but we currently reject it for lack of a
> more appropriate API. (If it takes more than PGSTAT_STAT_INTERVAL to
> get the stats back, a regular backend would ask for fresher stats, but
> to an autovac worker they would be good enough as long as they are
> newer than its recheck start time.)
>
> Nowadays we can probably disregard the whole issue, since starting a
> new vacuum just after the prior one finished should not cause much
> stress to the system thanks to the visibility map.

Vacuuming is far from free, even if the visibility map says that most pages are visible to all: you still scan all indexes, if you remove any dead tuples at all.

- Heikki
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Tue, Dec 23, 2014 at 11:55 AM, Peter Geoghegan <p...@heroku.com> wrote:
> On Thu, Dec 18, 2014 at 9:20 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> I've put this through an adaptation of my usual torture test, and it
>> ran fine until wraparound shutdown. I'll poke at it more later.
>
> Could you elaborate, please? What are the details of the torture test
> you're performing?

I've uploaded it here:

https://drive.google.com/folderview?id=0Bzqrh1SO9FcEZ3plX0l5RWNXd00&usp=sharing

The gist of it is that I increment a count column of a random row (via pk) in multiple connections simultaneously. When the server crashes, or it gets to a certain number of increments, the threads report their activity up to the parent, which then waits for automatic recovery and compares the state of the database to the reported state of the children threads. That is for my original code.

For this purpose, I made the count go either up or down randomly, and when a row's count passes through zero it gets deleted. Then when it is chosen for increment/decrement again, it has to be inserted. I've made this happen either through an update-or-insert-or-retry loop (two variants) or by using your new syntax.

There is a patch which adds a simulation for a torn-page-write followed by a crash, and also adds some elogs that I've sometimes found useful for tracking down problems, with new GUCs to control them. I don't think you made changes to the WAL/recovery routines, so I don't expect crash recovery to be a big hazard for your patch, but I wanted to run a test where I was generally familiar with the framework, and thought an independently derived test might exercise some new aspects.

The one thing I noticed is that using your syntax starts out slightly slower than the retry loop, but then gets much slower (down by 2 or 3 times) after a while. It might be a vacuuming issue. The constant intentional crashes interfere with good vacuuming behavior, and I need to retest this with the intentional crashes turned off to see if that fixes it.

I'm having difficulty accessing my usual testing hardware over the holidays, so I'm not getting as much done as I hoped. I'll try to look at your own stress tests on github as well.

Cheers,

Jeff
[HACKERS] Re: Using RTLD_DEEPBIND to handle symbol conflicts in loaded libraries
On Wed, Nov 26, 2014 at 01:34:18PM +0200, Ants Aasma wrote:
> glibc 2.3.4+ has a flag called RTLD_DEEPBIND for dlopen that prefers
> symbols loaded by the library to those provided by the caller. Using
> this flag fixes my issue: PostgreSQL gets the ldap functions from
> libldap, and the Oracle client gets them from whatever it links to.
> Both work fine. Attached is a patch that enables this flag on Linux
> when available.
>
> This specific case could also be fixed by rewriting oracle_fdw to use
> dlopen for libclntsh.so and pass this flag, but I think it would be
> better to enable it for all PostgreSQL loaded extension modules. I
> can't think of a sane use case where it would be correct to prefer
> PostgreSQL loaded symbols to those the library was actually linked
> against. Does anybody know of a case where this flag wouldn't be a
> good idea?

There's a meta-downside: any bug the flag prevents will still exist on non-glibc targets, and that's the primary reason not to make this change in core PostgreSQL. Most hackers use GNU/Linux as a primary development platform, so RTLD_DEEPBIND would delay discovery of still-present bugs.

Standard POSIX symbol resolution has some advantages over RTLD_DEEPBIND. Any given program has at most one global symbol called "malloc", and so-named symbols usually get away with assuming they own the brk() space. LD_PRELOAD can overload any global symbol. Those points by themselves would not stop me from using RTLD_DEEPBIND in specific cases like oracle_fdw, though.

> Are there any similar options for other platforms? Alternatively, does
> anyone know of linker flags that would give a similar effect?

It has some overlap with the -Bsymbolic linker option.
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Sat, Dec 27, 2014 at 11:48 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> Could you elaborate, please? What are the details of the torture test
>> you're performing?
>
> The gist of it is that I increment a count column of a random row (via
> pk) in multiple connections simultaneously.

This is great. In general, I strongly believe that we should be doing this kind of thing more formally and more frequently. Thanks!

> That is for my original code. For this purpose, I made the count go
> either up or down randomly, and when a row's count passes through zero
> it gets deleted. Then when it is chosen for increment/decrement again,
> it has to be inserted. I've made this happen either through an
> update-or-insert-or-retry loop (two variants) or by using your new
> syntax.

Did you continue to limit your investigation to value locking approach #1? I think that #2 is the more likely candidate for commit, and that we should focus on it. However, #1 is more conceptually pure, and is therefore an interesting basis of comparison with #2 when doing this kind of testing.

> There is a patch which adds a simulation for a torn-page-write followed
> by a crash, and also adds some elogs that I've sometimes found useful
> for tracking down problems, with new GUCs to control them.

Cool.

> I don't think you made changes to the WAL/recovery routines, so I don't
> expect crash recovery to be a big hazard for your patch, but I wanted
> to run a test where I was generally familiar with the framework, and
> thought an independently derived test might exercise some new aspects.

Value locking approach #2 does touch crash recovery. Value locking approach #1 does not. I certainly see the logic in starting with independently derived tests. We all have our blind spots.

> The one thing I noticed is that using your syntax starts out slightly
> slower than the retry loop, but then gets much slower (down by 2 or 3
> times) after a while. It might be a vacuuming issue.

Interesting. I'd like to compare both approaches to value locking here.

> I'll try to look at your own stress tests on github as well.

Would you be opposed to merging your custom stress-test suite into my git repo? I'll give you the ability to push to it.

I can help you out if you think you'd benefit from access to my quad-core server (Intel Core i7-4770) for stress-testing. I'll coordinate with you about it privately.

-- 
Peter Geoghegan
Re: [HACKERS] CATUPDATE confusion?
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Plan C (remove CATUPDATE altogether) also has some merit. But adding a
> superuser override there would be entirely pointless.

That would be my recommendation. I do not see the point of carrying the option around just to confuse new users of pg_authid and reviewers of the code.

Thanks,

Stephen
Re: [HACKERS] CATUPDATE confusion?
Stephen Frost <sfr...@snowman.net> writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Plan C (remove CATUPDATE altogether) also has some merit. But adding a
>> superuser override there would be entirely pointless.
>
> That would be my recommendation. I do not see the point of carrying the
> option around just to confuse new users of pg_authid and reviewers of
> the code.

Yeah ... if no one's found it interesting in the 20 years since the code left Berkeley, it's unlikely that interest will emerge in the future.

			regards, tom lane
Re: [HACKERS] CATUPDATE confusion?
On Sat, Dec 27, 2014 at 06:26:02PM -0500, Tom Lane wrote:
> Stephen Frost <sfr...@snowman.net> writes:
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>>> Plan C (remove CATUPDATE altogether) also has some merit. But adding
>>> a superuser override there would be entirely pointless.
>>
>> That would be my recommendation. I do not see the point of carrying
>> the option around just to confuse new users of pg_authid and reviewers
>> of the code.
>
> Yeah ... if no one's found it interesting in the 20 years since the
> code left Berkeley, it's unlikely that interest will emerge in the
> future.

No objection here.
Re: [HACKERS] Initdb-cs_CZ.WIN-1250 buildfarm failures
On Tue, Dec 23, 2014 at 03:32:59PM +0100, Tomas Vondra wrote:
> On 23.12.2014 15:21, Andrew Dunstan wrote:
>> No, config_opts is what's passed to configure. Try something like:
>>
>>     if ($branch eq 'REL9_0_STABLE')
>>     {
>>         $skip_steps{'pl-install-check'} = 1;
>>     }
>
> Applied to all three animals.

As of the time of this change, the animals ceased to report in.
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Heikki Linnakangas <hlinnakan...@vmware.com> writes:
> On 12/27/2014 12:16 AM, Alvaro Herrera wrote:
>> Tom Lane wrote:
>>> The argument that autovac workers need fresher stats than anything
>>> else seems pretty dubious to start with. Why shouldn't we simplify
>>> that down to "they use PGSTAT_STAT_INTERVAL like everybody else"?
>>
>> The point of wanting fresher stats than that, eons ago, was to avoid a
>> worker vacuuming a table that some other worker vacuumed more recently
>> than PGSTAT_STAT_INTERVAL. ... Nowadays we can probably disregard the
>> whole issue, since starting a new vacuum just after the prior one
>> finished should not cause much stress to the system thanks to the
>> visibility map.
>
> Vacuuming is far from free, even if the visibility map says that most
> pages are visible to all: you still scan all indexes, if you remove any
> dead tuples at all.

With typical autovacuum settings, I kinda doubt that there's much value in reducing the window for this problem from 500ms to 10ms. As Alvaro says, this was just a partial, kluge solution from the start -- if we're worried about such duplicate vacuuming, we should undertake a real solution that closes the window altogether.

In any case, timeouts occurring inside autovacuum are not directly causing the buildfarm failures, since autovacuum's log entries don't reflect into regression outputs. (It's possible that autovacuum's tight tolerance is contributing to the failures by increasing the load on the stats collector, but I'm not sure I believe that.)

To get back to that original complaint about buildfarm runs failing, I notice that essentially all of those failures are coming from "wait timeout" warnings reported by manual VACUUM commands. Now, VACUUM itself has no need to read the stats files. What's actually causing these messages is failure to get a timely response in pgstat_vacuum_stat(). So let me propose a drastic solution: let's dike out this bit in vacuum.c:

    /*
     * Send info about dead objects to the statistics collector, unless we are
     * in autovacuum --- autovacuum.c does this for itself.
     */
    if ((vacstmt->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
        pgstat_vacuum_stat();

This would have the effect of transferring all responsibility for dead-stats-entry cleanup to autovacuum. For ordinary users, I think that'd be just fine. It might be less fine though for people who disable autovacuum, if there still are any. To analyze the effects on them, let's break down what pgstat_vacuum_stat() actually does:

1. It causes drops of whole databases from pgstat's tables. I claim we don't need to do that at this level. There's no issue unless the stats collector missed the PgStat_MsgDropdb message when the database was dropped; and even if it did, the negative consequences of having a dead DB's stats still lying around are pretty minimal since we split up the stats files. There will be no extra I/O except for one small record in the global stats file. So I think we can well afford to say that with autovac off, such missed databases only get cleaned up the next time an autovac is forced for antiwraparound.

2. Within the current database, it causes drops of pgstat entries for dropped tables. This would be a tad annoying if you have lots of transient tables and no or minimal autovacuuming. However, lots of transient tables can be a pain point anyway, because we don't currently have any mechanism other than pgstat_vacuum_stat() for cleaning up per-table stats for dropped tables. It seems like it might be time to do something about that. We could probably extend the PgStat_TableXactStatus infrastructure to keep track of whether a table was created or deleted in the current transaction, and solve the problem without very much new code. Or maybe we could move the responsibility for reporting stale entries into the code that reads the stats files for the stats views.

3. Within the current database, it causes drops of pgstat entries for dropped functions, if you're tracking per-function execution stats. Since that's not the default, maybe we could get away with saying that we don't clean up such dead entries very often unless you're running autovacuum. I don't really want to propose building the infrastructure to support dropping such entries retail.

Or, much more simply, we could conclude that it's not that important if pgstat_vacuum_stat() is unable to get up-to-date stats, and rejigger the code so that we don't bleat when the file-reading request comes from that source. This should be a back-patchable fix, whereas #2 above might be a bit too complicated for that.

Thoughts?

			regards, tom lane
Re: [HACKERS] pg_basebackup fails with long tablespace paths
On Wed, Dec 24, 2014 at 8:12 AM, Peter Eisentraut pete...@gmx.net wrote: On 12/22/14 10:00 PM, Amit Kapila wrote: There is already a patch [1] in this CF which will handle both cases, so I am not sure if it is very good idea to go with a new tar format to handle this issue. I think it would still make sense to have proper symlinks in the basebackup if possible, for clarity.

I guess I would have assumed it would be more clear to omit the symlinks if we're expecting the server to put them in. Otherwise, the server has to remove the existing symlinks and create new ones, which introduces various possibilities for failure and confusion.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Wed, Dec 24, 2014 at 11:20 AM, Andres Freund and...@2ndquadrant.com wrote: I just verified that I can still reproduce the problem:

    # aligned case (max_connections=401)
    afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S
    progress: 1.0 s, 405170.2 tps, lat 0.195 ms stddev 0.928
    progress: 2.0 s, 467011.1 tps, lat 0.204 ms stddev 0.140
    progress: 3.0 s, 462832.1 tps, lat 0.205 ms stddev 0.154
    progress: 4.0 s, 471035.5 tps, lat 0.202 ms stddev 0.154
    progress: 5.0 s, 500329.0 tps, lat 0.190 ms stddev 0.132
    BufferDescriptors is at 0x7f63610a6960 (which is 32byte aligned)

    # unaligned case (max_connections=400)
    afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S
    progress: 1.0 s, 202271.1 tps, lat 0.448 ms stddev 1.232
    progress: 2.0 s, 223823.4 tps, lat 0.427 ms stddev 3.007
    progress: 3.0 s, 227584.5 tps, lat 0.414 ms stddev 4.760
    progress: 4.0 s, 221095.6 tps, lat 0.410 ms stddev 4.390
    progress: 5.0 s, 217430.6 tps, lat 0.454 ms stddev 7.913
    progress: 6.0 s, 210275.9 tps, lat 0.411 ms stddev 0.606
    BufferDescriptors is at 0x7f1718aeb980 (which is 64byte aligned)

So, should we increase ALIGNOF_BUFFER from 32 to 64? Seems like that's what these results are telling us.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Re: Better way of dealing with pgstat wait timeout during buildfarm runs?
On Thu, Dec 25, 2014 at 09:14:36PM +0100, Andres Freund wrote: My guess is that a checkpoint happened at that time. Maybe it'd be a good idea to make pg_regress start postgres with log_checkpoints enabled? My guess is that we'd find horrendous 'sync' times.

+1, independent of $SUBJECT. How about log_autovacuum_min_duration=0, too?
Re: [HACKERS] nls and server log
On Wed, Dec 24, 2014 at 1:35 PM, Euler Taveira eu...@timbira.com.br wrote: Currently the same message goes to the server log and the client app. Sometimes it bothers me, since I have to analyze server logs and discover that lc_messages is set to pt_BR and, worse, that some stup^H^H^H application parses error messages in Portuguese. My solution has been a modified version of pgBadger (formerly pgfouine) -- which has its problems: (i) translations are not as stable as English messages, (ii) translations are not always available, which means there is a mix of translated and untranslated messages, and (iii) it is minor-version dependent.

I'm tired of fighting those problems and started to research whether there is a good solution for the backend. I'm thinking of carrying both translated and untranslated messages if asked to. We store the untranslated messages if the new GUC (say server_lc_messages) is set. The cost will be copying to five new variables (message, detail, detail_log, hint, and context) in the ErrorData struct, which will be used only if server_lc_messages is set. A possible optimization is not to use the new variables if lc_messages and server_lc_messages match. My use case is a server log in English, but I'm perfectly fine with allowing a server log in Spanish and client messages in French. Is it an acceptable plan? Ideas?

Seems reasonable to me, I think.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On Sat, Dec 27, 2014 at 7:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: On 12/27/2014 12:16 AM, Alvaro Herrera wrote: Tom Lane wrote: The argument that autovac workers need fresher stats than anything else seems pretty dubious to start with. Why shouldn't we simplify that down to "they use PGSTAT_STAT_INTERVAL like everybody else"?

The point of wanting fresher stats than that, eons ago, was to avoid a worker vacuuming a table that some other worker vacuumed more recently than PGSTAT_STAT_INTERVAL. ... Nowadays we can probably disregard the whole issue, since starting a new vacuum just after the prior one finished should not cause much stress to the system, thanks to the visibility map.

Vacuuming is far from free, even if the visibility map says that most pages are visible to all: you still scan all indexes, if you remove any dead tuples at all.

With typical autovacuum settings, I kinda doubt that there's much value in reducing the window for this problem from 500ms to 10ms. As Alvaro says, this was just a partial, kluge solution from the start --- if we're worried about such duplicate vacuuming, we should undertake a real solution that closes the window altogether. In any case, timeouts occurring inside autovacuum are not directly causing the buildfarm failures, since autovacuum's log entries don't reflect into regression outputs. (It's possible that autovacuum's tight tolerance is contributing to the failures by increasing the load on the stats collector, but I'm not sure I believe that.)

To get back to that original complaint about buildfarm runs failing, I notice that essentially all of those failures are coming from "wait timeout" warnings reported by manual VACUUM commands. Now, VACUUM itself has no need to read the stats files. What's actually causing these messages is failure to get a timely response in pgstat_vacuum_stat().
So let me propose a drastic solution: let's dike out this bit in vacuum.c:

    /*
     * Send info about dead objects to the statistics collector, unless we are
     * in autovacuum --- autovacuum.c does this for itself.
     */
    if ((vacstmt->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
        pgstat_vacuum_stat();

This would have the effect of transferring all responsibility for dead-stats-entry cleanup to autovacuum. For ordinary users, I think that'd be just fine. It might be less fine though for people who disable autovacuum, if there still are any.

-1. I don't think it's a good idea to inflict pain on people who want to schedule their vacuums manually (and yes, there are some) to get clean buildfarm runs.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Robert Haas robertmh...@gmail.com writes: On Sat, Dec 27, 2014 at 7:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: This would have the effect of transferring all responsibility for dead-stats-entry cleanup to autovacuum. For ordinary users, I think that'd be just fine. It might be less fine though for people who disable autovacuum, if there still are any. -1. I don't think it's a good idea to inflict pain on people who want to schedule their vacuums manually (and yes, there are some) to get clean buildfarm runs.

Did you read the rest of it?

regards, tom lane
Re: [HACKERS] nls and server log
Robert Haas robertmh...@gmail.com writes: On Wed, Dec 24, 2014 at 1:35 PM, Euler Taveira eu...@timbira.com.br wrote: Currently the same message goes to the server log and the client app. ... I'm thinking of carrying both translated and untranslated messages if asked to. We store the untranslated messages if the new GUC (say server_lc_messages) is set. The cost will be copying to five new variables (message, detail, detail_log, hint, and context) in the ErrorData struct, which will be used only if server_lc_messages is set. A possible optimization is not to use the new variables if lc_messages and server_lc_messages match. My use case is a server log in English, but I'm perfectly fine with allowing a server log in Spanish and client messages in French. Is it an acceptable plan? Ideas? Seems reasonable to me, I think.

The core problem that we've worried about in previous discussions of this is what to do about translation failures and encoding conversion failures. That is, there's been worry that a poor choice of log locale could result in failures that don't occur otherwise; failures that could be particularly nasty if they result in the inability to log important conditions, and perhaps even prevent reporting them to the client. While I don't say that we cannot accept any risk of that sort, I think we should consider what risks exist and whether they can be minimized before we plow ahead.

It would also be useful to think about the requests we get from time to time to ensure that log messages appear in a uniform choice of encoding. I don't know whether trying to enforce a uniform log message locale would make that easier or harder.

regards, tom lane