Re: [HACKERS] Compression of full-page-writes
On Fri, Aug 30, 2013 at 1:43 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Fri, Aug 30, 2013 at 8:25 AM, Fujii Masao masao.fu...@gmail.com wrote: Hi, Attached patch adds new GUC parameter 'compress_backup_block'. When this parameter is enabled, the server just compresses FPW (full-page-writes) in WAL by using pglz_compress() before inserting it to the WAL buffers. Then, the compressed FPW is decompressed in recovery. This is very simple patch. The purpose of this patch is the reduction of WAL size. Under heavy write load, the server needs to write a large amount of WAL and this is likely to be a bottleneck. What's the worse is, in replication, a large amount of WAL would have harmful effect on not only WAL writing in the master, but also WAL streaming and WAL writing in the standby. Also we would need to spend more money on the storage to store such a large data. I'd like to alleviate such harmful situations by reducing WAL size. My idea is very simple, just compress FPW because FPW is a big part of WAL. I used pglz_compress() as a compression method, but you might think that other method is better. We can add something like FPW-compression-hook for that later. The patch adds new GUC parameter, but I'm thinking to merge it to full_page_writes parameter to avoid increasing the number of GUC. That is, I'm thinking to change full_page_writes so that it can accept new value 'compress'. I measured how much WAL this patch can reduce, by using pgbench. * Server spec CPU: 8core, Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz Mem: 16GB Disk: 500GB SSD Samsung 840 * Benchmark pgbench -c 32 -j 4 -T 900 -M prepared scaling factor: 100 checkpoint_segments = 1024 checkpoint_timeout = 5min (every checkpoint during benchmark were triggered by checkpoint_timeout) * Result [tps] 1386.8 (compress_backup_block = off) 1627.7 (compress_backup_block = on) [the amount of WAL generated during running pgbench] 4302 MB (compress_backup_block = off) 1521 MB (compress_backup_block = on) This is really nice data. I think if you want, you can once try with one of the tests Heikki has posted for one of my other patch which is here: http://www.postgresql.org/message-id/51366323.8070...@vmware.com Also if possible, for with lesser clients (1,2,4) and may be with more frequency of checkpoint. This is just to show benefits of this idea with other kind of workload. Yep, I will do more tests. I think we can do these tests later as well, I had mentioned because sometime back (probably 6 months), one of my colleagues have tried exactly the same idea of using compression method (LZ and few others) for FPW, but it turned out that even though the WAL size is reduced but performance went down which is not the case in the data you have shown even though you have used SSD, might be he has done some mistake as he was not as experienced, but I think still it's good to check on various workloads. I'd appreciate if you test the patch with HDD. Now I have no machine with HDD. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Behaviour of take over the synchronous replication
On Wed, Aug 28, 2013 at 10:59 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Aug 27, 2013 at 4:51 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Sun, Aug 25, 2013 at 3:21 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Sat, Aug 24, 2013 at 2:46 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Sat, Aug 24, 2013 at 3:14 AM, Josh Berkus j...@agliodbs.com wrote: On 08/23/2013 12:42 AM, Sawada Masahiko wrote: in case (a), those priority is clear. So I think that re-taking over is correct behaviour. OHOT, in case (b), even if AAA and BBB are set same priority, AAA server steals SYNC replication. I think it is better that BBB server continue behaviour SYNC standby, and AAA should become potential server. So, you're saying that: 1) synchronous_standby_names = '*' 2) replica 'BBB' is the current sync standby 3) replica 'AAA' comes online 4) replica 'AAA' grabs sync status ? I'm sorry that you are confuse. It means that 1) synchronous_standby_names = '*' 2) replica 'AAA' is the current sync standby 3) replica 'BBB' is the current async standby (potential sync standby) 4) replica 'AAA' fail. after that, replica 'BBB' is current sync standby. 5) replica 'AAA' comes online 6) replica 'AAA' grabs sync status If that's the case, I'm not really sure that's undesirable behavior. One could argue fairly persuasively that if you care about the precendence order of sync replicas, you shouldn't use '*'. And the rule of if using *, the lowest-sorted replica name has sync is actually a predictable, easy-to-understand rule. So if you want to make this a feature request, you'll need to come up with an argument as to why the current behavior is bad. Otherwise, you're just asking us to document it better (which is a good idea). It is not depend on name of standby server. That is, The standby server, which was connected to the master server during initial configration replication, is top priority even if priority of two server are same. What is happening here is that incase of '*' as priority of both are same, system will choose whichever comes in list of registered standby's first (list is maintained in structure WalSndCtl). Each standby is registered with WalSndCtl when a new WALSender is started in function InitWalSenderSlot(). As 'AAA' has been registered first it becomes preferred sync standby even if priorities of both are same. When 'AAA' goes down, it marks that Slot entry as free (by setting pid=0 in function WalSndKill), now when 'AAA' comes back again, it gets that free Slot entry and again becomes preferred sync standby. Now if we want to fix as you are suggesting which I don't think is necessary, we might need to change WalSndKill and some other place so that whenever any standby goes down, it changes slots for already registered standby's. User must remember that which standby server connected to master server at first. I think that this behavior confuse user. so I think that we need to modify this behaviour or if '*' is used, priority of server is not same (modifying manual is also good). Here user has done the settings (setting synchronous_standby_names = '*'), after which he will not have any control which standby will become sync standby, so ideally he should not complain. It might be case that for some users current behavior is good enough which means that with '*' whichever standby has become sync standby first, it will be the sync standby always if alive. I'm thinking that it is not necessary to change WalSndKill. For example, we add the value (e.g., sync_standby) which have that which wal sender is active SYNC rep. And if sync_standby is already set and it is active, server doesn't looking for active standby. Only if sync_standby is not set and it is inactive, server looking for that which server is active SYNC rep. If so, we also prevent to find active SYNC rep whenever SyncRepReleaseWaiters() is called. For '*' case, it will be okay, but when the user has given proper names, in that case even if there is any active Sync Rep, it has to be changed based on priority. I think here how to provide a fix, so that behavior gets changed to what you describe is a second priority work, first is to show the value of use-case. Do you really know where people actually setup using '*' as configuration and if yes, are they annoyed with current behavior? I have thought about it, but could imagine a scenario where people will be using '*' in their production configurations, may be it will be useful in test labs. Thank you for your feedback. I have implemented the patch which change how to put priority on each walsender, based on I suggested. I added sync_standby value into WalCtl value. This value has that which walsender is active sync rep. This patch handle also for case that user has given proper names. Regards, --- Sawada Masahiko Sync_standby_priority_v1.patch
Re: [HACKERS] Compression of full-page-writes
On Fri, Aug 30, 2013 at 2:32 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/08/30 11:55), Fujii Masao wrote: * Benchmark pgbench -c 32 -j 4 -T 900 -M prepared scaling factor: 100 checkpoint_segments = 1024 checkpoint_timeout = 5min (every checkpoint during benchmark were triggered by checkpoint_timeout) Did you execute munual checkpoint before starting benchmark? Yes. We read only your message, it occuered three times checkpoint during benchmark. But if you did not executed manual checkpoint, it would be different. You had better clear this point for more transparent evaluation. What I executed was: - CHECKPOINT SELECT pg_current_xlog_location() pgbench -c 32 -j 4 -T 900 -M prepared -r -P 10 SELECT pg_current_xlog_location() SELECT pg_xlog_location_diff() -- calculate the diff of the above locations - I repeated this several times to eliminate the noise. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Thu, Aug 29, 2013 at 10:55 PM, Fujii Masao masao.fu...@gmail.com wrote: I suppose that the cost of the random I/O involved would probably dominate just as with compress_backup_block = off. That said, you've used an SSD here, so perhaps not. Oh, maybe my description was confusing. full_page_writes was enabled while running the benchmark even if compress_backup_block = off. I've not merged those two parameters yet. So even in compress_backup_block = off, random I/O would not be increased in recovery. I understood it that way. I just meant that it could be that the random I/O was so expensive that the additional cost of decompressing the FPIs looked insignificant in comparison. If that was the case, the increase in recovery time would be modest. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Aug 30, 2013 at 2:37 PM, Nikhil Sontakke nikkh...@gmail.com wrote: Hi Fujii-san, I must be missing something really trivial, but why not try to compress all types of WAL blocks and not just FPW? The size of non-FPW WAL is small, compared to that of FPW. I thought that compression of such a small WAL would not have big effect on the reduction of WAL size. Rather, compression of every WAL records might cause large performance overhead. Also, focusing on FPW makes the patch very simple. We can add the compression of other WAL later if we want. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 30.08.2013 05:55, Fujii Masao wrote: * Result [tps] 1386.8 (compress_backup_block = off) 1627.7 (compress_backup_block = on) It would be good to check how much of this effect comes from reducing the amount of data that needs to be CRC'd, because there has been some talk of replacing the current CRC-32 algorithm with something faster. See http://www.postgresql.org/message-id/20130829223004.gd4...@awork2.anarazel.de. It might even be beneficial to use one routine for full-page-writes, which are generally much larger than other WAL records, and another routine for smaller records. As long as they both produce the same CRC, of course. Speeding up the CRC calculation obviously won't help with the WAL volume per se, ie. you still generate the same amount of WAL that needs to be shipped in replication. But then again, if all you want to do is to reduce the volume, you could just compress the whole WAL stream. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Add pgbench option: CHECKPOINT before starting benchmark
Hi, I add checkpoint option to pgbench. pgbench is simple and useful benchmark for every user. However, result of benchmark greatly changes by some situations which are in executing checkpoint, number of dirty buffers in share_buffers, and so on. For such a problem, it is custom to carry out a checkpoint before starting benchmark. But it is a fact that the making of the script takes time, like under following script. psql -U postgres -d pgbench -p5432 -c CHECKPOINT pgbench -T 600 -c 12 -j4 -U postgres -d pgbench -p 5432 However, this script have a problem. This script execute CHECKPOINT - VACUUM - starting benchmark. If relpages have lot of dirty pages, VACUUM generate dirty buffers on shared_buffers, and it will cause bad heavily checkpoint. I think pgbench would be more easy and accuracy benchmark tools for everyone. So I set checkpoint before starting benchmark. This patch's output is here. - [mitsu-ko@localhost pgbench]$ ./pgbench starting vacuum...end. starting checkpoint...end. transaction type: TPC-B (sort of) scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 tps = 312.851958 (including connections establishing) tps = 364.524478 (excluding connections establishing) - It execute VACUUM - CHECKPOINT - starting benchmark. I think it is ideal setting for more accuracy benchmark. My patches option difinition is here. [mitsu-ko@localhost pgbench]$ ./pgbench --help ~ -N, --no-checkpoint do not run CHECKPOINT after initialization ~ In latest commited pgbench, -N is --skip-some-updates skip updates of pgbench_tellers and pgbench_branches. But I cannot understand why -N is this option, so I set this option -u, and -N is do not run CHECKPOINT option. What do you think? -- Mitsumasa KONDO NTT Open Source Software Center diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index ad8e272..523c278 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -347,6 +347,7 @@ usage(void) -i, --initialize invokes initialization mode\n -F, --fillfactor=NUM set fill factor\n -n, --no-vacuum do not run VACUUM after initialization\n + -N, --no-checkpoint do not run CHECKPOINT after initialization\n -q, --quiet quiet logging (one message each 5 seconds)\n -s, --scale=NUM scaling factor\n --foreign-keys create foreign key constraints between tables\n @@ -366,7 +367,8 @@ usage(void) protocol for submitting queries (default: simple)\n -n, --no-vacuum do not run VACUUM before tests\n - -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n + -N, --no-checkpoint do not run CHECKPOINT before tests\n + -u, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n -P, --progress=NUM show thread progress report every NUM seconds\n -r, --report-latencies report average latency per command\n -R, --rate=SPEC target rate in transactions per second\n @@ -1520,7 +1522,7 @@ disconnect_all(CState *state, int length) /* create tables and setup data */ static void -init(bool is_no_vacuum) +init(bool is_no_vacuum, bool is_no_checkpoint) { /* The scale factor at/beyond which 32bit integers are incapable of storing * 64bit values. @@ -1775,6 +1777,12 @@ init(bool is_no_vacuum) } } + /* checkpoint */ + if (!is_no_checkpoint) + { + fprintf(stderr, checkpoint...\n); + executeStatement(con, checkpoint); + } fprintf(stderr, done.\n); PQfinish(con); @@ -2213,6 +2221,7 @@ main(int argc, char **argv) {jobs, required_argument, NULL, 'j'}, {log, no_argument, NULL, 'l'}, {no-vacuum, no_argument, NULL, 'n'}, + {no-checkpoint, no_argument, NULL, 'N'}, {port, required_argument, NULL, 'p'}, {progress, required_argument, NULL, 'P'}, {protocol, required_argument, NULL, 'M'}, @@ -2220,7 +2229,7 @@ main(int argc, char **argv) {report-latencies, no_argument, NULL, 'r'}, {scale, required_argument, NULL, 's'}, {select-only, no_argument, NULL, 'S'}, - {skip-some-updates, no_argument, NULL, 'N'}, + {skip-some-updates, no_argument, NULL, 'u'}, {time, required_argument, NULL, 'T'}, {transactions, required_argument, NULL, 't'}, {username, required_argument, NULL, 'U'}, @@ -2240,6 +2249,7 @@ main(int argc, char **argv) int nclients = 1; /* default number of simulated clients */ int nthreads = 1; /* default number of threads */ int is_init_mode = 0; /* initialize mode? */ + int is_no_checkpoint = 0; /* no checkpoint at all before testing? */ int is_no_vacuum = 0; /* no vacuum at all before testing? */ int
Re: [HACKERS] Add pgbench option: CHECKPOINT before starting benchmark
On 30/08/13 19:54, KONDO Mitsumasa wrote: Hi, I add checkpoint option to pgbench. pgbench is simple and useful benchmark for every user. However, result of benchmark greatly changes by some situations which are in executing checkpoint, number of dirty buffers in share_buffers, and so on. For such a problem, it is custom to carry out a checkpoint before starting benchmark. But it is a fact that the making of the script takes time, like under following script. psql -U postgres -d pgbench -p5432 -c CHECKPOINT pgbench -T 600 -c 12 -j4 -U postgres -d pgbench -p 5432 However, this script have a problem. This script execute CHECKPOINT - VACUUM - starting benchmark. If relpages have lot of dirty pages, VACUUM generate dirty buffers on shared_buffers, and it will cause bad heavily checkpoint. I think pgbench would be more easy and accuracy benchmark tools for everyone. So I set checkpoint before starting benchmark. This patch's output is here. - [mitsu-ko@localhost pgbench]$ ./pgbench starting vacuum...end. starting checkpoint...end. transaction type: TPC-B (sort of) scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 tps = 312.851958 (including connections establishing) tps = 364.524478 (excluding connections establishing) - It execute VACUUM - CHECKPOINT - starting benchmark. I think it is ideal setting for more accuracy benchmark. My patches option difinition is here. [mitsu-ko@localhost pgbench]$ ./pgbench --help ~ -N, --no-checkpoint do not run CHECKPOINT after initialization ~ In latest commited pgbench, -N is --skip-some-updates skip updates of pgbench_tellers and pgbench_branches. But I cannot understand why -N is this option, so I set this option -u, and -N is do not run CHECKPOINT option. What do you think? +1 I have been using a script to add CHECKPOINT before pgbench runs for ages...adding the option to pgbench is a great idea (wish I had thought of it)! Regards Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
Le jeudi 29 août 2013 18:42:13 Stephen Frost a écrit : On Thursday, August 29, 2013, Andres Freund wrote: If you don't want your installation to use it, tell you ops people not to do so. They are superusers, they need to have the ability to follow some rules you make up internally. The OPs people are the ones that will be upset with this because the DBAs will be modifying configs which OPs rightfully claim as theirs. If they have a way to prevent it then perhaps it's not terrible but they'd also need to know to disable this new feature. As for ALTER DATABASE- I would be happier with encouraging use of that (or providing an ALTER CLUSTER) for those thing it makes sense and works for and removing those from being in postgresql.conf. I still feel things like listen_addresses shouldn't be changed through this. ALTER ROLE ALL may be good enougth to handle every GUC that we can also remove from postgresql.conf (I suppose all GUC needing only a reload, not a restart). It may needs some improvement to handle changing default for ALL and adding new role. The energy wasted in a good part of this massive 550+ messages thread is truly saddening. We all (c|sh)ould have spent that time making PG more awesome instead. Perhaps not understood by all, but keeping PG awesome involves more than adding every feature proposed- it also means saying no sometimes; to features, to new GUCs, even to micro-optimizations when they're overly complicated and offer only minimal or questionable improvements. Agreed, the current feature and proposal does not include pg_reload, and it introduces a full machinery we absolutely don't need. Grammar can be added later when the feature is stable. So far, we can achieve the goal by using adminpack, by using a file_fdw or a config_fdw. IMHO it is the job of a FDW to be able to handle atomic write or anything like that. I've commented one of the proposed patch adding some helpers to validate GUC change, I claimed this part was good enough to be added without ALTER SYSTEM (so a contrib can use it). -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Add pgbench option: CHECKPOINT before starting benchmark
My patches option difinition is here. [mitsu-ko@localhost pgbench]$ ./pgbench --help ~ -N, --no-checkpoint do not run CHECKPOINT after initialization ~ In latest commited pgbench, -N is --skip-some-updates skip updates of pgbench_tellers and pgbench_branches. But I cannot understand why -N is this option, so I set this option -u, and -N is do not run CHECKPOINT option. What do you think? Although I agree that the -N is not really meaningful, ISTM that changing option names is to be avoided if people are used to it. In this instance, I'm not sure that many people use this option, so maybe this is not an issue. The patch should also update the sgml documentation. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [9.3 doc fix] clarification of Solaris versions
On 29/08 21.17, MauMau wrote: Thanks. I belive PostgreSQL runs successfully on Solaris 10 and later, because the binaries are published on the community site: http://ftp.postgresql.org/pub/binary/v9.3beta2/solaris/ Sorry, I didn't notice this thread earlier. Yes, I am building those binaries and I have had no problems building PostgreSQL and running the regression tests on Solaris 11. It would be quite unusual for something to work on Solaris 10 but not 11, but of course it could happen. There may have been a case or two which I now can't remember. - Bjorn Munch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout logging
On 6 June 2013 17:28, Pavel Stehule pavel.steh...@gmail.com wrote: 2013/6/6 Thom Brown t...@linux.com: Hi, When a statement is cancelled due to it running for long enough for statement_timeout to take effect, it logs a message: ERROR: canceling statement due to statement timeout However, it doesn't log what the timeout was at the time of the cancellation. This may be set in postgresql.conf, the database, or on the role, but unless log_line_prefix is set to show the database name and the user name, there's no reliable way of finding out what context the configuration applied from. Setting log_duration won't help either because that only logs the duration of completed queries. Should we output the statement_timeout value when a query is cancelled? +1 we use same feature in GoodData. Our long queries are cancelled by users and we should to known how much a users would to wait. It seems there are a couple more errors that could share this sort of information too: canceling authentication due to timeout canceling statement due to lock timeout Admittedly the first of those two isn't really an issue. -- Thom
Re: [HACKERS] [v9.4] row level security
2013/8/29 Josh Berkus j...@agliodbs.com: Kaigai, Although I didn't touch this task by myself, my senior colleague said that we calculated some possible bandwidth of leakage as an evident when we ship supercomputer system with MAC feature for customer who worry about security. I'm not sure how to measure it. However, if we assume a human can run up to 5 query per seconds, he needs 2-3 seconds to identify a particular integer value less than 1, it means bandwidth of this covert channel is less than 5bps. I'm not sure whether enterprise-grade dbms has to care about these amount of covert channel. Why are you assuming a human needs to do it? Given the explain vector, I could write a rather simple python or perl script which would find values by EXPLAIN leakage, at 1000 explain plans per minute. It's one thing to day we can't solve this covert channel issue right now in this patch, but saying we don't plan to solve it at all is likely to doom the patch. I'm not sure what the solution would be, exactly. Deny permission for EXPLAIN on certain tables? Surely someone in the security community has discussed this? Security community considers covert channel is a hard to solve problem; nearly impossible to eliminate. Let's see the summary in wikipedia: http://en.wikipedia.org/wiki/Covert_channel It does not require countermeasure of covert channels in middle or entry class security evaluation; that is usually required for enterprise grade, even though it is required for the product being designed for military grade. The reason why its priority is relatively lower, is that degree of threats with information leakage via covert channel has limited bandwidth in comparison to main channel. I also follow this standpoint; that is enough reasonable between functionality and its strictness under limited resources. Even if we could close a certain channel, we never can all other channels, like a signal by namespace contention on table creation as covert channel. Also, I don't know major commercial dbms handles these scenario well. Of course, it should be described in the document for users not to apply these security features onto the region that overs our capability. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
2013/8/29 David Fetter da...@fetter.org: On Thu, Aug 29, 2013 at 10:05:14AM -0400, Tom Lane wrote: Alexander Korotkov aekorot...@gmail.com writes: On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: It is out of scope for this feature. We usually calls this type of information leakage covert channel; that is not avoidable in principle. I think there is another covert channel much more serious than constrains. You can gather information about hidden data by reading query plans. I'm not convinced by this argument that covert channels are out of scope. That would be a fine justification for, say, a thesis topic. However, what we're talking about here is a real-world feature that will be of no real-world use if it can't stand up against rather obvious attack techniques. I'm not interested in carrying the maintenance and runtime overhead of a feature that's only of academic value. Looking at the real-world perspective, what covert channels do our competitors in the space currently claim to do anything about? I'm not sure whether minor dbms that is designed for extreme secure environment already got certified. (If they have such functionality, they should take certification for promotion.) Oracle lists some of their certified products: http://www.oracle.com/technetwork/topics/security/security-evaluations-099357.html However, these are based on protection profile for basic robustness that is designed for environment where we don't care about covert channel. This would represent the bar we need to clear at least as far as documenting what we do (do the access constraint before anything else, e.g.) or why we don't do things (disabling EXPLAIN, e.g.). +1. I'd like to add description about this scenario. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
2013/8/29 Tom Lane t...@sss.pgh.pa.us: Josh Berkus j...@agliodbs.com writes: That would close only one covert channel. Others were already pointed out upthread, and I'll bet there are more ... Mind you, fundamentally this is no different from allowing INSERT permission on a table but denying SELECT, or denying SELECT on certain columns. In either case, covert channels for some data are available. Certainly. But INSERT's purpose in life is not to prevent people from inferring what data is in the table. What we have to ask here is whether a row level security feature that doesn't deal with these real-world attack techniques is worth having. I think, we should clearly note that row-level security feature does not have capability to control information leakage via covert channel but very limited bandwidth, even though it control information leakage and manipulation via main channel. It depends on user's environment and expectation. If they need rdbms with security feature for military grade, it is not recommendable. However, it is a recommended solution for regular enterprise grade environment. Anything depends on user's environment, threats and worth of values to be protected. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
Stephen Frost sfr...@snowman.net writes: The OPs people are the ones that will be upset with this because the DBAs will be modifying configs which OPs rightfully claim as theirs. If that's the problem you want to solve, there's no technical solution that will put you at ease. That's a people and trust problem. I don't think typical (or less typical) organisation should drive our technical choices too much, and I'm pretty confident they didn't in the past: pg_hba.conf is a file not because it's meant for this or that team but because it makes sense technically to manage the settings to allow using some resources *outside* of said resources. We currently have no way that I know of to disable ALTER ROLE SET and ALTER DATABASE SET effects, why do we need to provide that feature for ALTER SYSTEM SET so much? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
Hi, On 2013-08-30 10:20:48 +0200, Cédric Villemain wrote: The energy wasted in a good part of this massive 550+ messages thread is truly saddening. We all (c|sh)ould have spent that time making PG more awesome instead. Perhaps not understood by all, but keeping PG awesome involves more than adding every feature proposed- it also means saying no sometimes; to features, to new GUCs, even to micro-optimizations when they're overly complicated and offer only minimal or questionable improvements. Agreed, the current feature and proposal does not include pg_reload, and it introduces a full machinery we absolutely don't need. The complexity in the last version of the patch I looked at wasn't in the added grammar or pg_reload() (well, it didn't have that). It was the logic to read (from memory)/write the config file and validate the GUCs. That's needed even if you put it into some module. And it requires support from guc.c/guc-file.l Grammar can be added later when the feature is stable. Could you explain the advantages of this? It will require users to get used to different interfaces and we will end up maintaining both just about forever. And I don't see the grammar being that contentious? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote: Stephen Frost sfr...@snowman.net writes: The OPs people are the ones that will be upset with this because the DBAs will be modifying configs which OPs rightfully claim as theirs. If that's the problem you want to solve, there's no technical solution that will put you at ease. That's a people and trust problem. I really just don't buy that- I've already put forward suggestions for how to deal with it, but no one here seems to understand the distinction. Modifying listen_addresses through ALTER SYSTEM is akin to ISC/bind allowing changes to its listen_addresses equivilant through dynamic DNS updates. Would it be possible to implement? Sure. Does it make any sense? Certainly not. I don't think typical (or less typical) organisation should drive our technical choices too much, and I'm pretty confident they didn't in the past: pg_hba.conf is a file not because it's meant for this or that team but because it makes sense technically to manage the settings to allow using some resources *outside* of said resources. I'm all for moving pg_hba.conf into the database properly, actually. We already handle permissions and user access in the DB and that's one of the DBA's responsibilities. The same goes for pg_ident.conf. We currently have no way that I know of to disable ALTER ROLE SET and ALTER DATABASE SET effects, why do we need to provide that feature for ALTER SYSTEM SET so much? Because we've got crap mixed into postgresql.conf which are bootstrap configs needed to get the system started. Those things, in my view anyway, fall much more into the category of resources which should be managed outside the database than pg_hba.conf. Thanks, Stephen signature.asc Description: Digital signature
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Cédric Villemain (ced...@2ndquadrant.com) wrote: ALTER ROLE ALL may be good enougth to handle every GUC that we can also remove from postgresql.conf (I suppose all GUC needing only a reload, not a restart). It may needs some improvement to handle changing default for ALL and adding new role. Yes, one of the issues with the existing system is that you can't specify a default to be applied to new roles. Also, there are parameters which are not per-role yet which it probably makes sense to be in the database and we'd need a way to deal with that. Although, at the same time, considering it role based does make for a nice distinction. Unfortunately, those clammoring for this will argue that it wouldn't replace adminpack and that they couldn't use it to modify their /etc/network/interfaces file, which is the obvious next step to all of this. Grammar can be added later when the feature is stable. I tend to agree w/ Andres on this point- the grammar isn't really the contentious part. I think I see where you were going with this in that excluding the grammar makes it able to live as a module, but that's a different consideration. I've commented one of the proposed patch adding some helpers to validate GUC change, I claimed this part was good enough to be added without ALTER SYSTEM (so a contrib can use it). Perhaps.. Thanks, Stephen signature.asc Description: Digital signature
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Andres Freund (and...@2ndquadrant.com) wrote: On 2013-08-30 10:20:48 +0200, Cédric Villemain wrote: Grammar can be added later when the feature is stable. Could you explain the advantages of this? It will require users to get used to different interfaces and we will end up maintaining both just about forever. And I don't see the grammar being that contentious? The different interfaces issue already exists, to some extent, as adminpack does exist already and this is clearly a different interface from that. That said, as I mentioned just now elsewhere, I agree that the grammar itself (being rather simple anyway...) isn't the contentious part of this. Thanks, Stephen signature.asc Description: Digital signature
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On 2013-08-30 08:48:21 -0400, Stephen Frost wrote: * Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote: Stephen Frost sfr...@snowman.net writes: The OPs people are the ones that will be upset with this because the DBAs will be modifying configs which OPs rightfully claim as theirs. If that's the problem you want to solve, there's no technical solution that will put you at ease. That's a people and trust problem. I really just don't buy that- I've already put forward suggestions for how to deal with it, but no one here seems to understand the distinction. Modifying listen_addresses through ALTER SYSTEM is akin to ISC/bind allowing changes to its listen_addresses equivilant through dynamic DNS updates. Would it be possible to implement? Sure. Does it make any sense? Certainly not. I very much want to change stuff like wal_level, listen_addresses and shared_buffers via ALTER SYSTEM. Configuration variables like that (PGC_POSTMASTER stuff mostly) are the prime reason why you actually need to change postgresql.conf instead of changing per user/database settings. And you don't even need to do anything special to implement it. Because it's already there. We currently have no way that I know of to disable ALTER ROLE SET and ALTER DATABASE SET effects, why do we need to provide that feature for ALTER SYSTEM SET so much? Because we've got crap mixed into postgresql.conf which are bootstrap configs needed to get the system started. Those things, in my view anyway, fall much more into the category of resources which should be managed outside the database than pg_hba.conf. I think the problem with your position in this thread is that you want to overhaul the way our configuration works in a pretty radical way. Which is fair enough, there certainly are deficiencies. But it's not the topic of this thread. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On 2013-08-29 21:26:48 -0400, Stephen Frost wrote: Sure, you can construct a scenario where this matters. The ops guys have sudo postgres pg_ctl access but adminpack isn't installed and they have no other way to modify the configuration file. But that's just bizarre. And if that's really the environment you have, then you can install a loadable module that grabs ProcessUtility_hook and uses it to forbid ALTER SYSTEM on that machine. Hell, we can ship such a thing in contrib. Problem solved. But it's surely too obscure a combination of circumstances to justify disabling this by default. It's not the OPs guy that I'm worried about using ALTER SYSTEM- I don't expect them to have any clue about it or care about it, except where it can be used to modify things under /etc which they, rightfully, consider their domain. I think for the scenarios you describe it makes far, far much more sense to add the ability to easily monitor for two things: * on-disk configuration isn't the same as the currently loaded (not trivially possible yet) * Configuration variables only come from locations that are approved for in your scenario (Already possible, we might want to make it even easier) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Variadic aggregates vs. project policy
Pavel Stehule pavel.steh...@gmail.com writes: I was one who sent a bug report - this error is not too dangerous, but it is hidden, and difficult to find, if you don't know what can be happen. Same as bug with plpgsql and SQL identifier collisions. If you understand, then you can protect self well and simply. If not, then it is a magic error. So still I am thing so best solution is a) a warning when detect ORDER BY in variadic aggregates Such a warning would never be tolerated by users, because it would appear even when the query is perfectly correct. b) disallow ORDER BY in variadic aggregates in classic syntax, and enable it only in WITHIN GROUP syntax where is safe , And we're *not* inventing randomly different syntax for variadic aggregates. That ship sailed when we did it this way for regular functions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Thu, Aug 29, 2013 at 9:12 PM, Stephen Frost sfr...@snowman.net wrote: You will not find me argueing to allow that in normal operation, or most direct-catalog hacking. I'd be much happier if we had all the ALTER, etc, options necessary to prevent any need to ever touch the catalogs. Similairly, it'd be nice to have more permission options which reduce the need for anyone to be superuser. Sure, I agree. But you can't burden this patch with the job of conforming to what you and I think the project policy ought to be, but isn't. Right now, as things stand today, the superuser is one step short of God Almighty. The only way in which we even attempt to restrict the superuser is to try to keep her from hijacking the postgres login account. But we don't even try to do that very thoroughly, as has recently been pointed out on other threads; there are several plausible attack vectors for her to do just that. This patch fits right into that existing philosophy. If we take this patch, and I think we should, and later change our policy, then the permissions around this may require adjustment to fit the new policy. Fine! But that policy change is properly a separate discussion. Now, you can argue that people are more likely to render the database nonfunctional by changing GUC settings than they are to do it by corrupting the system catalogs, but I'm not sure I believe it. We can, after all, validate that any setting a user supplies is a valid value for that setting before writing it out to the configuration file. It might still make the system fail to start if - for example - it causes too many semaphores to be reserved, or something like that. But why should we think that such mistakes will be common? If anything, it sounds less error-prone to me than hand-editing the configuration file, where typing something like on_exit_error=maybe will make the server fail to start. We can eliminate that whole category of error, at least for people who choose to use the new tools. That category of error is much more likely to get caught by the sysadmin doing the restart after the update.. If you edit postgresql.conf manually today, you will have no warning of ANY sort of error you might make. With a feature like this, we can catch a large subset of things you might do wrong *before we even modify the file*. Yes, there will be some that slip through, but an imperfect mechanism for catching mistakes is better than no mechanism at all. ALTER SYSTEM doesn't provide any way to restart the DB to make sure it worked. That, in turn, encourages modification of the config parameters w/o a restart, which is a pretty bad idea, imv. vi doesn't provide any way to restart the DB to make sure it worked, either, and never will. However, ALTER SYSTEM could be extended in exactly that way: we could have ALTER SYSTEM RESTART. Of course some people are likely to argue that's a bad idea, so we'd have to have that discussion and think carefully about the issues, but there's nothing intrinsic that restricts us from giving ALTER SYSTEM any arbitrary set of capabilities we might want it to have. Even if we don't ever do that, we're no worse off than today. Well, with one exception. If we make it easier to modify the configuration file, then people are more likely to do it, and thus more likely to do it wrong. But you could use that argument against any change that improves ease-of-use. And ease-of-use is a feature, not a bug. If you're using the term dangerous to refer to a security exposure, I concede there is some incremental exposure from allowing this by default. But it's not a terribly large additional exposure. It's already the case that if adminpack is available the super-user can do whatever he or she wants, because she or he can write to arbitrary files inside the data directory. This is only true if -contrib is installed, which a *lot* of admins refuse to do, specifically because of such insane modules as this. Therefore, I would argue that it's not nearly as small a change wrt exposure as you believe. Strictly speaking, I don't believe it's a huge security hole or something, but I do think it's going to surprise admins who aren't following the lists. If we take the patch, it will be in the release notes, just like any other feature. I suspect there will be more people *pleasantly* surprised than anything else. If there's anything I've learned about ease-of-use as a 10+-year veteran of PostgreSQL, it's that being able to do everything via a database connection is really, really, really convenient. That's why foreign data wrappers are awesome. Instead of installing a foreign data wrapper for Oracle and pointing it at an Oracle server and then sucking data down over the link to store or process or whatever, you put all that logic in a client which knows how to talk to both PostgreSQL and Oracle. It turns out, though, that having that capability *in the server* is really
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Andres Freund (and...@2ndquadrant.com) wrote: On 2013-08-30 08:48:21 -0400, Stephen Frost wrote: I really just don't buy that- I've already put forward suggestions for how to deal with it, but no one here seems to understand the distinction. Modifying listen_addresses through ALTER SYSTEM is akin to ISC/bind allowing changes to its listen_addresses equivilant through dynamic DNS updates. Would it be possible to implement? Sure. Does it make any sense? Certainly not. I very much want to change stuff like wal_level, listen_addresses and shared_buffers via ALTER SYSTEM. Configuration variables like that (PGC_POSTMASTER stuff mostly) are the prime reason why you actually need to change postgresql.conf instead of changing per user/database settings. wal_level and shared_buffers I can buy, but listen_addresses? The most typical change there is going from localhost - '*', but you've got to be on the box to do that. Anything else and you're going to need to be adding interfaces to the box anyway and hacking around in /etc/network/interfaces or what-have-you. Because we've got crap mixed into postgresql.conf which are bootstrap configs needed to get the system started. Those things, in my view anyway, fall much more into the category of resources which should be managed outside the database than pg_hba.conf. I think the problem with your position in this thread is that you want to overhaul the way our configuration works in a pretty radical way. Which is fair enough, there certainly are deficiencies. But it's not the topic of this thread. You and Robert both seem to be of the opinion that this hack which brings postgresql.conf into the database via ALTER SYSTEM is a-ok because it's moving us forward in someone's mind, but it really is developing a system configuration management system which *looks* like a first-class citizen when it actually falls woefully short of that. There is absolutely no question in my mind that this will be a huge support pain, from the first ALTER SYSTEM SET shared_buffers = blah; SHOW shared_buffers; to the why can't my database start?!? it's complaining it can't allocate memory but I keep changing postgresql.conf and nothing works! I'm simply not convinced that this is moving us forward nor that we will end up with more benefit than pain from it. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] WAL CPU overhead/optimization (was Master-slave visibility order)
On Fri, Aug 30, 2013 at 03:22:37AM +0300, Ants Aasma wrote: On Fri, Aug 30, 2013 at 3:02 AM, Andres Freund and...@2ndquadrant.com wrote: I am not sure hot cache large buffer performance is really the interesting case. Most of the XLogInsert()s are pretty small in the common workloads. I vaguely recall trying 8 and getting worse performance on many workloads, but that might have been a problem of my implementation. Slice-by-8 doesn't have any overhead for small buffers besides the lookup tables, so it most likely the cache misses that were the issue. Murmur3, CityHash and SpookyHash don't have any lookup tables and are excellent with small keys. Especially CityHash, 64 byte hash is quoted at 9ns. The reason I'd like to go for a faster CRC32 implementation as a first step is that it's easy. Easy to verify, easy to analyze, easy to backout. I personally don't have enough interest/time in the 9.4 cycle to purse conversion to a different algorithm (I find the idea of using different ones on 32/64bit pretty bad), but I obviously won't stop somebody else ;) I might give it a shot later this cycle as I have familiarized my self with the problem domain anyway. I understand the appeal of staying with what we have, but this would cap the speedup at 4x and has large caveats with the extra lookup tables. A 28x speedup might be worth the extra effort. Regards, Ants Aasma You may want to also check out xxhash with a BSD License and very fast 32-bit performance as well: http://fastcompression.blogspot.com/2012/04/selecting-checksum-algorithm.html http://code.google.com/p/xxhash/ FWIW I agree that a much faster function would be better for CPU overhead. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Andres Freund (and...@2ndquadrant.com) wrote: On 2013-08-29 21:26:48 -0400, Stephen Frost wrote: It's not the OPs guy that I'm worried about using ALTER SYSTEM- I don't expect them to have any clue about it or care about it, except where it can be used to modify things under /etc which they, rightfully, consider their domain. I think for the scenarios you describe it makes far, far much more sense to add the ability to easily monitor for two things: * on-disk configuration isn't the same as the currently loaded (not trivially possible yet) That would certainly help. I don't know that it needs to be technically 'trivial', but at least having check_postgres.pl or something which can alert on it would be an improvement. Clearly, that would be valuable today (assuming it doesn't already exist somewhere.. it might). * Configuration variables only come from locations that are approved for in your scenario (Already possible, we might want to make it even easier) That an interesting notion; do you have something specific in mind..? The easiest, imv anyway, would be that options set in postgresql.conf can't be overridden, but that gets us into the bootstrap problem that people seem to be concerned about. It would also be a change to how postgresql.conf is parsed today which some people would be annoyed by. Having some configuration option which says what can be modified by alter system doesn't strike me as a terribly good solution either. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] ENABLE/DISABLE CONSTRAINT NAME
wangs...@highgo.com.cn writes: In order to achieve enable/disable constraint nameï¼I made ââa few modifications to the code. First, someone used to build the constraints while building table. Then inserting data must follow a certain order. And people usually like to insert the data but not affected by foreign keys or check. Second, the check or the foreign key constraint will waste much time while inserting the data into the table. Due to the above reasonsï¼I realized this command. Uh ... why not just drop the constraint, and re-add it later if you want it again? This seems like adding a lot of mechanism (and possible bugs) for a rather marginal use-case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On 2013-08-30 09:19:42 -0400, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2013-08-30 08:48:21 -0400, Stephen Frost wrote: I really just don't buy that- I've already put forward suggestions for how to deal with it, but no one here seems to understand the distinction. Modifying listen_addresses through ALTER SYSTEM is akin to ISC/bind allowing changes to its listen_addresses equivilant through dynamic DNS updates. Would it be possible to implement? Sure. Does it make any sense? Certainly not. I very much want to change stuff like wal_level, listen_addresses and shared_buffers via ALTER SYSTEM. Configuration variables like that (PGC_POSTMASTER stuff mostly) are the prime reason why you actually need to change postgresql.conf instead of changing per user/database settings. wal_level and shared_buffers I can buy, but listen_addresses? The most typical change there is going from localhost - '*', but you've got to be on the box to do that. Anything else and you're going to need to be adding interfaces to the box anyway and hacking around in /etc/network/interfaces or what-have-you. Even if it requires to be on the box locally, I only need libpq for it. And it's not infrequent to allow additional, already configured interfaces. And even if not, what's the point of prohibiting listen_interfaces specifically? When the other very interesting variables have the same dangers? Doing this on a variable-by-variable basis will a) be a ridiculous amount of effort, b) confuse users which may not share our judgement of individual variables. Because we've got crap mixed into postgresql.conf which are bootstrap configs needed to get the system started. Those things, in my view anyway, fall much more into the category of resources which should be managed outside the database than pg_hba.conf. I think the problem with your position in this thread is that you want to overhaul the way our configuration works in a pretty radical way. Which is fair enough, there certainly are deficiencies. But it's not the topic of this thread. You and Robert both seem to be of the opinion that this hack which brings postgresql.conf into the database via ALTER SYSTEM is a-ok because it's moving us forward in someone's mind, but it really is developing a system configuration management system which *looks* like a first-class citizen when it actually falls woefully short of that. It's what plenty of people want and it doesn't hurt people who do not want it. Yes. I think that's a step forward. There is absolutely no question in my mind that this will be a huge support pain, from the first ALTER SYSTEM SET shared_buffers = blah; SHOW shared_buffers; to the why can't my database start?!? it's complaining it can't allocate memory but I keep changing postgresql.conf and nothing works! I'm simply not convinced that this is moving us forward nor that we will end up with more benefit than pain from it. That will not show the changed shared_buffers. And it (afair) will throw a WARNING that shared_buffers couldn't be adjusted at this point. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Thu, Aug 29, 2013 at 9:26 PM, Stephen Frost sfr...@snowman.net wrote: In the first place, modifying postgresql.conf and not immediately restarting the server to test your changes is probably the single most basic DBA error imaginable. You're not modifying postgresql.conf with ALTER SYSTEM, now are you? Admins are going to realize the need to restart or at least reload the service after updating such, but a DBA who has focused mainly on normalization, query optimization, etc, isn't going to have the same experience that a sysadmin has. I refuse to reject any feature on the basis of the possibility that people might use it without reading the documentation. Nearly anything anyone will ever propose is so idiot-proof that we can't conceive of a scenario in which someone shoots their foot off with it. The question is whether the usefulness of the feature exceeds, by a sufficient amount, the harm it's likely to do, and I think the answer is clearly yes. A DBA updating a GUC, something they are likely to do frequently day-in and day-out, isn't necessairly going to consider that it's a reboot-needing change. Indeed, it's not unreasonable to consider that we may, some day, actually be able to change or increase shared_buffers on the fly (or error in the trying); I'd think your work with the dynamic backends would actually make that likely to happen sooner rather than later. I wouldn't hold my breath. We have way too many separate, variable-length data structures in shared memory. Increasing shared_buffers means making the lwlock array bigger, making the buffer header array bigger, allocating more space for the buffer mapping hash, etc. I doubt we'd accept a design that involves each of those data structures being a separate shared memory segment, but if they're all in the same segment one after another, then you can't easily extend them on the fly. There are also a lot of problems around unmapping-and-remapping a segment. If the remap fails, it's going to be at least a FATAL, but if you had any shared memory state in the unmapped segment (like a buffer pin) that has to be unwound, you have to PANIC; there's no other way to fix it. My initial design for dynamic shared memory allowed for an unbounded number of dynamic shared memory segments by growing the control segment on the fly. However, this introduced PANIC hazards in corner cases that I couldn't get rid of, so now there's a fairly generous but fixed-at-startup-time limit on how many segments you can have. In practice I don't think this matters much, but it was a sobering reminder that the main shared memory segment, with all of its inflexibility, has important reliability properties that are hard to replicate in more dynamic scenarios. It's not the OPs guy that I'm worried about using ALTER SYSTEM- I don't expect them to have any clue about it or care about it, except where it can be used to modify things under /etc which they, rightfully, consider their domain. Under the currently-proposed design, it can't be used to do any such thing. It can only be used to modify some auto.conf file which lives in $PGDATA. It's therefore no different from the ops perspective than ALTER DATABASE or ALTER USER - except that it allows all settings to be changed rather than only a subset. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Variadic aggregates vs. project policy
Tom Lane-2 wrote Pavel Stehule lt; pavel.stehule@ gt; writes: I was one who sent a bug report - this error is not too dangerous, but it is hidden, and difficult to find, if you don't know what can be happen. Same as bug with plpgsql and SQL identifier collisions. If you understand, then you can protect self well and simply. If not, then it is a magic error. So still I am thing so best solution is a) a warning when detect ORDER BY in variadic aggregates Such a warning would never be tolerated by users, because it would appear even when the query is perfectly correct. b) disallow ORDER BY in variadic aggregates in classic syntax, and enable it only in WITHIN GROUP syntax where is safe , And we're *not* inventing randomly different syntax for variadic aggregates. That ship sailed when we did it this way for regular functions. In the example case the problem is that ORDER BY constant is a valid, if not-very-useful, construct. Can we warn on this specific usage and thus mitigate many of the potential avenues of mis-use? If we alter syntax for mitigation purposes I'd want to consider requiring parentheses around the columns that belong to the ORDER BY instead of using the full extended syntax of WITHIN GROUP. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Variadic-aggregates-vs-project-policy-tp5768980p5769106.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Fri, Aug 30, 2013 at 8:53 AM, Stephen Frost sfr...@snowman.net wrote: Yes, one of the issues with the existing system is that you can't specify a default to be applied to new roles. Also, there are parameters which are not per-role yet which it probably makes sense to be in the database and we'd need a way to deal with that. Although, at the same time, considering it role based does make for a nice distinction. Unfortunately, those clammoring for this will argue that it wouldn't replace adminpack and that they couldn't use it to modify their /etc/network/interfaces file, which is the obvious next step to all of this. This is a straw-man. adminpack doesn't allow reading or writing files outside of the configured data and log directories, as a security precaution. But suppose it did. If your permissions on /etc/network/interfaces allow the postgres user to modify it, then you pretty much deserve exactly what you get. Likening this to the feature being proposed is silly. What is being asked for here is the ability to easily modify system-wide settings from the SQL prompt, just as today you can modify settings per-user or per-database. That's not the same thing as rewriting the entire system configuration. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Andres Freund (and...@2ndquadrant.com) wrote: On 2013-08-30 09:19:42 -0400, Stephen Frost wrote: wal_level and shared_buffers I can buy, but listen_addresses? The most typical change there is going from localhost - '*', but you've got to be on the box to do that. Anything else and you're going to need to be adding interfaces to the box anyway and hacking around in /etc/network/interfaces or what-have-you. Even if it requires to be on the box locally, I only need libpq for it. And it's not infrequent to allow additional, already configured interfaces. And even if not, what's the point of prohibiting listen_interfaces specifically? When the other very interesting variables have the same dangers? I'd like to see those dangers removed from the other very interesting variables. We're making progress towards that, for example with shared_buffers. I've commented on that already up-thread. Hell, you could even make things such that PG would start w/ a misconfigured listen_addresses- but if we don't like that then I would argue that it shouldn't be included here. I'm not entirely sure how wal_level has the same danger as the others.. Doing this on a variable-by-variable basis will a) be a ridiculous amount of effort, b) confuse users which may not share our judgement of individual variables. I don't think the effort involved is nearly as much as you claim and we already have the issue that users don't like our choices around what can be reconfigured on the fly and what can't (perhaps they understand that there are technical challenges to some of them, but that doesn't make them agree with them). You and Robert both seem to be of the opinion that this hack which brings postgresql.conf into the database via ALTER SYSTEM is a-ok because it's moving us forward in someone's mind, but it really is developing a system configuration management system which *looks* like a first-class citizen when it actually falls woefully short of that. It's what plenty of people want and it doesn't hurt people who do not want it. Yes. I think that's a step forward. It will be quite interesting to see if people decide they really wanted this once they actually *get* it. There is absolutely no question in my mind that this will be a huge support pain, from the first ALTER SYSTEM SET shared_buffers = blah; SHOW shared_buffers; to the why can't my database start?!? it's complaining it can't allocate memory but I keep changing postgresql.conf and nothing works! I'm simply not convinced that this is moving us forward nor that we will end up with more benefit than pain from it. That will not show the changed shared_buffers. And it (afair) will throw a WARNING that shared_buffers couldn't be adjusted at this point. Not showing the change is what I was getting at. As has been said elsewhere, throwing a warning on every interesting invokation of a command can speak to it not being exactly 'ready'. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Add pgbench option: CHECKPOINT before starting benchmark
KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp writes: My patches option difinition is here. [mitsu-ko@localhost pgbench]$ ./pgbench --help ~ -N, --no-checkpoint do not run CHECKPOINT after initialization ~ In latest commited pgbench, -N is --skip-some-updates skip updates of pgbench_tellers and pgbench_branches. But I cannot understand why -N is this option, so I set this option -u, and -N is do not run CHECKPOINT option. What do you think? I think it's a bad idea to change the meaning of a pre-existing option letter, and a worse idea to change pgbench's default behavior. Instead, have the option be --checkpoint do a CHECKPOINT after initialization. It doesn't look like there's any free single-letter option that goes nicely with checkpoint either, but that's life. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On 2013-08-30 09:43:01 -0400, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2013-08-30 09:19:42 -0400, Stephen Frost wrote: wal_level and shared_buffers I can buy, but listen_addresses? The most typical change there is going from localhost - '*', but you've got to be on the box to do that. Anything else and you're going to need to be adding interfaces to the box anyway and hacking around in /etc/network/interfaces or what-have-you. Even if it requires to be on the box locally, I only need libpq for it. And it's not infrequent to allow additional, already configured interfaces. And even if not, what's the point of prohibiting listen_interfaces specifically? When the other very interesting variables have the same dangers? I'd like to see those dangers removed from the other very interesting variables. We're making progress towards that, for example with shared_buffers. I've commented on that already up-thread. Hell, you could even make things such that PG would start w/ a misconfigured listen_addresses- but if we don't like that then I would argue that it shouldn't be included here. Yes, we're making progress. But that's an independent thing, partially constrained by implementation complexity, partially constrained by cross platform worries. You're free to work on reducing the dangers around other variables, but I don't understand in the very least why that should stop this feature. I'm not entirely sure how wal_level has the same danger as the others.. Try setting max_wal_senders = 10 and wal_level = minimal. There is absolutely no question in my mind that this will be a huge support pain, from the first ALTER SYSTEM SET shared_buffers = blah; SHOW shared_buffers; to the why can't my database start?!? it's complaining it can't allocate memory but I keep changing postgresql.conf and nothing works! I'm simply not convinced that this is moving us forward nor that we will end up with more benefit than pain from it. That will not show the changed shared_buffers. And it (afair) will throw a WARNING that shared_buffers couldn't be adjusted at this point. Not showing the change is what I was getting at. As has been said elsewhere, throwing a warning on every interesting invokation of a command can speak to it not being exactly 'ready'. Then pg isn't ready already, because it has done that for a long time. Set some PGC_POSTMASTER variable and reload the configuration. Guiding a user that to perform required further action doesn't imply nonreadyness in the least. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Robert Haas (robertmh...@gmail.com) wrote: there's a fairly generous but fixed-at-startup-time limit on how many segments you can have. In practice I don't think this matters much, but it was a sobering reminder that the main shared memory segment, with all of its inflexibility, has important reliability properties that are hard to replicate in more dynamic scenarios. Why wouldn't it be possible to have the same arrangment for shared_buffers, where you have more entires than you 'need' at startup but which then allows you to add more shared segments later? I do see that we would need an additional bit of indirection to handle that, which might be too much overhead, but the concept seems possible. Isn't that more-or-less how the kernel handles dynamic memory..? Under the currently-proposed design, it can't be used to do any such thing. It can only be used to modify some auto.conf file which lives in $PGDATA. It's therefore no different from the ops perspective than ALTER DATABASE or ALTER USER - except that it allows all settings to be changed rather than only a subset. Claiming that modifying a file *included from a file in /etc* doesn't modify things under /etc is disingenuous, imv. Thanks, Stephen signature.asc Description: Digital signature
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Fri, Aug 30, 2013 at 9:19 AM, Stephen Frost sfr...@snowman.net wrote: You and Robert both seem to be of the opinion that this hack which brings postgresql.conf into the database via ALTER SYSTEM is a-ok because it's moving us forward in someone's mind, but it really is developing a system configuration management system which *looks* like a first-class citizen when it actually falls woefully short of that. It's not a system configuration management system, and it doesn't pretend to be. It's an analogue of ALTER USER and ALTER DATABASE that papers over their shortcomings, and a safer alternative to adminpack's kludgy way of enabling the same functionality. There is absolutely no question in my mind that this will be a huge support pain, from the first ALTER SYSTEM SET shared_buffers = blah; SHOW shared_buffers; They'd have the same problem with ALTER USER SET work_mem = blah. You're acting like ALTER commands that don't take immediate effect are something brand-new, but we've had them for years. to the why can't my database start?!? it's complaining it can't allocate memory but I keep changing postgresql.conf and nothing works! As of 9.3, that failure mode doesn't really happen any more, unless you maybe set shared_buffers equal to 100% of system memory. I'm simply not convinced that this is moving us forward nor that we will end up with more benefit than pain from it. Fair enough, but I'm not convinced that we'll derive any pain at all from it. The existing similar features haven't been a notable source of complaints AFAIK. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Variadic aggregates vs. project policy
On 2013-08-30 06:34:47 -0700, David Johnston wrote: Tom Lane-2 wrote I was one who sent a bug report - this error is not too dangerous, but it is hidden, and difficult to find, if you don't know what can be happen. Same as bug with plpgsql and SQL identifier collisions. If you understand, then you can protect self well and simply. If not, then it is a magic error. So still I am thing so best solution is a) a warning when detect ORDER BY in variadic aggregates Such a warning would never be tolerated by users, because it would appear even when the query is perfectly correct. b) disallow ORDER BY in variadic aggregates in classic syntax, and enable it only in WITHIN GROUP syntax where is safe , And we're *not* inventing randomly different syntax for variadic aggregates. That ship sailed when we did it this way for regular functions. In the example case the problem is that ORDER BY constant is a valid, if not-very-useful, construct. Can we warn on this specific usage and thus mitigate many of the potential avenues of mis-use? That doesn't help against something like »SELECT string_agg(somecol ORDER BY bar, separator)« where separator is a column. If we alter syntax for mitigation purposes I'd want to consider requiring parentheses around the columns that belong to the ORDER BY instead of using the full extended syntax of WITHIN GROUP. I think that ship has sailed. The syntax is there and it's not going away. Requiring different syntaxes for variadic/nonvariadic usages is going to be a way much bigger pitfall for users. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Andres Freund (and...@2ndquadrant.com) wrote: Technically trivial in the sense that it should be queryable from SQL without having to write code in an untrusted PL ;). hah. I guess storing the file modification date along the file/location a GUC is originating from would be good enough. Then you could write a query using pg_stat_file() to make sure they are up2date. Wouldn't you want to actually look at the GUC and see if the value is different also? Just knowing that postgresql.conf changed doesn't mean you want every value to say it's different... It's entirely possible that the file was rewritten or touch'd but the configuration is identical. * Configuration variables only come from locations that are approved for in your scenario (Already possible, we might want to make it even easier) That an interesting notion; do you have something specific in mind..? The easiest, imv anyway, would be that options set in postgresql.conf can't be overridden, but that gets us into the bootstrap problem that people seem to be concerned about. It would also be a change to how postgresql.conf is parsed today which some people would be annoyed by. Having some configuration option which says what can be modified by alter system doesn't strike me as a terribly good solution either. I think changing the precedence of options in postgresql.conf has about zero chance. Sadly, you're probably right. That would make it possible to easily write a query that works across intallation that warns about any values stored in auto.conf, even if they are overwritten by a per-user config or similar. I had the impression that 'approved for' above meant something which actually *enforced* it rather than just another monitoring check.. Thanks, Stephen signature.asc Description: Digital signature
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Fri, Aug 30, 2013 at 9:49 AM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: there's a fairly generous but fixed-at-startup-time limit on how many segments you can have. In practice I don't think this matters much, but it was a sobering reminder that the main shared memory segment, with all of its inflexibility, has important reliability properties that are hard to replicate in more dynamic scenarios. Why wouldn't it be possible to have the same arrangment for shared_buffers, where you have more entires than you 'need' at startup but which then allows you to add more shared segments later? I do see that we would need an additional bit of indirection to handle that, which might be too much overhead, but the concept seems possible. Isn't that more-or-less how the kernel handles dynamic memory..? Yeah, I think that something like would be possible, but the synchronization would be tricky, and it wouldn't work at all with System V shared memory or Windows shared memory, which apparently can't be resized after creation. Also, it would rely on the system administrator having a sensible setting for max_on_the_fly_shared_buffers and only having the wrong setting for initial_shared_buffers_until_i_change_it_later, which might not cover a very high percentage of the cases that arise in practice. Under the currently-proposed design, it can't be used to do any such thing. It can only be used to modify some auto.conf file which lives in $PGDATA. It's therefore no different from the ops perspective than ALTER DATABASE or ALTER USER - except that it allows all settings to be changed rather than only a subset. Claiming that modifying a file *included from a file in /etc* doesn't modify things under /etc is disingenuous, imv. I think you're getting way too hung up on the fact that the proposed auto.conf will be stored as a flat file. From your comments upthread, I gather that you'd be rejoicing if it were a table. The only reason we're not doing that is because of the possibility that something in auto.conf might keep the server from starting. I don't think that's gonna be very common now that we're mostly out from under the System V shared memory limits, but we'll see. As you point out, it might also be worth fixing: maybe we can find some way to arrange things so that the server will always be able to start up regardless of how badly messed-up auto.conf is... but that's a different patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Robert Haas (robertmh...@gmail.com) wrote: On Fri, Aug 30, 2013 at 8:53 AM, Stephen Frost sfr...@snowman.net wrote: Yes, one of the issues with the existing system is that you can't specify a default to be applied to new roles. Also, there are parameters which are not per-role yet which it probably makes sense to be in the database and we'd need a way to deal with that. Although, at the same time, considering it role based does make for a nice distinction. Unfortunately, those clammoring for this will argue that it wouldn't replace adminpack and that they couldn't use it to modify their /etc/network/interfaces file, which is the obvious next step to all of this. This is a straw-man. It was intended to be and I wasn't particularly expecting a response to the sarcasm, at the same time.. Likening this to the feature being proposed is silly. What is being asked for here is the ability to easily modify system-wide settings from the SQL prompt, just as today you can modify settings per-user or per-database. /etc/network/interfaces would be considered part of the system by some.. :) PG isn't an OS, yet.. That's not the same thing as rewriting the entire system configuration. And here you're using the *other* meaning of system, no? Glad there won't be any confusion here! ;) (yes, I'm kidding..). Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Variadic aggregates vs. project policy
David Johnston pol...@yahoo.com writes: If we alter syntax for mitigation purposes I'd want to consider requiring parentheses around the columns that belong to the ORDER BY instead of using the full extended syntax of WITHIN GROUP. Unfortunately, that ORDER BY syntax is specified by the SQL standard, and they didn't say anything about parentheses. We don't get to require parens there. The particular case that's standardized is only array_agg(): array aggregate function ::= ARRAY_AGG left paren value expression [ ORDER BY sort specification list ] right paren but, as we customarily do, we didn't restrict the feature to be used only with that aggregate. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Variadic aggregates vs. project policy
Andres Freund-3 wrote On 2013-08-30 06:34:47 -0700, David Johnston wrote: Tom Lane-2 wrote I was one who sent a bug report - this error is not too dangerous, but it is hidden, and difficult to find, if you don't know what can be happen. Same as bug with plpgsql and SQL identifier collisions. If you understand, then you can protect self well and simply. If not, then it is a magic error. So still I am thing so best solution is a) a warning when detect ORDER BY in variadic aggregates Such a warning would never be tolerated by users, because it would appear even when the query is perfectly correct. b) disallow ORDER BY in variadic aggregates in classic syntax, and enable it only in WITHIN GROUP syntax where is safe , And we're *not* inventing randomly different syntax for variadic aggregates. That ship sailed when we did it this way for regular functions. In the example case the problem is that ORDER BY constant is a valid, if not-very-useful, construct. Can we warn on this specific usage and thus mitigate many of the potential avenues of mis-use? That doesn't help against something like »SELECT string_agg(somecol ORDER BY bar, separator)« where separator is a column. If we alter syntax for mitigation purposes I'd want to consider requiring parentheses around the columns that belong to the ORDER BY instead of using the full extended syntax of WITHIN GROUP. I think that ship has sailed. The syntax is there and it's not going away. Requiring different syntaxes for variadic/nonvariadic usages is going to be a way much bigger pitfall for users. Neither suggestion (nor any suggestion I would imagine) is going to solve the problem. The goal is to minimize the size of the exposure. For the second ORDER BY (col1, col2) suggestion it would be added and recommended so those using that syntax would have less to worry about. This would apply to ALL invocations, not just variadic. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Variadic-aggregates-vs-project-policy-tp5768980p5769119.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
Tom, all, I'm not one to give up a fight (I hope that's something ya'll like about me ;), but in this case I'm gonna have to concede. Clearly, I'm in the minority about this, at least on the lists and among the active hackers. Let me just say that I hope all the happy users of this will outweigh the complaints. I suppose there's only one way to find out. Thanks! Stephen * Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: * Robert Haas (robertmh...@gmail.com) wrote: I think you're getting way too hung up on the fact that the proposed auto.conf will be stored as a flat file. From your comments upthread, I gather that you'd be rejoicing if it were a table. I'd be happy if it was a table which managed an *independent* set of parameters from those used to bootstrap the system, but no one seems to like breaking up the options between things that can be sanely modified without other OS changes and things which require OS support. I agree with Robert's comments upthread that if the new facility can't do everything that can be done today by editing postgresql.conf, it's not going to be adequate. So I'm not in favor of having two sets of parameters. It's also not clear to me that we can make a reliable distinction between parameters that can prevent a server restart vs those that can't; or at least, the set of the latter will be much smaller than one could wish. regards, tom lane signature.asc Description: Digital signature
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Robert Haas (robertmh...@gmail.com) wrote: I think you're getting way too hung up on the fact that the proposed auto.conf will be stored as a flat file. From your comments upthread, I gather that you'd be rejoicing if it were a table. I'd be happy if it was a table which managed an *independent* set of parameters from those used to bootstrap the system, but no one seems to like breaking up the options between things that can be sanely modified without other OS changes and things which require OS support. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Variadic aggregates vs. project policy
2013/8/30 David Johnston pol...@yahoo.com Andres Freund-3 wrote On 2013-08-30 06:34:47 -0700, David Johnston wrote: Tom Lane-2 wrote I was one who sent a bug report - this error is not too dangerous, but it is hidden, and difficult to find, if you don't know what can be happen. Same as bug with plpgsql and SQL identifier collisions. If you understand, then you can protect self well and simply. If not, then it is a magic error. So still I am thing so best solution is a) a warning when detect ORDER BY in variadic aggregates Such a warning would never be tolerated by users, because it would appear even when the query is perfectly correct. b) disallow ORDER BY in variadic aggregates in classic syntax, and enable it only in WITHIN GROUP syntax where is safe , And we're *not* inventing randomly different syntax for variadic aggregates. That ship sailed when we did it this way for regular functions. In the example case the problem is that ORDER BY constant is a valid, if not-very-useful, construct. Can we warn on this specific usage and thus mitigate many of the potential avenues of mis-use? That doesn't help against something like »SELECT string_agg(somecol ORDER BY bar, separator)« where separator is a column. If we alter syntax for mitigation purposes I'd want to consider requiring parentheses around the columns that belong to the ORDER BY instead of using the full extended syntax of WITHIN GROUP. I think that ship has sailed. The syntax is there and it's not going away. Requiring different syntaxes for variadic/nonvariadic usages is going to be a way much bigger pitfall for users. Neither suggestion (nor any suggestion I would imagine) is going to solve the problem. The goal is to minimize the size of the exposure. For the second ORDER BY (col1, col2) suggestion it would be added and recommended so those using that syntax would have less to worry about. This would apply to ALL invocations, not just variadic. you cannot use this syntax - probably, because (col1, col2) is same as ROW(col1, col2) and this syntax is supported now. So there is a collision. I had a same idea, but I don't think so it is possible Regards Pavel David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Variadic-aggregates-vs-project-policy-tp5768980p5769119.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
Stephen Frost sfr...@snowman.net writes: * Robert Haas (robertmh...@gmail.com) wrote: I think you're getting way too hung up on the fact that the proposed auto.conf will be stored as a flat file. From your comments upthread, I gather that you'd be rejoicing if it were a table. I'd be happy if it was a table which managed an *independent* set of parameters from those used to bootstrap the system, but no one seems to like breaking up the options between things that can be sanely modified without other OS changes and things which require OS support. I agree with Robert's comments upthread that if the new facility can't do everything that can be done today by editing postgresql.conf, it's not going to be adequate. So I'm not in favor of having two sets of parameters. It's also not clear to me that we can make a reliable distinction between parameters that can prevent a server restart vs those that can't; or at least, the set of the latter will be much smaller than one could wish. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Add database to PGXACT / per database vacuuming
Hi, For the logical decoding patch I added support for pegging RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing undue bloat cpu overhead (hot pruning is friggin expensive) I split RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where the latter is the the xmin horizon used for non-shared, non-catalog tables. That removed almost all overhead I could measure. During that I was tinkering with the idea of reusing that split to vacuum/prune user tables in a per db fashion. In a very quick and hacky test that sped up the aggregate performance of concurrent pgbenches in different databases by about 30%. So, somewhat worthwile ;). The problem with that is that GetSnapshotData, which computes RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC which contains the database oid. This is a recently added optimization which made GetSnapshotData() quite a bit faster scalable which is important given the frequency it's called at. What about moving/copying the database oid from PGPROC to PGXACT? Currently a single PGXACT is 12 bytes which means we a) have several entries in a single cacheline b) have ugly sharing because we will have PGXACTs split over more than one cacheline. Comments? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical changeset generation v5
Hi, I've attached a couple of the preliminary patches to $subject which I've recently cleaned up in the hope that we can continue improving on those in a piecemal fashion. I am preparing submission of a newer version of the major patch but unfortunately progress on that is slower than I'd like... In the order of chance of applying them individuall they are: 0005 wal_decoding: Log xl_running_xact's at a higher frequency than checkpoints are done * benefits hot standby startup 0003 wal_decoding: Allow walsender's to connect to a specific database * biggest problem is how to specify the connection we connect to. Currently with the patch walsender connects to a database if it's not named replication (via dbname). Perhaps it's better to invent a replication_dbname parameter? 0006 wal_decoding: copydir: move fsync_fname to fd.[c.h] and make it public * Pretty trivial and boring. 0007 wal_decoding: Add information about a tables primary key to struct RelationData * Could be used in the matview refresh code 0002 wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From 3442c3a4e44c5a64efbe651b745a6f86f69cfdab Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Mon, 19 Aug 2013 13:24:30 +0200 Subject: [PATCH 02/13] wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement This is useful to be able to represent a CommandId thats invalid. There was no such value before. This decreases the possible number of subtransactions by one which seems unproblematic. Its also not a problem for pg_upgrade because cmin/cmax are never looked at outside the context of their own transaction (spare timetravel access, but thats new anyway). --- src/backend/access/transam/xact.c | 4 ++-- src/include/c.h | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 31e868d..0591f3f 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -766,12 +766,12 @@ CommandCounterIncrement(void) if (currentCommandIdUsed) { currentCommandId += 1; - if (currentCommandId == FirstCommandId) /* check for overflow */ + if (currentCommandId == InvalidCommandId) { currentCommandId -= 1; ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), - errmsg(cannot have more than 2^32-1 commands in a transaction))); + errmsg(cannot have more than 2^32-2 commands in a transaction))); } currentCommandIdUsed = false; diff --git a/src/include/c.h b/src/include/c.h index 5961183..14bfdcd 100644 --- a/src/include/c.h +++ b/src/include/c.h @@ -368,6 +368,7 @@ typedef uint32 MultiXactOffset; typedef uint32 CommandId; #define FirstCommandId ((CommandId) 0) +#define InvalidCommandId (~(CommandId)0) /* * Array indexing support -- 1.8.3.251.g1462b67 From ac48fc2f5c5f0031494cfabb0bca46f0bbef47d2 Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Mon, 19 Aug 2013 13:24:30 +0200 Subject: [PATCH 03/13] wal_decoding: Allow walsender's to connect to a specific database Currently the decision whether to connect to a database or not is made by checking whether the passed dbname parameter is replication. Unfortunately this makes it impossible to connect a to a database named replication... Possibly it would be better to use a separate connection parameter like replication_dbname=xxx? This is useful for future walsender commands which need database interaction. --- doc/src/sgml/protocol.sgml | 5 +++- src/backend/postmaster/postmaster.c| 13 +-- .../libpqwalreceiver/libpqwalreceiver.c| 4 ++-- src/backend/replication/walsender.c| 27 ++ src/backend/utils/init/postinit.c | 5 src/bin/pg_basebackup/pg_basebackup.c | 4 ++-- src/bin/pg_basebackup/pg_receivexlog.c | 4 ++-- src/bin/pg_basebackup/receivelog.c | 4 ++-- 8 files changed, 51 insertions(+), 15 deletions(-) diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 0b2e60e..51b4435 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -1304,7 +1304,10 @@ To initiate streaming replication, the frontend sends the literalreplication/ parameter in the startup message. This tells the backend to go into walsender mode, wherein a small set of replication commands can be issued instead of SQL statements. Only the simple query protocol can be -used in walsender mode. +used in walsender mode. A literaldbname/ of literalreplication/ will +start a walsender not connected to any database, specifying any other database +will connect to
Re: [HACKERS] dynamic shared memory
Hi, On 2013-08-28 15:20:57 -0400, Robert Haas wrote: That way any corruption in that area will prevent restarts without reboot unless you use ipcrm, or such, right? The way I've designed it, no. If what we expect to be the control segment doesn't exist or doesn't conform to our expectations, we just assume that it's not really the control segment after all - e.g. someone rebooted, clearing all the segments, and then an unrelated process (malicious, perhaps, or just a completely different cluster) reused the same name. This is similar to what we do for the main shared memory segment. The case I am mostly wondering about is some process crashing and overwriting random memory. We need to be pretty sure that we'll never fail partially through cleaning up old segments because they are corrupted or because we died halfway through our last cleanup attempt. I think we want that during development, but I'd rather not go there when releasing. After all, we don't support a manual choice between anonymous mmap/sysv shmem either. That's true, but that decision has not been uncontroversial - e.g. the NetBSD guys don't like it, because they have a big performance difference between those two types of memory. We have to balance the possible harm of one more setting against the benefit of letting people do what they want without needing to recompile or modify code. But then, it made them fix the issue afaik :P In addition, I've included an implementation based on mmap of a plain file. As compared with a true shared memory implementation, this obviously has the disadvantage that the OS may be more likely to decide to write back dirty pages to disk, which could hurt performance. However, I believe it's worthy of inclusion all the same, because there are a variety of situations in which it might be more convenient than one of the other implementations. One is debugging. Hm. Not sure what's the advantage over a corefile here. You can look at it while the server's running. That's what debuggers are for. On MacOS X, for example, there seems to be no way to list POSIX shared memory segments, and no easy way to inspect the contents of either POSIX or System V shared memory segments. Shouldn't we ourselves know which segments are around? Sure, that's the point of the control segment. But listing a directory is a lot easier than figuring out what the current control segment contents are. But without a good amount of tooling - like in a debugger... - it's not very interesting to look at those files either way? The mere presence of a segment doesn't tell you much and the contents won't be easily readable. Another use case is working around an administrator-imposed or OS-imposed shared memory limit. If you're not allowed to allocate shared memory, but you are allowed to create files, then this implementation will let you use whatever facilities we build on top of dynamic shared memory anyway. I don't think we should try to work around limits like that. I do. There's probably someone, somewhere in the world who thinks that operating system shared memory limits are a good idea, but I have not met any such person. Let's drive users away from sysv shem is the only one I heard so far ;) I would never advocate deliberately trying to circumvent a carefully-considered OS-level policy decision about resource utilization, but I don't think that's the dynamic here. I think if we insist on predetermining the dynamic shared memory implementation based on the OS, we'll just be inconveniencing people needlessly, or flat-out making things not work. [...] But using file-backed memory will *suck* performancewise. Why should we ever want to offer that to a user? That's what I was arguing about primarily. If we're SURE that a Linux user will prefer posix to sysv or mmap or none in 100% of cases, and that a NetBSD user will always prefer sysv over mmap or none in 100% of cases, then, OK, sure, let's bake it in. But I'm not that sure. I think posix shmem will be preferred to sysv shmem if present, in just about any relevant case. I don't know of any system with lower limits on posix shmem than on sysv. I think this case is roughly similar to wal_sync_method: there really shouldn't be a performance or reliability difference between the ~6 ways of flushing a file to disk, but as it turns out, there is, so we have an option. Well, most of them actually give different guarantees, so it makes sense to have differing performance... Why do we want to expose something unreliable as preferred_address to the external interface? I haven't read the code yet, so I might be missing something here. I shared your opinion that preferred_address is never going to be reliable, although FWIW Noah thinks it can be made reliable with a large-enough hammer. I think we need to have the arguments for that on list then. Those are pretty damn fundamental
Re: [HACKERS] Add database to PGXACT / per database vacuuming
On 30.08.2013 19:01, Andres Freund wrote: For the logical decoding patch I added support for pegging RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing undue bloat cpu overhead (hot pruning is friggin expensive) I split RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where the latter is the the xmin horizon used for non-shared, non-catalog tables. That removed almost all overhead I could measure. During that I was tinkering with the idea of reusing that split to vacuum/prune user tables in a per db fashion. In a very quick and hacky test that sped up the aggregate performance of concurrent pgbenches in different databases by about 30%. So, somewhat worthwile ;). The problem with that is that GetSnapshotData, which computes RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC which contains the database oid. This is a recently added optimization which made GetSnapshotData() quite a bit faster scalable which is important given the frequency it's called at. Hmm, so you're creating a version of GetSnapshotData() that only takes into account backends in the same backend? What about moving/copying the database oid from PGPROC to PGXACT? Might be worthwhile. Currently a single PGXACT is 12 bytes which means we a) have several entries in a single cacheline b) have ugly sharing because we will have PGXACTs split over more than one cacheline. I can't get excited about either of these arguments, though. The reason for having separate PGXACT structs is that they are as small as possible, so that you can fit as many of them as possible in as few cache lines as possible. Whether one PGXACT crosses a cache line or not is not important, because when taking a snapshot, you scan through all of them. I don't know how big an impact adding the database oid would have, on the case that the PGPROC/PGXACT split was done in the first place. In the worst case it will make taking a snapshot 1/3 slower under contention. That needs to be tested. One idea is to have a separate PGXACT array for each database? Well, that might be difficult, but something similar, like group all PGXACTs for one database together, and keep a separate lookup array for where the entries for each database begins. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add database to PGXACT / per database vacuuming
On 30.08.2013 21:07, Heikki Linnakangas wrote: On 30.08.2013 19:01, Andres Freund wrote: For the logical decoding patch I added support for pegging RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing undue bloat cpu overhead (hot pruning is friggin expensive) I split RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where the latter is the the xmin horizon used for non-shared, non-catalog tables. That removed almost all overhead I could measure. During that I was tinkering with the idea of reusing that split to vacuum/prune user tables in a per db fashion. In a very quick and hacky test that sped up the aggregate performance of concurrent pgbenches in different databases by about 30%. So, somewhat worthwile ;). The problem with that is that GetSnapshotData, which computes RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC which contains the database oid. This is a recently added optimization which made GetSnapshotData() quite a bit faster scalable which is important given the frequency it's called at. Hmm, so you're creating a version of GetSnapshotData() that only takes into account backends in the same backend? I mean, only takes account backends in the same database? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
On 08/30/2013 03:05 AM, Kohei KaiGai wrote: Surely someone in the security community has discussed this? Security community considers covert channel is a hard to solve problem; nearly impossible to eliminate. Let's see the summary in wikipedia: http://en.wikipedia.org/wiki/Covert_channel Well, in each of the cases covered in that article, the given technology (OSI, TCP, etc.) takes specific provisions to limit the ability of attackers to discover information via the covert channel. However, we have yet to talk about taking any such provisions with Postgres. If we commit this patch, arguably we'll have a row-level security feature which only protects data from well-behaved users, which seems counterproductive. So, arguments in favor of this patch: a) it's as good as Oracle's security features, giving us check-box compliance. b) it allows securing individual rows against attackers with limited technical knowledge or limited database access, and could be very hardened in combination with intelligent access control. c) it is an improvement on techniques like Veil (is it)? d) we plan to continue improving it and closing covert channels, or limiting their bandwidth. Arguments against: m) covert channels are currently broad enough to make it trivially circumventable (are they?) n) overhead and code maintenance required is substantial So, determinative questions: 1) are the security mechanisms supplied by this patch superior in some way to approaches like Veil for multi-tenant applications? (this is a serious question, as multi-tenant applications are far less concerned about covert channels) 2) do we plan to reduce the accessibility of data via covert channels over successive releases? How? 3) will accepting this patch allow our users in the Government space to more freely adopt PostgreSQL? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add database to PGXACT / per database vacuuming
On 2013-08-30 21:07:04 +0300, Heikki Linnakangas wrote: On 30.08.2013 19:01, Andres Freund wrote: For the logical decoding patch I added support for pegging RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing undue bloat cpu overhead (hot pruning is friggin expensive) I split RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where the latter is the the xmin horizon used for non-shared, non-catalog tables. That removed almost all overhead I could measure. During that I was tinkering with the idea of reusing that split to vacuum/prune user tables in a per db fashion. In a very quick and hacky test that sped up the aggregate performance of concurrent pgbenches in different databases by about 30%. So, somewhat worthwile ;). The problem with that is that GetSnapshotData, which computes RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC which contains the database oid. This is a recently added optimization which made GetSnapshotData() quite a bit faster scalable which is important given the frequency it's called at. Hmm, so you're creating a version of GetSnapshotData() that only takes into account backends in the same backend? You can see what I did for logical decoding in http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blobdiff;f=src/backend/storage/ipc/procarray.c;h=11aa1f5a71196a61e31b711e0a044b2a5927a6cc;hp=9bf0989c9206b5e07053587f517d5e9a2322a628;hb=edcf0939072ebe68969560a7d54a26c123b279b4;hpb=ff4fa81665798642719c11c779d0518ef6611373 So, basically I compute the normal RecentGlobalXmin, and then just subtract the logical xmin which is computed elsewhere to get the catalog xmin. What I'd done with the prototype of $topic (lost it, but I am going to hack it together again) was just to compute RecentGlobalXmin (for non-catalog, non-shared tables) at the same time with RecentGlobalDataXmin (for everything else) by just not lowering RecentGlobalDataXmin if pgxact-dboid != MyDatabaseId. So, the snapshot itself was the same, but because RecentGlobalDataXmin is independent from the other databases vacuum pruning can cleanup way more leading to a smaller database and higher database. Currently a single PGXACT is 12 bytes which means we a) have several entries in a single cacheline b) have ugly sharing because we will have PGXACTs split over more than one cacheline. I can't get excited about either of these arguments, though. The reason for having separate PGXACT structs is that they are as small as possible, so that you can fit as many of them as possible in as few cache lines as possible. Whether one PGXACT crosses a cache line or not is not important, because when taking a snapshot, you scan through all of them. The problem with that is that we actually write to PGXACT pretty frequently (at least -xid, -xmin, -nxids, -delayChkpt). As soon as you factor that in, sharing cachelines between backends can hurt. Even a plain GetSnapshotData() will write to MyPgXact-xmin... I don't know how big an impact adding the database oid would have, on the case that the PGPROC/PGXACT split was done in the first place. In the worst case it will make taking a snapshot 1/3 slower under contention. That needs to be tested. Yes, definitely. I am basically wondering whether somebody has/sees fundamental probles with it making it pointless to investigate. One idea is to have a separate PGXACT array for each database? Well, that might be difficult, but something similar, like group all PGXACTs for one database together, and keep a separate lookup array for where the entries for each database begins. Given that we will have to search all PGXACT entries anyway because of shared relations for the forseeable future, I can't see that being really beneficial. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freezing without write I/O
Hi Heikki, On 2013-08-27 18:56:15 +0300, Heikki Linnakangas wrote: Here's an updated patch. The race conditions I mentioned above have been fixed. Thanks for posting the new version! I have a quick question: The reason I'd asked about the status of the patch was that I was thinking about the state of the forensic freezing patch. After a quick look at your proposal, we still need to freeze in some situations (old new data on the same page basically), so I'd say it still makes sense to apply the forensic freezing patch, right? Differing Opinions? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ENABLE/DISABLE CONSTRAINT NAME
Uh ... why not just drop the constraint, and re-add it later if you want it again? My 0.02€ : maybe because you must keep track of the constraint details to do so, this it is significantly more error prone than disable / enable when the bookkeeping is done by the system and if everything is in a transaction... If the ENABLE is automatically done on the next COMMIT, that would be even better. This seems like adding a lot of mechanism (and possible bugs) for a rather marginal use-case. That is possible! -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
Josh, * Josh Berkus (j...@agliodbs.com) wrote: On 08/30/2013 03:05 AM, Kohei KaiGai wrote: Security community considers covert channel is a hard to solve problem; nearly impossible to eliminate. While impossible to eliminate, we should certainly consider cases like this where we can do better and fix them. RLS certainly brings another level of consideration to the overall PG security environment by requiring we think about security on a row level rather than just a table or column level. We have issues with covert channels even without RLS though and holding up RLS because it doesn't fix all the covert channels isn't sensible. Column-level privleges have a similar problem, where you can read the default value for a column using the catalogs. Perhaps the default isn't sensetive (you'd certainly hope not), but it's still an issue. It wouldn't surprise me to find that there are ways to abuse a multi-column index which includes both a column you can manipulate and one you don't have access to read to determine something about the hidden column (maybe you have access to the 2nd field in the index and you can encourage an in-order index traversal and then look at filtered rows, or just work out a way to do timing attacks to determine the btree depth). Well, in each of the cases covered in that article, the given technology (OSI, TCP, etc.) takes specific provisions to limit the ability of attackers to discover information via the covert channel. The work we've done around secure views would lend credit to our attempts at taking specific provisions as well; sadly, PG is slightly more complicated than TCP. We do what we can and we've got a great community which will point out where we can do better- and we work on it and improve over time. Hell, when roles were first added we had a *massive* security hole because we didn't check to make sure we weren't overrunning the length of the GUC. It was a mistake and we should have done better, but that doesn't mean adding roles was the wrong decision. However, we have yet to talk about taking any such provisions with Postgres. If we commit this patch, arguably we'll have a row-level security feature which only protects data from well-behaved users, which seems counterproductive. I would argue both that we *have* been taking provisions to avoid obvious and big covert channels, and that this patch adds value even if it doesn't protect the system perfectly from malicious users. We're all certainly aware of the ability for an attacker to cause major problems to a PG system if they can issue arbitrary SQL and our permissions system doesn't do much to protect us. A single query which doesn't require any privileges could cause havok on the system (massive on-disk temp file, which could be shared with pg_xlog causing the system to PANIC, massive CPU load if they can execute multiple commands in parallel...). Not to mention the default installation of pl/pgsql and anonymous functions. I could see many a web app (things like LedgerSMB) which could benefit from having more fine-grained in-database control because they already authenticate to the database as the user and have a static or at least controlled set of queries which they run. Today, any of those kinds of systems have to implement their own RLS (though sometimes it's done through independent tables for each customer or similar, rather than as conditionals added to queries). a) it's as good as Oracle's security features, giving us check-box compliance. I'd argue that this is definitely much more than 'check-box' compliance. b) it allows securing individual rows against attackers with limited technical knowledge or limited database access, and could be very hardened in combination with intelligent access control. c) it is an improvement on techniques like Veil (is it)? d) we plan to continue improving it and closing covert channels, or limiting their bandwidth. Arguments against: m) covert channels are currently broad enough to make it trivially circumventable (are they?) There are some which are and likely deserve to be fixed. Do they all need to be addressed before this patch goes in? I'd argue 'no'. n) overhead and code maintenance required is substantial So, determinative questions: 1) are the security mechanisms supplied by this patch superior in some way to approaches like Veil for multi-tenant applications? (this is a serious question, as multi-tenant applications are far less concerned about covert channels) I'd argue 'yes' if just for the fact that it'd be simpler and easier to use, both because it's in core and because you're using an actual grammar instead of function calls- but this RLS does more than just that, it's going to cause us to improve things that Veil probably can't fix and simply ignores today. 2) do we plan to reduce the accessibility of data via covert channels over successive releases? How? By discovering them and fixing them as we
Re: [HACKERS] [v9.4] row level security
Stephen Frost sfr...@snowman.net writes: We have issues with covert channels even without RLS though and holding up RLS because it doesn't fix all the covert channels isn't sensible. I think it's entirely sensible to question whether we should reject (not hold up) RLS if it has major covert-channel problems. The scenario I'm worried about is where somebody says hey, Postgres has RLS now, I can rely on that to hide my sooper sekrit data from other users in the same database, and later they have a security breach through some covert-channel attack. Are they going to blame themselves? No, they're gonna blame Postgres. Or consider the case where some bozo publishes a method for such an attack and uses it to badmouth us as insecure. I don't think we need the headaches that will result from promising (or at least appearing to promise) something we can't deliver. Nor am I convinced that we're really doing users any favors by providing such a feature. They'd be *far* better advised to put their critical data in a separate database. In short, we can check some check-box is a really, really bad reason to accept a security-related feature. If we're going to put up with all the downsides of RLS, I want the end result to be something that's actually secure, not something that gives the illusion of security. And right now, I do not believe we can get past the illusion stage, ever (certainly not in a release or two). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
On 08/30/2013 12:43 PM, Tom Lane wrote: In short, we can check some check-box is a really, really bad reason to accept a security-related feature. If we're going to put up with all the downsides of RLS, I want the end result to be something that's actually secure, not something that gives the illusion of security. And right now, I do not believe we can get past the illusion stage, ever (certainly not in a release or two). Can you be more explicit about all the downsides of RLS? I was just looking over the patch, which is less than 5000 lines. While it's not small, we have larger patches in the CF. So what's the specific downsides of this? The reason I brought up multi-tenant applications (MTA), BTW, is that this is the other major potential utility of RLS, and for such users the covert channel limitations are acceptable (as long as we publish them). That is, for a thing-as-a-service application, users are not assumed to have unlimited access to the SQL command line anyway; RLS is employed just to limit the damage they can do if they get access, and to limit the disclosure if some app programmer makes a mistake. Right now, the primary tool for doing row filtering for MTA is Veil, which has numerous and well-known limitations. If RLS has fewer limitations, or is easier to deploy, maintain, and/or understand, then it's a valuable feature for that user base, even if it doesn't address the covert channels we've brought up at all. That is, if RLS is your *second* level of defense, instead of your primary defense, covert channels are not a make-or-break issue. It just has to be better than what we had before. Note that I have NOT done an evaluation of Veil vs. RLS for MTA at this point. I'm hoping someone else will ;-) -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
Josh Berkus j...@agliodbs.com writes: On 08/30/2013 12:43 PM, Tom Lane wrote: In short, we can check some check-box is a really, really bad reason to accept a security-related feature. If we're going to put up with all the downsides of RLS, I want the end result to be something that's actually secure, not something that gives the illusion of security. Can you be more explicit about all the downsides of RLS? I was just looking over the patch, which is less than 5000 lines. While it's not small, we have larger patches in the CF. So what's the specific downsides of this? I think it's going to be an ongoing maintenance headache and an endless source of security bugs, even disregarding covert-channel issues. I have pretty much zero faith in the planner changes, in particular, and would still not have a lot if they were adequately documented, which they absolutely are not. The whole thing depends on nowhere-clearly-stated assumptions that plan-time transformations will never allow an RLS check to be bypassed. I foresee future planner work breaking this in non-obvious ways on a regular basis (even granting the assumption that it's bulletproof today, which is at best unproven). The reason I brought up multi-tenant applications (MTA), BTW, is that this is the other major potential utility of RLS, and for such users the covert channel limitations are acceptable (as long as we publish them). [ shrug... ] You might've noticed I work for a multi-tenant shop now. I'm still not excited about this. That is, if RLS is your *second* level of defense, instead of your primary defense, covert channels are not a make-or-break issue. It just has to be better than what we had before. Yeah, that's a fair point. I'm just not convinced that it's enough better to justify the maintenance burden we'll be taking on. I'm not thrilled about the every bug is a security bug angle, either. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] What happens at BIND time? (pg_upgrade issue)
On 08/28/2013 11:44 AM, Josh Berkus wrote: Tom, Does the backend's memory usage climb, or hold steady? If the former, I'd bet on client failure to release resources, eg not closing the portals when done with them. A memory map from MemoryContextStats would help determine exactly what's leaking. FS cache usage increases through the test run, as you'd expect, but the amount of pinned memory actually remains pretty much constant -- and has the same usage in both 8.4 (where the BIND issue doesn't happen) and 9.3b2 (where it does). So, this just got a lot stranger. What we've been testing here is upgrading from 8.4 to 9.X via pg_upgrade, because that's what they have to do in production for time reasons. We recently confirmed that this issue affects 9.0 also. So I finally got results back from the test where we do dump/restore (to 9.3b2) instead of pg_upgrade ... and the BIND regression does not occur. So this now has something to do with pg_upgrade, not just BIND and Java. cc'd Bruce for that reason. We'll be rerunning these tests next week, comparing a 9.3 with the issue to a 9.3 without it under oprofile etc. Suggestions on what we should look for are welcome. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
* Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: We have issues with covert channels even without RLS though and holding up RLS because it doesn't fix all the covert channels isn't sensible. I think it's entirely sensible to question whether we should reject (not hold up) RLS if it has major covert-channel problems. Rejecting RLS because we've suddently realized that covert channels exist is foolishness. It's akin to rejecting the ability to add stored procedures because we don't protect prosrc from people who don't own or can't execute the function. The scenario I'm worried about is where somebody says hey, Postgres has RLS now, I can rely on that to hide my sooper sekrit data from other users in the same database, and later they have a security breach through some covert-channel attack. Are they going to blame themselves? No, they're gonna blame Postgres. Or consider the case where some bozo publishes a method for such an attack and uses it to badmouth us as insecure. In my experience, issues are discovered, patched, and released as security updates. Does adding RLS mean we might have more of those? Sure, as did column level privileges and as does *any* such increase in the granularity of what we can provide security-wise. I don't think we need the headaches that will result from promising (or at least appearing to promise) something we can't deliver. Nor am I convinced that we're really doing users any favors by providing such a feature. They'd be *far* better advised to put their critical data in a separate database. We've barely got cross-database queries with FDWs. The notion that adding such complexity into those as RLS, which each individual user will need to figure out how to do themselves and most will likely get far wrong and much worse than what we'd implement, is better for our users is just ridiculous. In short, we can check some check-box is a really, really bad reason to accept a security-related feature. If we're going to put up with all the downsides of RLS, I want the end result to be something that's actually secure, not something that gives the illusion of security. And right now, I do not believe we can get past the illusion stage, ever (certainly not in a release or two). I'm not argueing for this because it fulfills some check-box; the question about if it would help a given set of clients (ones which I no longer have any direct relationship with, as it turns out) adopt PG was asked and I answered it as best I could. I certainly think we need to get past the 'illusion' stage also. I'm certainly more optimistic about that than you are but I also understand it's not going to be perfect in the first release- but I do think it'll be better than the 'illusion' stage. It'll get there because we'll continue to discuss it, people will test it, etc; as one hopes happens with all new features, but this even more than others. Thanks, Stephen signature.asc Description: Digital signature
[HACKERS] Window functions can be created with defaults, but they don't work
I noticed this while poking at the variadic-aggregates issue: regression=# create function nth_value_def(anyelement, integer = 1) returns anyelement language internal window immutable strict as 'window_nth_value'; CREATE FUNCTION regression=# SELECT nth_value_def(ten) OVER (PARTITION BY four), ten, four FROM (SELECT * FROM tenk1 WHERE unique2 10 ORDER BY four, ten)s; The connection to the server was lost. Attempting reset: Failed. The reason this crashes is that the planner doesn't apply default-insertion to WindowFunc nodes, only to FuncExprs. We could make it do that, probably, but that seems to me like a feature addition. I think a more reasonable approach for back-patching is to fix function creation to disallow attaching defaults to window functions (or aggregates, for that matter, which would have the same issue if CREATE AGGREGATE had the syntax option to specify defaults). ProcedureCreate seems like an appropriate place, since it already contains a lot of sanity checks of this sort. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] INSERT...ON DUPLICATE KEY IGNORE
For many years now, MySQL has a feature called INSERT IGNORE [1]; SQLite has INSERT ON CONFLICT IGNORE [2]; SQL Server has an option called IGNORE_DUP_KEY and Oracle has a hint called IGNORE_ROW_ON_DUPKEY_INDEX (they acknowledge that it's a bit odd that a hint changes the semantics of a DML statement - of course, MERGE has always been able to do something approximately equivalent). Each of these features has their respective systems simply not insert certain tuples that will result in a duplicate key violation, while potentially proceeding with the insertion of other, non-violating tuples. The attached WIP patch implements this for Postgres, with a few notable differences: 1) The patch is only interested in unique index violations (specifically, violations of amcanunique access method unique indexes); it will not do anything with NULL constraint violations, as the MySQL feature does, for example. It also doesn't care about exclusion constraints, even though it's currently possible to defer exclusion constraint enforcement in a way that is analogous to how it's done for btree indexes. 2) The clause that I've proposed is, as you'll have already gathered, spelt slightly differently to any of these systems. I'm not particularly attached to how I've spelt it. I've spelt it this way so as to avoid suggesting that it's fully compatible with the MySQL feature. I don't think we want weird NULLness behavior, but I may be mistaken. If we want that additional behavior, we probably want it as an optional adjunct, or perhaps an entirely independent clause to what I've specified here. 3) RETURNING is expanded - RETURNING REJECTS * is now possible where that makes sense. It is possible for clients to interrogate the wire protocol to see the number of rows affected (this can be done with a PQcmdTuples() call by libpq clients, for example), and see how that compares with the number of tuples they tried to insert. In addition, a client can use RETURNING to see what rows were successfully inserted. Here is a simple example session that shows this usage: postgres=# create table tab(id int4 primary key, val text); CREATE TABLE postgres=# insert into tab(id, val) values(1, 'val'), (3, 'val'), (4, 'val'); INSERT 0 3 postgres=# insert into tab(id, val) values(1, 'nosert'), (2, 'nosert'), (3, 'nosert') on duplicate key ignore returning id; id 2 (1 row) INSERT 0 1 postgres=# select xmin, * from tab; xmin | id | val ++- 580843 | 1 | val 580843 | 3 | val 580843 | 4 | val 580844 | 2 | nosert (4 rows) If this all happened in two separate, concurrent sessions, the transaction with xid 580844 might well have blocked on the outcome of 580843 in the usual way (ultimately inserting or doing nothing as appropriate for each tuple), just as you'd expect. Note, however, that the xid of the noserter is only seen here for the tuple that it actually successfully inserted - in effect, the noserter session does not lock tuples that it did not originally insert. That's consistent with MySQL's INSERT IGNORE, for example. However, some people might prefer it if this did share lock rows that prevented tuples from being inserted, or if that was at least an option. That would be something quite a bit closer to a fully-fledged upsert - I imagine we'd have to lock the old tuple using EvalPlanQual, and if a submission did that correctly then I think we'd be well on our way to solving all the problems. RETURNING REJECTS is a way of getting the inverse of affected tuples of an ordinary INSERT...RETURNING statement - rather than returning rows successfully inserted only (i.e. tuples where either a BEFORE trigger returned NULL or we didn't proceed with insertion), we returning rows that were not successfully inserted only. This is only meaningful in the context of INSERT...ON DUPLICATE KEY IGNORE. Sure, clients could figure this out for themselves using vanilla RETURNING *, but I think that this addition is justified by the fact that it's generally expected that the number of rejects will be relatively small. People are going to want to use this feature in an ad-hoc manner within a psql session, and supporting this will particularly help there. It might be nice to find a way to indicate the exact constraint violated per row returned, but I preferred to wait and see if people thought it was a good idea generally. Use-cases = The use-cases for this feature are fairly obvious. It can be used as a way of merging tuples easily, if clients can live with the present limitations. In addition to that, the feature is may be of value to clients of the logical change-set generation infrastructure (such as the proposed BDR contrib module[3]), if and when that infrastructure is finally committed. Currently, the BDR prototype simply balks at INSERT conflicts. This is because the amount of subtransactions/xids generated to do anything else is considered prohibitive. If Postgres is ever going to solve the kinds of problems
Re: [HACKERS] Window functions can be created with defaults, but they don't work
On Fri, Aug 30, 2013 at 6:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: I noticed this while poking at the variadic-aggregates issue: regression=# create function nth_value_def(anyelement, integer = 1) returns anyelement language internal window immutable strict as 'window_nth_value'; CREATE FUNCTION regression=# SELECT nth_value_def(ten) OVER (PARTITION BY four), ten, four FROM (SELECT * FROM tenk1 WHERE unique2 10 ORDER BY four, ten)s; The connection to the server was lost. Attempting reset: Failed. The reason this crashes is that the planner doesn't apply default-insertion to WindowFunc nodes, only to FuncExprs. We could make it do that, probably, but that seems to me like a feature addition. I think a more reasonable approach for back-patching is to fix function creation to disallow attaching defaults to window functions (or aggregates, for that matter, which would have the same issue if CREATE AGGREGATE had the syntax option to specify defaults). ProcedureCreate seems like an appropriate place, since it already contains a lot of sanity checks of this sort. I'm not sure I agree. Under that approach, any functions that have already been created like that will still crash the server. A malicious user could create a function like this now and wait to crontab it until the day he's leaving the company. Or there are more accidental scenarios as well. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Thu, Aug 29, 2013 at 10:55 PM, Fujii Masao masao.fu...@gmail.com wrote: Attached patch adds new GUC parameter 'compress_backup_block'. I think this is a great idea. (This is not to disagree with any of the suggestions made on this thread for further investigation, all of which I think I basically agree with. But I just wanted to voice general support for the general idea, regardless of what specifically we end up with.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] operator precedence issues
Hackers, The operator precedence rules seem hard wired to not be able to be worked around, not matter what. The pain point for me has always been the division operator -- once in a while I end up in a situation where I want to override it so that it wraps the divisor with NULLIF. There is no way I can see to do that: custom operator (for example '//') names evaluate in different precedence order which is a non-starter essentially. That I'm ok with given the reasoning in the docs, but I'm really scratching my head over this rule (via http://www.postgresql.org/docs/9.3/static/sql-syntax-lexical.html#SQL-PRECEDENCE): When a schema-qualified operator name is used in the OPERATOR syntax, as for example in: SELECT 3 OPERATOR(pg_catalog.+) 4; the OPERATOR construct is taken to have the default precedence shown in Table 4-2 for any other operator. This is true no matter which specific operator appears inside OPERATOR(). That rule seems intentionally designed to make it impossible to to override mathematical behaviors. Mainly curious -- was that intentional? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Window functions can be created with defaults, but they don't work
Robert Haas robertmh...@gmail.com writes: On Fri, Aug 30, 2013 at 6:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: The reason this crashes is that the planner doesn't apply default-insertion to WindowFunc nodes, only to FuncExprs. I'm not sure I agree. Under that approach, any functions that have already been created like that will still crash the server. A malicious user could create a function like this now and wait to crontab it until the day he's leaving the company. Or there are more accidental scenarios as well. The crash is only possible because the underlying internal-language function doesn't sanity-check its input enough to catch the case of too few arguments. As such, it's not that different from hundreds of other cases where a superuser can cause a crash by misdeclaring the arguments to an internal-language function. So I don't find your argument compelling. I'd even say this was user error, except that it's not obvious that this case shouldn't work. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT...ON DUPLICATE KEY IGNORE
On 08/30/2013 03:09 PM, Peter Geoghegan wrote: The attached WIP patch implements this for Postgres, with a few notable differences: Thank you for addressing this. If nobody is going to hack out MERGE, we could really use at least this feature. 3) RETURNING is expanded - RETURNING REJECTS * is now possible where that makes sense. Oh, nifty! OK, now I can *really* use this feature. This patch is a spin-off from a broader effort to implement INSERT...ON DUPLICATE KEY UPDATE (upsert). During the 2012 developer Yeah, I was wondering when we'd get to that. Obviously there will be users clamoring for it ... Unlike some other systems like DB2, we have always allowed BEFORE ROW triggers to execute arbitrary SQL. I've frequently thought this was a bit of a wart (e.g. [10]), and certainly not supportive of sensible, idiomatic trigger use, but there isn't much we can do about it at this stage. Right now, BEFORE ROW triggers will still fire when the new code decides to not do the insert. It certainly wouldn't be acceptable to allow before triggers to run *after* the first phase of speculative insertion, because they might try and access an index with an exclusive locked buffer, resulting in backend self-deadlock. Besides, we cannot really know *what* to lock until after the before triggers fire. That leaves us with two options, as far as I can tell: 1. Just have users live with those semantics. This is what the patch does presently, and anyone who is using triggers in what I consider to be a sensible way doesn't have to care. For everyone else, it's a gotcha that they have to deal with, to be noted prominently. +1. We already allow BEFORE triggers to violate referential integrity. I don't think that allowing them to behave oddly around INSERT ... IGNORE is a problem compared to that. We just need to document it, so that users know that their BEFORE code will fire even if the INSERT is being ignored, and that a BEFORE trigger can cause an INSERT ... IGNORE to error out. I have done no performance testing to date. Reviewers will want to pay attention to the performance implications, particularly in the regular insert case; it's conceivable that I've regressed things, though I don't specifically suspect that I have. Yeah, we'll also want to document the performance overhead for the bulk loading case. I know I'll want to use this syntax as a primitive form of MERGE, and I'll want to know what it costs me. Does this work with multiple VALUES rows? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT...ON DUPLICATE KEY IGNORE
On Fri, Aug 30, 2013 at 3:40 PM, Josh Berkus j...@agliodbs.com wrote: Does this work with multiple VALUES rows? Yes. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator precedence issues
Merlin Moncure mmonc...@gmail.com writes: The operator precedence rules seem hard wired to not be able to be worked around, not matter what. That's right. In the first place, bison is incapable of doing anything other than hard-wired operator precedence. In the second, if we did try to allow catalog-driven precedence, it would require catalog lookups during the raw parser phase, which isn't going to work for a number of implementation reasons; but worse than the implementation troubles is that the grammar would then become fundamentally ambiguous, eg there could be multiple correct parsings of A+B*C depending on what data types A,B,C have. So precedence is hard-wired based on the operator name. I'm really scratching my head over this rule (via http://www.postgresql.org/docs/9.3/static/sql-syntax-lexical.html#SQL-PRECEDENCE): When a schema-qualified operator name is used in the OPERATOR syntax, as for example in: SELECT 3 OPERATOR(pg_catalog.+) 4; the OPERATOR construct is taken to have the default precedence shown in Table 4-2 for any other operator. This is true no matter which specific operator appears inside OPERATOR(). Yeah. I'd rather have said that it's the same precedence as for the undecorated operator name, but again bison doesn't really have a way to do that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator precedence issues
Hi, On 2013-08-30 17:35:04 -0500, Merlin Moncure wrote: When a schema-qualified operator name is used in the OPERATOR syntax, as for example in: SELECT 3 OPERATOR(pg_catalog.+) 4; the OPERATOR construct is taken to have the default precedence shown in Table 4-2 for any other operator. This is true no matter which specific operator appears inside OPERATOR(). That rule seems intentionally designed to make it impossible to to override mathematical behaviors. Mainly curious -- was that intentional? You can change your search_path to include your schema before an explicitly listed pg_catalog afair. Not nice, but should work... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT...ON DUPLICATE KEY IGNORE
Hi, This is awesome. On 2013-08-30 15:09:59 -0700, Peter Geoghegan wrote: 1) The patch is only interested in unique index violations (specifically, violations of amcanunique access method unique indexes); it will not do anything with NULL constraint violations, as the MySQL feature does, for example. It also doesn't care about exclusion constraints, even though it's currently possible to defer exclusion constraint enforcement in a way that is analogous to how it's done for btree indexes. All that seems sane to me. I very, very much do not want it to deal with NOT NULL violations. [syntax] Haven't thought about syntax at all so far, that's food for another day. Use-cases = ... In addition to that, the feature is may be of value to clients of the logical change-set generation infrastructure (such as the proposed BDR contrib module[3]), if and when that infrastructure is finally committed. Currently, the BDR prototype simply balks at INSERT conflicts. This is because the amount of subtransactions/xids generated to do anything else is considered prohibitive. If Postgres is ever going to solve the kinds of problems that BDR proposes to solve well (except for the relatively narrow use-case where balking at INSERT conflicts is acceptable), it needs something like what I've proposed here. The only alternative is probably to replay a transaction without using subtransactions, and if that fails, remember that fact and wrap every single change in a subxact. This is really going to be painful for large transactions. Exactly. In completely unscientific tests it reduces speed to less than 1/10 of the version without subxacts. Without a single conflict that is. Andres privately said some weeks back (prior to my sharing of this information, and even prior to writing most of the code posted here) that he'd like to see an INSERT...ON DUPLICATE KEY LOCK. I suppose that he must have meant that he'd like to see shared locks obtained on rows that blocked a noserter from actually inserting some of their proposed tuples, for the purposes of conflict resolution (which, as I've said, this doesn't do). Perhaps he'd like to comment on these issues here. Ok, so what we would like to do is basically to follow a protocol (simplified) like: 1) replay INSERT, using ON DUPLICATE IGNORE 2) if INSERT succeeded, be happy, no conflict (another INSERT with the same key might conflict though) 3) if the INSERT failed, lock tuple for UPDATE 4) if the tuple got deleted by another session, goto 1 5) check whether local tuple or the already inserted tuple wins conflict resolution 6) UPDATE or skip accordingly If there's a variant of INSERT ... ON DUPLICATE that gets a FOR UPDATE lock on the existing row this gets simpler because there's no chance anymore we need to loop or look for other versions of the row. Makes sense? Mechanism = [...] From a high level, the patch works by adding something that is tentatively internally referred to as speculative insertion in lower-level code comments. The basic approach is: * Within ExecInsert, when inserting into a regular heap relation from a tuple table slot (after BEFORE triggers fire but before inserting the heap tuple), there is an extra function call to a new function, ExecLockIndexTuples(). * For each applicable amcanunique unique index, ExecLockIndexTuples() calls a new am method, amlock. For each unique index, we end up acquiring a write buffer lock in a similar fashion to _bt_doinsert. When we return from the new amlock function, we report whether or not there was a duplicate without having inserted any index tuples or having otherwise done something that requires a heap tuple to be present. I like to call this point the point of no return, where individual unique indexes commit themselves to insertion (which is to say, the point after which a unique constraint violation becomes impossible but before actual index tuple insertion in _bt_doinsert [9]). Another duplicate key cannot be inserted by anyone else until that write lock is released, which it won't be until insertion or IGNORE, for the same reason why that's currently true once control reaches the point of no return. A pagesplit might need to happen before insertion proper of that index tuple, for example, and that's okay. * If we have total consensus among all unique indexes, we proceed with a heap insertion of the tuple formed from the tuple table slot ExecInsert() was called in respect of. Yes, this means that potentially multiple exclusive buffer locks are held for the duration of the heap tuple insertion. If not, we release the locks early and return NULL from ExecInsert() before heap insertion ever happens (and certainly before index tuple insertion ever happens). * If and when we get to ExecInsertIndexTuples(), and ultimately to _bt_doinsert, the new _bt_doinsert knows that for any unique indexes it sees, that the process has
Re: [HACKERS] INSERT...ON DUPLICATE KEY IGNORE
On Fri, Aug 30, 2013 at 5:47 PM, Andres Freund and...@2ndquadrant.com wrote: This is awesome. Thanks. All that seems sane to me. I very, very much do not want it to deal with NOT NULL violations. Sure. But there's nothing stopping us from doing that as a totally orthogonal thing. Not that I personally find it to be terribly compelling or anything. It's an easy addition, but I'm certainly not going to try and mandate it as integral to what I've done here if that doesn't suit you. Ok, so what we would like to do is basically to follow a protocol (simplified) like: 1) replay INSERT, using ON DUPLICATE IGNORE 2) if INSERT succeeded, be happy, no conflict (another INSERT with the same key might conflict though) 3) if the INSERT failed, lock tuple for UPDATE 4) if the tuple got deleted by another session, goto 1 5) check whether local tuple or the already inserted tuple wins conflict resolution 6) UPDATE or skip accordingly Right, I thought that's what you meant. I'll have to ponder it further. However, I don't think locking the old tuple(s) is mandatory to make this a useful feature - as I've noted, MySQL doesn't do that. That said, I really want to support that, perhaps as another DUPLICATE KEY variant. As I've noted, if we had that, we almost wouldn't need upsert - loosely speaking it would be mere syntactic sugar on top of what you require. If there's a variant of INSERT ... ON DUPLICATE that gets a FOR UPDATE lock on the existing row this gets simpler because there's no chance anymore we need to loop or look for other versions of the row. Makes sense? Perfect sense. Puh. It took me some time to even start to understand what you're doing here... While I ponder on it some more, could you explain whether I understood something correctly? Consider the following scenario: CREATE TABLE foo(a int UNIQUE, b int UNIQUE); S1: INSERT INTO foo(0, 0); S1: BEGIN; S1: INSERT INTO foo(1, 1); S1: SELECT pg_sleep(3600); S2: BEGIN; S2: INSERT INTO foo(2, 1); S3: SELECT * FROM foo WHERE a = 0; Won't S2 in this case exclusively lock a page in foo_a_key (the one where 2 will reside on) for 3600s, while it's waiting for S1 to finish while getting the speculative insertion into foo_b_key? Yes. The way the patch currently handles that case is unacceptable. It needs to be more concerned about holding an exclusive lock on earlier-locked indexes when locking the second and subsequent indexes iff it blocks on another transaction in the course of locking the second and subsequent indexes. ISTM we could use something like a new lock level to make that work more sensibly, basically something like LW_FOR_UPDATE which does not conflict with _SHARED but conflicts with _EXCLUSIVE. I'll ponder it further. There are probably a number of options. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Variadic aggregates vs. project policy
Uh, the pg_dump part checks for version 80400, shouldn't it be 90400? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Variadic aggregates vs. project policy
Alvaro Herrera alvhe...@2ndquadrant.com writes: Uh, the pg_dump part checks for version 80400, shouldn't it be 90400? The reasoning there is that 8.4 is where we added pg_get_function_arguments(), so this dumping code should work against that server version or later. (Oh, memo to self: test that.) It's true that pre-9.4 servers are not going to produce any useful extra info by using pg_get_function_arguments() on an aggregate; but the general idea of this patch is to make the aggregate-related code look more like the plain-function-related code, so using the same version cutoffs in both cases seemed to make sense. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
2013/8/30 Josh Berkus j...@agliodbs.com: On 08/30/2013 03:05 AM, Kohei KaiGai wrote: Surely someone in the security community has discussed this? Security community considers covert channel is a hard to solve problem; nearly impossible to eliminate. Let's see the summary in wikipedia: http://en.wikipedia.org/wiki/Covert_channel Well, in each of the cases covered in that article, the given technology (OSI, TCP, etc.) takes specific provisions to limit the ability of attackers to discover information via the covert channel. However, we have yet to talk about taking any such provisions with Postgres. If we commit this patch, arguably we'll have a row-level security feature which only protects data from well-behaved users, which seems counterproductive. The point we shouldn't forget is information leakage via covert-channel has less grade of threat than leakage via main-channel, because of much less bandwidth. Security community also concludes it is not avoidable nature as long as human can observe system behavior and estimate something, thus, security evaluation criteria does not require eliminate covert-channels or does not pay attention about covert-channels for the products that is installed on the environment with basic robustness (that means, non-military, regular enterprise usage). I don't think PostgreSQL goes into military-grade secure database system. If so, it has to sacrifice many thing (like, performance, usability, nature of open source, ...) for security. It's not a fact. So, arguments in favor of this patch: a) it's as good as Oracle's security features, giving us check-box compliance. b) it allows securing individual rows against attackers with limited technical knowledge or limited database access, and could be very hardened in combination with intelligent access control. Even if attacker has enough knowledge, the ratio they can estimate hidden values is very limited because of much less bandwidth of covert channels. c) it is an improvement on techniques like Veil (is it)? d) we plan to continue improving it and closing covert channels, or limiting their bandwidth. Arguments against: m) covert channels are currently broad enough to make it trivially circumventable (are they?) n) overhead and code maintenance required is substantial So, determinative questions: 1) are the security mechanisms supplied by this patch superior in some way to approaches like Veil for multi-tenant applications? (this is a serious question, as multi-tenant applications are far less concerned about covert channels) Yes. This RLS implementation targets the environment that does not have requirement for a particular bandwidth upperbound on covert- channels. It allows to centralize the place where we have to care about access control on main-channel, so it well fits multi-tenant applications. 2) do we plan to reduce the accessibility of data via covert channels over successive releases? How? Less covert channels is better than massive, if we could close it with reasonable cost; that means run-time performance, code complexity and so on. However, in general, it is impossible to eliminate anything in spite of less degree of threats because of smaller bandwidth. So, I don't think this is an issue to spent much efforts to solve. 3) will accepting this patch allow our users in the Government space to more freely adopt PostgreSQL? Government has two spaces. Most of their environment requires similar requirement as enterprise grade system doing. On the other hand, military environment often requires upper-bound of covert channel, as a story I introduce on upthread, but they are minority and have budget to develop special purpose database system designed for security with first priority. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory
On Fri, Aug 30, 2013 at 9:15 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2013-08-28 15:20:57 -0400, Robert Haas wrote: That way any corruption in that area will prevent restarts without reboot unless you use ipcrm, or such, right? The way I've designed it, no. If what we expect to be the control segment doesn't exist or doesn't conform to our expectations, we just assume that it's not really the control segment after all - e.g. someone rebooted, clearing all the segments, and then an unrelated process (malicious, perhaps, or just a completely different cluster) reused the same name. This is similar to what we do for the main shared memory segment. The case I am mostly wondering about is some process crashing and overwriting random memory. We need to be pretty sure that we'll never fail partially through cleaning up old segments because they are corrupted or because we died halfway through our last cleanup attempt. I think we want that during development, but I'd rather not go there when releasing. After all, we don't support a manual choice between anonymous mmap/sysv shmem either. That's true, but that decision has not been uncontroversial - e.g. the NetBSD guys don't like it, because they have a big performance difference between those two types of memory. We have to balance the possible harm of one more setting against the benefit of letting people do what they want without needing to recompile or modify code. But then, it made them fix the issue afaik :P In addition, I've included an implementation based on mmap of a plain file. As compared with a true shared memory implementation, this obviously has the disadvantage that the OS may be more likely to decide to write back dirty pages to disk, which could hurt performance. However, I believe it's worthy of inclusion all the same, because there are a variety of situations in which it might be more convenient than one of the other implementations. One is debugging. Hm. Not sure what's the advantage over a corefile here. You can look at it while the server's running. That's what debuggers are for. On MacOS X, for example, there seems to be no way to list POSIX shared memory segments, and no easy way to inspect the contents of either POSIX or System V shared memory segments. Shouldn't we ourselves know which segments are around? Sure, that's the point of the control segment. But listing a directory is a lot easier than figuring out what the current control segment contents are. But without a good amount of tooling - like in a debugger... - it's not very interesting to look at those files either way? The mere presence of a segment doesn't tell you much and the contents won't be easily readable. Another use case is working around an administrator-imposed or OS-imposed shared memory limit. If you're not allowed to allocate shared memory, but you are allowed to create files, then this implementation will let you use whatever facilities we build on top of dynamic shared memory anyway. I don't think we should try to work around limits like that. I do. There's probably someone, somewhere in the world who thinks that operating system shared memory limits are a good idea, but I have not met any such person. Let's drive users away from sysv shem is the only one I heard so far ;) I would never advocate deliberately trying to circumvent a carefully-considered OS-level policy decision about resource utilization, but I don't think that's the dynamic here. I think if we insist on predetermining the dynamic shared memory implementation based on the OS, we'll just be inconveniencing people needlessly, or flat-out making things not work. [...] But using file-backed memory will *suck* performancewise. Why should we ever want to offer that to a user? That's what I was arguing about primarily. If we're SURE that a Linux user will prefer posix to sysv or mmap or none in 100% of cases, and that a NetBSD user will always prefer sysv over mmap or none in 100% of cases, then, OK, sure, let's bake it in. But I'm not that sure. I think posix shmem will be preferred to sysv shmem if present, in just about any relevant case. I don't know of any system with lower limits on posix shmem than on sysv. I think this case is roughly similar to wal_sync_method: there really shouldn't be a performance or reliability difference between the ~6 ways of flushing a file to disk, but as it turns out, there is, so we have an option. Well, most of them actually give different guarantees, so it makes sense to have differing performance... Why do we want to expose something unreliable as preferred_address to the external interface? I haven't read the code yet, so I might be missing something here. I shared your opinion that preferred_address is never going to be reliable, although FWIW Noah thinks it can be made reliable with a large-enough
Re: [HACKERS] [v9.4] row level security
2013/8/30 Tom Lane t...@sss.pgh.pa.us: Josh Berkus j...@agliodbs.com writes: On 08/30/2013 12:43 PM, Tom Lane wrote: In short, we can check some check-box is a really, really bad reason to accept a security-related feature. If we're going to put up with all the downsides of RLS, I want the end result to be something that's actually secure, not something that gives the illusion of security. Can you be more explicit about all the downsides of RLS? I was just looking over the patch, which is less than 5000 lines. While it's not small, we have larger patches in the CF. So what's the specific downsides of this? I think it's going to be an ongoing maintenance headache and an endless source of security bugs, even disregarding covert-channel issues. I have pretty much zero faith in the planner changes, in particular, and would still not have a lot if they were adequately documented, which they absolutely are not. The whole thing depends on nowhere-clearly-stated assumptions that plan-time transformations will never allow an RLS check to be bypassed. I foresee future planner work breaking this in non-obvious ways on a regular basis (even granting the assumption that it's bulletproof today, which is at best unproven). In general, we will adopt / enhance features as long as PostgreSQL runs with evolution. It can never be free from bugs or maintenance, regardless of its nature. Later half seems to me a bit unfair because any features may conflict with some future works, not only RLS. Also, if you have some tangible planner enhancement plan, could you inform us which plan shall be in progress? At least, it is impossible to adjust my implementation because of abstract concern towards future conflict. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers