Re: [HACKERS] pg_dump -X
On Friday 24 December 2010 05:35:26 Robert Haas wrote: > Back in 2006, we have this commit: > > commit 2b25e1169f44368c120931787628d51731b5cc8c > Author: Peter Eisentraut > Date: Sat Oct 7 20:59:05 2006 + > > The -X option in pg_dump was supposed to be a workaround for the lack > of portable long options. But we have had portable long options for a > long time now, so this is obsolete. Now people have added options which > *only* work with -X but not as regular long option, so I'm putting a stop > to this: -X is deprecated; it still works, but it has been removed from > the documentation, and please don't add more of them. > > Since then, two additional -X options have crept in, doubtless due to > mimicry of the existing options without examination of the commit > logs. I think we should either (a) remove the -X option altogether or > (b) change the comment so that it clearly states the same message that > appears here in the commit log, namely, that no new -X options are to > be created. The existing comment says that -X is deprecated, but that > doesn't make it entirely 100% clear that the code isn't intended to be > further updated, at least judging by the results. Just a thought, but wouldn't a big warning that appears when someone actually uses the "-X" option, informing the user that it's deprecated, prevent further ones from appearing? Are there situations where the -X option is actually required? Can't the code be adapted to use different options than the "-X" one? Also, unless Gentoo actually strips the man page and "--help" output (which I seriously doubt), I do not see the "-X" option in the documentation. -- Joost -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
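Such a runtime warning would only take a few lines in the option handling. A standalone sketch of the idea — the option table and function name here are made up for illustration; the real list and parsing live in pg_dump's source:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical table of names that -X historically accepted; the real
 * set of deprecated -X spellings lives in pg_dump's option parsing. */
static const char *x_option_names[] = {
    "disable-triggers",
    "no-data-for-failed-tables",
    NULL
};

/* Handle "-X name": if recognized, print a deprecation notice pointing
 * the user at the equivalent long option and return 1; else return 0. */
int
handle_x_option(const char *name)
{
    int i;

    for (i = 0; x_option_names[i] != NULL; i++)
    {
        if (strcmp(name, x_option_names[i]) == 0)
        {
            fprintf(stderr,
                    "WARNING: -X %s is deprecated; use --%s instead\n",
                    name, name);
            return 1;
        }
    }
    return 0;
}
```

The warning text could also mention the policy from the commit log, so nobody adds new -X spellings by imitation.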
Re: [HACKERS] sepgsql contrib module
(2010/12/24 11:53), KaiGai Kohei wrote: > There is one another issue to be discussed. > We need a special form of regression test. Because SE-PostgreSQL > makes access control decisions based on the security label of the peer > process, we need to switch the psql process during the regression test. > (So, I don't include test cases yet.) > > We have the 'runcon' command to launch a child process with a specified > security label, as long as the security policy allows. If we could > launch 'psql' by 'runcon' with a specified label, we could describe > test cases on the existing 'make installcheck' framework. > > An idea is to add an option to pg_regress to launch the psql command > with a specified wrapper program (like 'runcon'). > In this case, each contrib module kicks it off with a REGRESS_OPTS setting. > One thing to be considered is that the security label to be given to > 'runcon' is flexible for each *.sql file.

The attached patch adds a --launcher=COMMAND option to pg_regress. If a command is specified, pg_regress prepends the specified command string in front of the psql command. When we use this option, the psql process will be launched via the launcher program. (Of course, the launcher has the responsibility to execute psql correctly.)

This example is a case where I run the regression test of the cube module. It tries to launch psql using 'runcon -l s0'.

[kai...@saba cube]$ make installcheck REGRESS_OPTS="--launcher='runcon -l s0' --dbname=cube_regress"
make -C ../../src/test/regress pg_regress
make[1]: Entering directory `/home/kaigai/repo/pgsql/src/test/regress'
make -C ../../../src/port all
make[2]: Entering directory `/home/kaigai/repo/pgsql/src/port'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/home/kaigai/repo/pgsql/src/port'
make[1]: Leaving directory `/home/kaigai/repo/pgsql/src/test/regress'
../../src/test/regress/pg_regress --inputdir=. --psqldir=/usr/local/pgsql/bin --launcher='runcon -l s0' --dbname=cube_regress cube
(using postmaster on Unix socket, default port)
============== dropping database "cube_regress" ==============
DROP DATABASE
============== creating database "cube_regress" ==============
CREATE DATABASE
ALTER DATABASE
============== running regression test queries ==============
test cube                 ... ok
=====================
 All 1 tests passed.
=====================

During the regression test, I checked the security context of the processes.

[kai...@saba ~]$ env LANG=C pstree -Z
systemd(`system_u:system_r:init_t:s0')
 |-sshd(`unconfined_u:system_r:sshd_t:s0-s0:c0.c1023')
 |  |-sshd(`unconfined_u:system_r:sshd_t:s0-s0:c0.c1023')
 |  |  `-sshd(`unconfined_u:system_r:sshd_t:s0-s0:c0.c1023')
 |  |     `-bash(`unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023')
 |  |        `-make(`unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023')
 |  |           `-pg_regress(`unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023')
 |  |              `-psql(`unconfined_u:unconfined_r:unconfined_t:s0')

It shows that the launcher program drops the "c0.c1023" categories from the end of the security label between the pg_regress and psql processes.

How about this idea to implement regression tests for SE-PostgreSQL, or possibly other stuff which depends on environment variables?

Thanks, -- KaiGai Kohei

Attachment: pg_regress-launcher.patch
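The launcher mechanics boil down to simple string assembly before pg_regress spawns psql. A standalone sketch, with the function name and buffer handling made up for illustration (not the actual patch code):

```c
#include <stdio.h>
#include <stddef.h>

/* Build the shell command pg_regress would run: if a --launcher command
 * was given, prepend it to the psql invocation; otherwise run psql alone. */
char *
build_psql_command(char *buf, size_t buflen,
                   const char *launcher, const char *psql_cmd)
{
    if (launcher != NULL && launcher[0] != '\0')
        snprintf(buf, buflen, "%s %s", launcher, psql_cmd);
    else
        snprintf(buf, buflen, "%s", psql_cmd);
    return buf;
}
```

A real implementation would also need to think about shell quoting of the launcher string, since it is passed through REGRESS_OPTS and the shell.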
Re: [HACKERS] Streaming replication as a separate permissions
* Robert Haas (robertmh...@gmail.com) wrote: > I think I agree with Florian about the confusing-ness of the proposed > semantics. Aren't you saying you want NOLOGIN to mean "not allowed to > log in for the purposes of issuing SQL commands, but allowed to log in > for replication"? Uggh. I like the general idea of a replication-only "role" or "login". Maybe implementing that as a role w/ all the things that come along with it being a role isn't right, but we don't want to have to reinvent all the supported auth mechanisms (and please don't propose limiting the auth options for the replication login!). Is there a way we can leverage the auth mechanisms, etc, while forcing the 'replication role' to only be able to do what a 'replication role' should do? Thanks, Stephen
[HACKERS] pg_dump -X
Back in 2006, we have this commit:

commit 2b25e1169f44368c120931787628d51731b5cc8c
Author: Peter Eisentraut
Date: Sat Oct 7 20:59:05 2006 +

    The -X option in pg_dump was supposed to be a workaround for the lack
    of portable long options. But we have had portable long options for a
    long time now, so this is obsolete. Now people have added options which
    *only* work with -X but not as regular long option, so I'm putting a stop
    to this: -X is deprecated; it still works, but it has been removed from
    the documentation, and please don't add more of them.

Since then, two additional -X options have crept in, doubtless due to mimicry of the existing options without examination of the commit logs. I think we should either (a) remove the -X option altogether or (b) change the comment so that it clearly states the same message that appears here in the commit log, namely, that no new -X options are to be created. The existing comment says that -X is deprecated, but that doesn't make it entirely 100% clear that the code isn't intended to be further updated, at least judging by the results. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Streaming replication as a separate permissions
On Thu, Dec 23, 2010 at 11:00 PM, Tom Lane wrote: > Florian Pflug writes: >> The problem here is that you suggest NOLOGIN should mean "Not allowed >> to issue SQL commands", which really isn't what the name "NOLOGIN" >> conveys. > > No, it means "not allowed to connect". It's possible now to issue > commands as a NOLOGIN user, you just have to use SET ROLE to become the > user. I think you're arguing about a design choice that was already > made some time ago. I think I agree with Florian about the confusing-ness of the proposed semantics. Aren't you saying you want NOLOGIN to mean "not allowed to log in for the purposes of issuing SQL commands, but allowed to log in for replication"? Uggh. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Streaming replication as a separate permissions
Florian Pflug writes: > The problem here is that you suggest NOLOGIN should mean "Not allowed > to issue SQL commands", which really isn't what the name "NOLOGIN" > conveys. No, it means "not allowed to connect". It's possible now to issue commands as a NOLOGIN user, you just have to use SET ROLE to become the user. I think you're arguing about a design choice that was already made some time ago. regards, tom lane
Re: [HACKERS] proposal : cross-column stats
On Dec23, 2010, at 20:39 , Tomas Vondra wrote: > Dne 21.12.2010 16:54, Florian Pflug napsal(a): >> I think that maybe it'd be acceptable to scan a large portion of the >> table to estimate dist(A,B), *if* we must do so only once in a while. >> But even with a full scan, how would we arrive at an estimate for >> dist(A,B)? Keeping all values in memory, and spilling them into, say, >> an on-disk hash table adds even more overhead to the already expensive >> full scan. Maybe using a bloom filter instead of a hash table could >> avoid the spilling to disk, in exchange for a slightly less precise >> result... > > I've read some basics about Bloom filters, and I'm not sure we can use > them in this case. I see two problems with this approach: > > 1) impossibility to predict the number of elements in advance > > To build a Bloom filter with limited error rate, you need to size it > properly (number of hash functions and size of the bit array). But > that's exactly the information we're looking for. > > I guess we could use the highest possible value (equal to the number > of tuples) - according to wiki you need about 10 bits per element > with 1% error, i.e. about 10MB of memory for each million of > elements.

Drat. I had expected these numbers to come out quite a bit lower than that, at least for a higher error target. But even with a 10% false positive rate, it's still 4.5MB per 1e6 elements. Still too much to assume the filter will always fit into memory, I fear :-(

> 2) sampling the whole table > > A naive approach would be to sample the whole table each time, but > for large tables that might be a bit of a problem. So I'm thinking > about what alternatives are there .. > One possibility is to build the Bloom filter incrementally and store > it on disk, or something like that. I guess this could be done > during VACUUM or something like that. Anyway there's an issue how to > set the filter size initially (estimate the number of elements), > just as in the previous section.

The filter size could be derived from the table's statistics target, or be otherwise user-definable. We could also auto-resize once it gets too full. But still, that all seems awfully complex :-(

> Another possibility is to collect the data from just a small portion > of a table and then use the result to estimate the number of distinct > values for the whole table. But I'm not sure we can do this reliably, > I see many traps in this.

This is how it works currently. The problem with this approach is that it gives you very little guarantee about how precise the result will be. Extrapolating works very well for things like MCVs and histograms, because there you're by definition interested mostly in values which occur often - and thus with a high probability in the relatively few rows you sample. For the number of distinct values, however, this isn't true - if ndistinct is an order of magnitude smaller than the number of rows, relatively few rows can account for a large percentage of the distinct values... Another idea would be to obtain the ndistinct values from an index somehow. Postgres cannot currently scan an index in physical order, only in logical order, due to locking considerations. But since we'd only be interested in an estimate, maybe a scan in physical block order would work for ndistinct estimates? Just a wild idea, mind you, I haven't checked at all if that'd be even remotely feasible. best regards, Florian Pflug
Re: [HACKERS] Streaming replication as a separate permissions
On Dec24, 2010, at 04:16 , Tom Lane wrote: > Florian Pflug writes: >> On Dec23, 2010, at 16:54 , Tom Lane wrote: >>> BTW, is it possible to set things up so that a REPLICATION account >>> can be NOLOGIN, thereby making it really hard to abuse for other >>> purposes? Or does the login privilege check come too soon? > >> Please don't. This violates the principle of least surprise big time! > > How so? (Please note I said *can be*, not *has to be*.) Because a DBA might "ALTER ROLE replication WITH NOLOGIN", thinking he has just disabled that role. Only to find out weeks later that he hasn't, and that someone has been using that role to stream weeks worth of confidential data to who knows where. The problem here is that you suggest NOLOGIN should mean "Not allowed to issue SQL commands", which really isn't what the name "NOLOGIN" conveys. The concept itself is perfectly fine, but the name is dangerously confusing. > The point of this is to ensure that if someone does illicitly come by > the replication role's password, they can't use it to log in. They can > still steal all your data, but they can't actually get into the > database. I don't see why it's a bad idea to configure things that way. It's perfectly fine to configure things that way, and is in fact what I would do. I'd just prefer the name for that setting to convey its actual meaning, which is why I suggested adding a SQL/NOSQL flag. (Or SESSION/NOSESSION, or whatever.) Or, much simpler, prevent WITH REPLICATION roles from issuing SQL commands altogether. That'd achieve your goal just as well and is way less confusing. best regards, Florian Pflug
Re: [HACKERS] Streaming replication as a separate permissions
Florian Pflug writes: > On Dec23, 2010, at 16:54 , Tom Lane wrote: >> BTW, is it possible to set things up so that a REPLICATION account >> can be NOLOGIN, thereby making it really hard to abuse for other >> purposes? Or does the login privilege check come too soon? > Please don't. This violates the principle of least surprise big time! How so? (Please note I said *can be*, not *has to be*.) The point of this is to ensure that if someone does illicitly come by the replication role's password, they can't use it to log in. They can still steal all your data, but they can't actually get into the database. I don't see why it's a bad idea to configure things that way. regards, tom lane
Re: [HACKERS] Streaming replication as a separate permissions
On Dec23, 2010, at 16:54 , Tom Lane wrote: > BTW, is it possible to set things up so that a REPLICATION account > can be NOLOGIN, thereby making it really hard to abuse for other > purposes? Or does the login privilege check come too soon? Please don't. This violates the principle of least surprise big time! And, what's worse, violates it in such a way that people *underestimate* the permissions a particular role has, which from a security POV is the worst that can happen. You pointed out yourself that REPLICATION is a powerful permission to have because it essentially grants you read access to the whole cluster. By allowing roles to connect as WAL receivers even if they have NOLOGIN set, you're giving this powerful permission to a role a DBA might think is disabled! Now, I *can* see that having roles which may only connect as WAL receivers, not to issue queries, is worthwhile. However, please either:

A) Invent a new flag for that, for example SQL/NOSQL. A pure replication role would have WITH REPLICATION NOSQL, while most others would have WITH NOREPLICATION SQL. It's debatable whether postgres should have WITH REPLICATION SQL or WITH NOREPLICATION SQL by default.

B) Forbid REPLICATION roles from connecting as anything other than WAL receivers altogether. It'd then probably be a good idea to prevent such roles from having SUPERUSER, CREATEDB, CREATEROLE and INHERIT set, as these flags wouldn't then make any sense.

The only downside of (B) that I can see is that it'd break setups that work with 9.0. best regards, Florian Pflug
Re: [HACKERS] SQL/MED - file_fdw
On Tue, Dec 21, 2010 at 21:32, Itagaki Takahiro wrote: > On Tue, Dec 21, 2010 at 20:14, Shigeru HANADA > wrote: >> Attached is the revised version of file_fdw patch. This patch is >> based on Itagaki-san's copy_export-20101220.diff patch. > > #1. Don't you have per-tuple memory leak? I added GetCopyExecutorState() > because the caller needs to reset the per-tuple context periodically. Sorry, I found there is no memory leak here. The related comment is: [execnodes.h] * CurrentMemoryContext should be set to ecxt_per_tuple_memory before * calling ExecEvalExpr() --- see ExecEvalExprSwitchContext(). I guess CurrentMemoryContext in the Iterate callback is a per-tuple context. So, we don't have to export cstate->estate via GetCopyExecutorState(). > Or, if you eventually make a HeapTuple from values and nulls arrays, ExecStoreVirtualTuple() seems to be better than the combination of heap_form_tuple() and ExecStoreTuple() for the purpose. Could you try to use slot->tts_values and slot->tts_isnull for NextCopyFrom() directly? -- Itagaki Takahiro
Re: [HACKERS] Streaming replication as a separate permissions
> If the user exists but is disabled by default, I'm not sure that > really provides any advantage over not having it at all. And it > certainly can't be enabled by default. I was presuming that NOLOGIN wouldn't prevent replication connections via the replication user. So the user would be "enabled" as far as replication was concerned. And currently, there is no replication connection in the pg_hba.conf. So that's not a change; in fact, having a sample one would be an improvement. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Streaming replication as a separate permissions
On Thu, Dec 23, 2010 at 5:59 PM, Josh Berkus wrote: > >> I'm not entirely sure about this one.. I'm not against it but I'm also >> not really 'for' it. :) > > 2 reasons: (a) if users need to CREATE USER, that *does* add an extra > step and they're more likely to just choose to grant to postgres > instead, (b) having a standard installed user (just like the user > "postgres" is standard) would make our docs, examples and tutorials much > easier to use. If the user exists but is disabled by default, I'm not sure that really provides any advantage over not having it at all. And it certainly can't be enabled by default. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
Greg, All: Results for Solaris 10u8, on ZFS on a 7-drive attached storage array:

bash-3.00# ./test_fsync -f /dbdata/pgdata/test.out
Loops = 1

Simple write:
        8k write                       59988.002/second

Compare file sync methods using one write:
        open_datasync 8k write           214.125/second (unavailable: o_direct)
        open_sync 8k write               222.155/second (unavailable: o_direct)
        8k write, fdatasync              214.086/second
        8k write, fsync                  215.035/second (unavailable: fsync_writethrough)

Compare file sync methods using two writes:
        2 open_datasync 8k writes        108.227/second (unavailable: o_direct)
        2 open_sync 8k writes            106.935/second (unavailable: o_direct)
        8k write, 8k write, fdatasync    205.525/second
        8k write, 8k write, fsync        210.483/second (unavailable: fsync_writethrough)

Compare open_sync with different sizes:
        open_sync 16k write              211.481/second
        2 open_sync 8k writes            106.202/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different descriptor.)
        8k write, fsync, close           207.499/second
        8k write, close, fsync           213.656/second

-- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Database file copy
Thank you, that is a great point. Based on your suggestion, I wrote the following query:

select * from pg_class where relisshared=true order by relname

The above query returns 27 rows. I evaluated the impact on the following:

pg_auth_members - We create roles and memberships on each deploy instance, so this shouldn't be an issue.

pg_authid - As noted in my previous post, issuing alter and grant commands after file copy updates pg_authid with the correct information.

pg_database - Not an issue; as we are creating the database on the deploy instance, we don't copy the database oid over from the master instance.

pg_db_role_setting - We don't have any database-specific role settings. Even if we have a need in the future, we will set this up on the deploy instance, so this shouldn't be an issue.

pg_pltemplate - We use plpgsql functions, and they work without any issues after file copy.

pg_shdepend - There is one SHARED_DEPENDENCY_PIN (p) entry in this system catalog, and the remaining are SHARED_DEPENDENCY_OWNER (o) entries. Since I am issuing an alter command to change the ownership after file copy to the appropriate role, this system catalog gets populated correctly. I ran the query "select oid,relname from pg_class where oid in (select objid from pg_shdepend)" on the copied database, and it returns valid results, so this doesn't seem to be an issue. As the documentation states, postgres currently tracks the object-to-role dependencies, and it may track more types of dependencies in the future. Role dependencies have a fix as stated above, and when new dependencies come about, we will need to evaluate them.

pg_shdescription - Stores optional comments, which we don't use.

pg_tablespace - We are looking to use the default tablespace at this time, which works. Need to evaluate the impact if we need to use a custom tablespace.

The remaining entries are toast and index entries, which again should not be an impact.

Anything else? I am feeling more confident about this after each review post. And wherever I have said "this shouldn't be an issue" above, if you see any discrepancies, kindly highlight.

Thanks Srini
Re: [HACKERS] Streaming replication as a separate permissions
> I'm not entirely sure about this one.. I'm not against it but I'm also > not really 'for' it. :) 2 reasons: (a) if users need to CREATE USER, that *does* add an extra step and they're more likely to just choose to grant to postgres instead, (b) having a standard installed user (just like the user "postgres" is standard) would make our docs, examples and tutorials much easier to use. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Streaming replication as a separate permissions
On Thu, Dec 23, 2010 at 23:44, Tom Lane wrote: > Josh Berkus writes: >> On 12/23/10 2:33 PM, Stephen Frost wrote: >>> A better alternative, imv, would be to just have a & d, and mention in >>> the release notes that users *should* create a dedicated replication >>> role which is *not* a superuser but *does* have the replication grant, >>> but if they don't want to change their existing configurations, they can >>> just grant the replication privilege to whatever role they're currently >>> using. > >> Well, if we really want people to change their behavior then we need to >> make it easy for them: > >> 1) have a replication permission >> 2) *by default* create a replication user with the replication >> permission when we initdb. > > Yeah, I could see doing that ... the entry would be wasted if you're not > doing any replication, but one wasted catalog entry isn't much. > > However, it'd be a real good idea for that role to be NOLOGIN if it's > there by default. That shouldn't be too hard - I'll look at making the patch do that to make sure it *isn't* that hard ;) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Streaming replication as a separate permissions
* Tom Lane (t...@sss.pgh.pa.us) wrote: > However, it'd be a real good idea for that role to be NOLOGIN if it's > there by default. Definitely. I'd also suggest we WARN if someone creates a situation where a role has both replication and login, and maybe prevent it altogether; it's just a bad idea. Thanks, Stephen
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
> Which means if we just export the macros, we would still not have caught > this. I would like to share all the defines --- I am just saying it > isn't trivial. I just called all the define variables manually rather than relying on the macros. Seemed to work fine. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Streaming replication as a separate permissions
* Josh Berkus (j...@agliodbs.com) wrote: > 1) have a replication permission Right, that's what this patch is about. > 2) *by default* create a replication user with the replication > permission when we initdb. I'm not entirely sure about this one.. I'm not against it but I'm also not really 'for' it. :) > 3) have an example line for a replication connection by the replication > user in the default pg_hba.conf (commented out). Sure. > 4) change all our docs and examples to use that replication user. I don't have a problem with that. Thanks, Stephen
Re: [HACKERS] Streaming replication as a separate permissions
Josh Berkus writes: > On 12/23/10 2:33 PM, Stephen Frost wrote: >> A better alternative, imv, would be to just have a & d, and mention in >> the release notes that users *should* create a dedicated replication >> role which is *not* a superuser but *does* have the replication grant, >> but if they don't want to change their existing configurations, they can >> just grant the replication privilege to whatever role they're currently >> using. > Well, if we really want people to change their behavior then we need to > make it easy for them: > 1) have a replication permission > 2) *by default* create a replication user with the replication > permission when we initdb. Yeah, I could see doing that ... the entry would be wasted if you're not doing any replication, but one wasted catalog entry isn't much. However, it'd be a real good idea for that role to be NOLOGIN if it's there by default. regards, tom lane
Re: [HACKERS] Streaming replication as a separate permissions
On 12/23/10 2:33 PM, Stephen Frost wrote: > A better alternative, imv, would be to just have a & d, and mention in > the release notes that users *should* create a dedicated replication > role which is *not* a superuser but *does* have the replication grant, > but if they don't want to change their existing configurations, they can > just grant the replication privilege to whatever role they're currently > using. Well, if we really want people to change their behavior then we need to make it easy for them:

1) have a replication permission
2) *by default* create a replication user with the replication permission when we initdb.
3) have an example line for a replication connection by the replication user in the default pg_hba.conf (commented out).
4) change all our docs and examples to use that replication user.

If using the replication user is easier than any other path, people will. If it's harder, they won't. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
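For what it's worth, the commented-out sample line in (3) could look something like the following, using pg_hba.conf's existing `host database user address method` format, where "replication" in the database column matches replication connections. The address range and auth method here are placeholders, not recommendations:

```
# Allow the dedicated replication role to make replication connections
# from the standby. Adjust the address range and auth method as needed.
# host    replication    replication    192.168.1.0/24    md5
```

Shipping that line commented out keeps the default install closed while making the intended setup copy-paste obvious.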
Re: [HACKERS] Streaming replication as a separate permissions
* Josh Berkus (j...@agliodbs.com) wrote: > On 12/23/10 2:21 PM, Tom Lane wrote: > > Well, that's one laudable goal here, but "secure by default" is another > > one that ought to be taken into consideration. > > I don't see how *not* granting the superuser replication permissions > makes things more secure. The superuser can grant replication > permissions to itself, so why is suspending them by default beneficial? > I'm not following your logic here. The point is that the *replication* role can't grant itself superuser privs. Having the replication role compromised isn't great, but if that role is *also* a superuser, then the whole database server could be compromised. Encouraging users to continue to configure remote systems with the ability to connect as a superuser when it's not necessary is a bad idea. One compromise would be to: a) let superusers be granted the replication permission b) have pg_dump assume that superusers have that permission when dumping from a version which pre-dates the replication grant c) have pg_upgrade assume the superuser has that permission when upgrading d) *not* grant replication to the default superuser A better alternative, imv, would be to just have a & d, and mention in the release notes that users *should* create a dedicated replication role which is *not* a superuser but *does* have the replication grant, but if they don't want to change their existing configurations, they can just grant the replication privilege to whatever role they're currently using. Thanks, Stephen
Re: [HACKERS] Streaming replication as a separate permissions
Josh Berkus writes: > On 12/23/10 2:21 PM, Tom Lane wrote: >> Well, that's one laudable goal here, but "secure by default" is another >> one that ought to be taken into consideration. > I don't see how *not* granting the superuser replication permissions > makes things more secure. The superuser can grant replication > permissions to itself, so why is suspending them by default beneficial? > I'm not following your logic here. Well, the reverse of that is just as true: if we ship it without replication permissions on the postgres user, people can change that if they'd rather not create a separate role for replication. But I think we should encourage people to NOT do it that way. Setting it up that way by default hardly encourages use of a more secure arrangement. regards, tom lane
Re: [HACKERS] Cannot compile Pg 9.0.2 with MinGW under Windows
On 12/23/2010 07:11 AM, Pavel Golub wrote: Hello, Pgsql-bugs. Tried to use MinGW under Windows to build client libraries at least. However failed on the "./configure --without-zlib" stage. Please find attached log file, stdout and stderr outputs. The main problem here I suppose is "configure: WARNING:.h: present but cannot be compiled". Please give me advice on this. Thanks in advance. Your gcc doesn't look like others we have:

You have:
gcc (GCC) 3.4.4 (msys special)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
configure:3252: $? = 0
configure:3259: gcc -v >&5
Reading specs from /usr/lib/gcc/i686-pc-msys/3.4.4/specs
Configured with: /home/cstrauss/build/gcc3/gcc-3.4.4/configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --infodir=/share/info --mandir=/share/man --libexecdir=/lib --enable-languages=c,c++ --disable-nls --enable-threads=posix --enable-sjlj-exceptions --enable-hash-synchronization --enable-libstdcxx-debug --with-newlib
Thread model: posix

Buildfarm narwhal has:
gcc.exe (GCC) 3.4.2 (mingw-special)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
configure:3252: $? = 0
configure:3259: gcc -v >&5
Reading specs from c:/MinGW/bin/../lib/gcc/mingw32/3.4.2/specs
Configured with: ../gcc/configure --with-gcc --with-gnu-ld --with-gnu-as --host=mingw32 --target=mingw32 --prefix=/mingw --enable-threads --disable-nls --enable-languages=c,c++,f77,ada,objc,java --disable-win32-registry --disable-shared --enable-sjlj-exceptions --enable-libgcj --disable-java-awt --without-x --enable-java-gc=boehm --disable-libgcj-debug --enable-interpreter --enable-hash-synchronization --enable-libstdcxx-debug
Thread model: win32
gcc version 3.4.2 (mingw-special)

Buildfarm frogmouth has:
gcc.exe (GCC) 4.5.0
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
configure:3252: $? = 0
configure:3259: gcc -v >&5
Using built-in specs.
COLLECT_GCC=c:\mingw\bin\gcc.exe
COLLECT_LTO_WRAPPER=c:/mingw/bin/../libexec/gcc/mingw32/4.5.0/lto-wrapper.exe
Target: mingw32
Configured with: ../gcc-4.5.0/configure --enable-languages=c,c++,ada,fortran,objc,obj-c++ --disable-sjlj-exceptions --with-dwarf2 --enable-shared --enable-libgomp --disable-win32-registry --enable-libstdcxx-debug --enable-version-specific-runtime-libs --disable-werror --build=mingw32 --prefix=/mingw
Thread model: win32
gcc version 4.5.0 (GCC)

cheers andrew
Re: [HACKERS] Streaming replication as a separate permissions
On 12/23/10 2:21 PM, Tom Lane wrote: > Josh Berkus writes: >> If we still make it possible for "postgres" to replicate, then we don't >> add any complexity to the simplest setup. > > Well, that's one laudable goal here, but "secure by default" is another > one that ought to be taken into consideration. I don't see how *not* granting the superuser replication permissions makes things more secure. The superuser can grant replication permissions to itself, so why is suspending them by default beneficial? I'm not following your logic here. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming replication as a separate permissions
Josh Berkus writes: > If we still make it possible for "postgres" to replicate, then we don't > add any complexity to the simplest setup. Well, that's one laudable goal here, but "secure by default" is another one that ought to be taken into consideration. regards, tom lane
Re: [HACKERS] Database file copy
Excerpts from Srini Raghavan's message of Thu Dec 23 18:55:20 -0300 2010: > Please let me know if you or anyone think of any other potential issues. > Thanks > again for reviewing. I think anything in the shared catalogs could be an issue (look for tables with pg_class.relisshared=true). I think you'll need to do something about shared dependencies as well; not sure what. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
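Álvaro's pointer can be followed up directly; a quick query against the real pg_class columns lists the shared catalogs that a per-database file copy would not carry along:

```sql
-- Shared relations live once per cluster, not per database, so a
-- database-level file copy cannot bring them over.
SELECT relname
FROM pg_class
WHERE relisshared
  AND relkind = 'r'
ORDER BY relname;
-- Expect entries such as pg_authid, pg_database, pg_shdepend, pg_tablespace.
```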
Re: [HACKERS] Streaming replication as a separate permissions
On 12/23/10 7:49 AM, Robert Haas wrote: > I haven't looked at the patch yet, but I think we should continue to > allow superuser-ness to be *sufficient* for replication - i.e. > superusers will automatically have the replication privilege just as > they do any other - and merely allow this as an option for when you > want to avoid doing it that way. Yes. Currently I already create a separate "replicator" superuser just so that I can simply track which connections belong to replication. It would be great if it could make the "replicator" user less than a superuser. If we still make it possible for "postgres" to replicate, then we don't add any complexity to the simplest setup. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] V3: Idle in transaction cancellation
Andres Freund wrote: > I will try to read the thread and make a proposal for a more > carefull implementation - just not today... I think the results > would be interesting... FWIW, the SSI patch that Dan and I are working on can't have a guarantee that it is immediately safe to retry a transaction which rolls back with a serialization failure (without potentially failing again on exactly the same set of transactions) unless there is a capability such as this "Idle in transaction cancellation" patch would provide. Safe immediate retry would be a nice guarantee for SSI to provide. That being the case, we may not be able to generate the final form of the SSI patch until a patch for this issue is applied. Obviously I know that nobody is in a position to make any promises on this, but I just thought I'd let people know that this issue could possibly be on the critical path to a timely release -- at least if that release will include SSI with the safe retry guarantee. (At least when I'm planning for a release, I like to know such things) -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Database file copy
Thank you very much for reviewing, appreciate the feedback. As pointed out by you, it is always best to test it out with the latest version, so I tested the same approach with Postgres 9.0.2 on Windows just now, and it works! I forgot to mention earlier that in addition to setting vacuum_freeze_table_age to 0, vacuum_freeze_min_age must also be set to 0 to reset xmin with the FrozenXid. And you were spot on with regards to permission issues with roles. I had been testing with the postgres account, which is a superuser, so it always works. After the database files are copied over in the deploy instance, any object that had ownership set to a custom role gets messed up, and logging in as that user gives a permission denied error. But there is an easy fix to this. As the postgres user, I ran the ALTER ... OWNER TO command for every object, followed by the GRANT ALL ON ... TO command for every object, which resolved the permission denied issue. Thanks for pointing this out. Please let me know if you or anyone think of any other potential issues. Thanks again for reviewing. Srini
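The two fixes Srini describes can be sketched as follows; the GUC names are real settings, while the table and role names are illustrative stand-ins:

```sql
-- Before copying the files: force xmin to be rewritten as FrozenXid.
-- Both settings must be 0, as noted above.
SET vacuum_freeze_table_age = 0;
SET vacuum_freeze_min_age = 0;
VACUUM;

-- After the copy, as a superuser, repair ownership and privileges
-- for every affected object (one pair of statements per object):
ALTER TABLE some_table OWNER TO custom_role;
GRANT ALL ON some_table TO custom_role;
```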
Re: [HACKERS] proposal : cross-column stats
On 21.12.2010 19:03, Tomas Vondra wrote: > My plan for the near future (a few weeks) is to build a simple 'module' > with the ability to estimate the number of rows for a given condition. > This could actually be quite useful as a stand-alone contrib module, as > the users often ask how to get a number of rows fast (usually for paging). Hm, I've been thinking about where to place this new module - I thought pgfoundry might be the right place, but I've noticed it supports just CVS. Well, that's a rather antiquated SCM, I think - I'm very happy with the comfort provided by Git or SVN. Is there some other place commonly used for pg-related projects? I've been thinking about github or something like that ... And I've finally set up the wiki-page about this effort - it's just a summary of this whole thread. http://wiki.postgresql.org/wiki/Cross_Columns_Stats regards Tomas
Re: [HACKERS] log_hostname and pg_stat_activity
On Thu, Dec 23, 2010 at 22:09, Peter Eisentraut wrote: > Somehow I fantasized that log_hostname would also turn > pg_stat_activity.client_addr into names instead of IP addresses. It > doesn't, but perhaps it should? It would be nice to be able to > configure an IP-address free setup. Or would it be worth having a > separate configuration parameter for that? It should certainly be renamed to something else if it does both, but I don't see the point of having two separate parameters between them. As long as you can use a cached version of the lookup, you're only paying the price once, after all... However, pg_stat_activity.client_addr is an inet field, not a text string, so you'd have to invent a separate field for it I think... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[HACKERS] log_hostname and pg_stat_activity
Somehow I fantasized that log_hostname would also turn pg_stat_activity.client_addr into names instead of IP addresses. It doesn't, but perhaps it should? It would be nice to be able to configure an IP-address free setup. Or would it be worth having a separate configuration parameter for that?
Re: [HACKERS] Streaming replication as a separate permissions
On 12/23/2010 08:59 PM, Magnus Hagander wrote: On Thu, Dec 23, 2010 at 16:57, Robert Haas wrote: On Thu, Dec 23, 2010 at 10:54 AM, Tom Lane wrote: Robert Haas writes: I haven't looked at the patch yet, but I think we should continue to allow superuser-ness to be *sufficient* for replication - i.e. superusers will automatically have the replication privilege just as they do any other - and merely allow this as an option for when you want to avoid doing it that way. I don't particularly mind breaking that. If we leave it as-is, we'll be encouraging people to use superuser accounts for things that don't need that, which can't be good from a security standpoint. And if we break it, we'll be adding an additional, mandatory step to make replication work that isn't required today. You might think that's OK, but I think the majority opinion is that it's already excessively complex. Most of the people I run across in the real world are rather surprised how *easy* it is to set up, and not how complex. And tbh, the only complexity complaints I've heard there are about the requirement to start/backup/stop to get it up and running. I've always told everybody to create a separate account to do it, and not heard a single comment about that. I agree - the people I talked to are fairly surprised that we don't use a dedicated replication role, and are surprised at the complexity of actually initializing the replication (mostly the "we cannot do a base backup over the replication connection" misfeature). Stefan
Re: [HACKERS] Streaming replication as a separate permissions
Magnus Hagander writes: > On Thu, Dec 23, 2010 at 16:57, Robert Haas wrote: >> On Thu, Dec 23, 2010 at 10:54 AM, Tom Lane wrote: >>> I don't particularly mind breaking that. If we leave it as-is, we'll >>> be encouraging people to use superuser accounts for things that don't >>> need that, which can't be good from a security standpoint. >> And if we break it, we'll be adding an additional, mandatory step to >> make replication work that isn't required today. You might think >> that's OK, but I think the majority opinion is that it's already >> excessively complex. > Most of the people I run across in the real world are rather surprised > how *easy* it is to set up, and not how complex. And tbh, the only > complexity complaints I've heard there are about the requirement to > start/backup/stop to get it up and running. I've always told everybody > to create a separate account to do it, and not heard a single comment > about that. FWIW, it seems unreasonable to me to expect that we will not be breaking any part of a 9.0 replication configuration over the next release or two. We *knew* we were shipping a rough version that would require refinements, and this is one of the planned refinements. > That said, how about a compromise in that we add the replication flag > by default to the initial superuser when it's created? That way, it's > at least possible to remove it if you want to. Would that address your > complexity concern? It does nothing to address my security concern. I want to discourage people from using superuser accounts for this, full stop. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] WIP: plpgsql - foreach in
Hello, the attached patch contains an implementation of iteration over an array: Regards Pavel Stehule *** ./doc/src/sgml/plpgsql.sgml.orig 2010-12-09 08:20:08.0 +0100 --- ./doc/src/sgml/plpgsql.sgml 2010-12-23 21:05:51.459678695 +0100 *** *** 2238,2243 --- 2238,2268 + + Looping Through Array + + <
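For reference, the iteration form this patch proposes ended up in PostgreSQL 9.1 as FOREACH ... IN ARRAY; a minimal sketch of the usage:

```sql
-- Minimal sketch of PL/pgSQL array iteration using the
-- FOREACH ... IN ARRAY form (as it shipped in PostgreSQL 9.1).
DO $$
DECLARE
    x     int;
    total int := 0;
BEGIN
    FOREACH x IN ARRAY ARRAY[1, 2, 3, 4] LOOP
        total := total + x;
    END LOOP;
    RAISE NOTICE 'total = %', total;  -- NOTICE: total = 10
END
$$;
```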
Re: [HACKERS] Streaming replication as a separate permissions
Magnus Hagander writes: > On Thu, Dec 23, 2010 at 16:15, Tom Lane wrote: >> I think only superusers should be allowed to change the flag. > That was certainly not intentional - and doesn't work that way for me > at least, unless I broke it right before I submitted it. > oh hang on.. Yeah, it's allowing anybody *that has CREATE ROLE* > privilege to do it, I think. And I agree that's wrong and should be > fixed. But I can't see it allowing anybody at all to do it - am I > misreading the code? Ah, sorry, yeah there are probably CREATE ROLE privilege checks somewhere upstream of here. I was expecting to see a privilege check added by the patch itself, and did not, so I complained. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
Hi Ken, Thanks for your tips! Yes it is the case, and I run another query sorting on the second column whose values are random. postgres=# explain analyze select * from big_wf order by id; QUERY PLAN - Sort (cost=565525.45..575775.45 rows=410 width=8) (actual time=25681.875..36458.824 rows=410 loops=1) Sort Key: id Sort Method: external merge Disk: 72048kB -> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) (actual time=8.595..5569.500 rows=410 loops=1) Now the sorting takes about 20 seconds, so it seems reasonable compared to 30 seconds, right? But one thing I'm confused is that, why is additional comparison really so expensive? Does it incur additional I/O? From the cost model, it seems not, all the "cost" are the same (575775.45). Thanks, Li Jie - Original Message - From: "Kenneth Marshall" To: "Jie Li" Cc: "pgsql-hackers" Sent: Thursday, December 23, 2010 10:03 PM Subject: Re: [HACKERS] Why is sorting on two columns so slower than sortingon one column? > On Thu, Dec 23, 2010 at 02:33:12AM -0500, Jie Li wrote: >> Hi, >> >> Here is the test table, >> >> postgres=# \d big_wf >> Table "public.big_wf" >> Column | Type | Modifiers >> +-+--- >> age| integer | >> id | integer | >> >> postgres=# \dt+ big_wf >> List of relations >> Schema | Name | Type | Owner | Size | Description >> ++---+--++- >> public | big_wf | table | workshop | 142 MB | >> >> >> The first query sorting on one column: >> postgres=# explain analyze select * from big_wf order by age; >>QUERY >> PLAN >> - >> Sort (cost=565525.45..575775.45 rows=410 width=8) (actual >> time=11228.155..16427.149 rows=410 loops=1) >>Sort Key: age >>Sort Method: external sort Disk: 72112kB >>-> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) >> (actual time=6.196..4797.620 rows=410 loops=1) >> Total runtime: 19530.452 ms >> (5 rows) >> >> The second query sorting on two columns: >> postgres=# explain analyze select * from big_wf order by age,id; >>QUERY >> PLAN >> - >> Sort (cost=565525.45..575775.45 
rows=410 width=8) (actual >> time=37544.779..48206.702 rows=410 loops=1) >>Sort Key: age, id >>Sort Method: external merge Disk: 72048kB >>-> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) >> (actual time=6.796..5518.663 rows=410 loops=1) >> Total runtime: 51258.000 ms >> (5 rows) >> >> The verision is 9.0.1 and the work_mem is 20MB. One special thing is, the >> first column(age) of all the tuples are of the same value, so the second >> column(id) is always needed for comparison. While the first sorting takes >> about only 6 seconds, the second one takes over 30 seconds, Is this too >> much than expected? Is there any possible optimization ? >> >> Thanks, >> Li Jie > > Hi Li, > > If I understand your description, in the first query the sort does > not actually have to do anything because the column values for "age" > are all degenerate. In the second query, you actually need to sort > the values which is why it takes longer. If the first column values > are the same, then simply sorting by "id" alone would be faster. > You could also bump up work_mem for the query to perform the sort > in memory. > > Regards, > Ken > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
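Ken's work_mem suggestion from the quoted reply is easy to test; the value below is illustrative and merely needs to exceed the ~72MB the plans above spill to disk:

```sql
-- Raise the per-sort memory budget for this session only, so the
-- two-key sort can run in memory instead of as an external merge.
SET work_mem = '128MB';
EXPLAIN ANALYZE SELECT * FROM big_wf ORDER BY age, id;
-- A fully in-memory sort reports "Sort Method: quicksort Memory: ..."
-- rather than "external merge Disk: ...".
```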
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
Hi Marti, Thanks for your help! I guess I understand what you mean, a clustered index will make sorting as cheap as a seq scan, right? But what I meant is, is there any potential optimization for the backend implementation? Intuitively, if sorting on one column or two columns will incur the same I/O costs, why should there be so much difference? Thanks, Li Jie - Original Message - From: "Marti Raudsepp" To: "Jie Li" Cc: "pgsql-hackers" Sent: Thursday, December 23, 2010 10:17 PM Subject: Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column? On Thu, Dec 23, 2010 at 09:33, Jie Li wrote: > While the first sorting takes > about only 6 seconds, the second one takes over 30 seconds, Is this too > much than expected? Is there any possible optimization ? If you're doing these queries often, you should: CREATE INDEX ix_big_wf_age_id ON big_wf (age, id) If that's still not fast enough, you can physically sort rows in the table using the newly created index: CLUSTER big_wf USING ix_big_wf_age_id; Please post back your results. :) Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
- Original Message - From: "Kenneth Marshall" To: "Li Jie" Cc: "pgsql-hackers" Sent: Thursday, December 23, 2010 10:30 PM Subject: Re: [HACKERS] Why is sorting on two columns so slower thansortingon one column? > On Thu, Dec 23, 2010 at 10:19:46PM +0800, Li Jie wrote: >> Hi Ken, >> >> Thanks for your tips! Yes it is the case, and I run another query sorting on >> the second column whose values are random. >> >> postgres=# explain analyze select * from big_wf order by id; >> QUERY PLAN >> >> - >> Sort (cost=565525.45..575775.45 rows=410 width=8) (actual >> time=25681.875..36458.824 rows=410 loops=1) >> Sort Key: id >> Sort Method: external merge Disk: 72048kB >> -> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) (actual >> time=8.595..5569.500 rows=410 loops=1) >> >> Now the sorting takes about 20 seconds, so it seems reasonable compared to >> 30 seconds, right? But one thing I'm confused is that, why is additional >> comparison really so expensive? Does it incur additional I/O? From the cost >> model, it seems not, all the "cost" are the same (575775.45). >> >> Thanks, >> Li Jie > > In the first query, the cost is basically the I/O cost to read the > table from disk. The actual sort does not do anything since the > sort values are the same. In the second query, the sort has to > swap things in memory/disk to get them in the correct order for > the result. This actually takes CPU and possibly additional I/O > which is why it is slower. In the case of sorting by just the "id" > column, the size of the sorted values is smaller which would need > fewer batches to complete the sort since the sort is bigger than > the work_mem. > > Cheers, > Ken Hi Ken, Thanks for your analysis. But in the last query that sorts on "id", since the query selects all the columns for output, the actual sorted size is the same, and the only difference is the comparison cost. The query sorting on two columns needs to do twice the comparison. Am I right? 
Thanks, Li Jie
Re: [HACKERS] Streaming replication as a separate permissions
On Thu, Dec 23, 2010 at 16:57, Robert Haas wrote: > On Thu, Dec 23, 2010 at 10:54 AM, Tom Lane wrote: >> Robert Haas writes: >>> I haven't looked at the patch yet, but I think we should continue to >>> allow superuser-ness to be *sufficient* for replication - i.e. >>> superusers will automatically have the replication privilege just as >>> they do any other - and merely allow this as an option for when you >>> want to avoid doing it that way. >> >> I don't particularly mind breaking that. If we leave it as-is, we'll >> be encouraging people to use superuser accounts for things that don't >> need that, which can't be good from a security standpoint. > > And if we break it, we'll be adding an additional, mandatory step to > make replication work that isn't required today. You might think > that's OK, but I think the majority opinion is that it's already > excessively complex. Most of the people I run across in the real world are rather surprised how *easy* it is to set up, and not how complex. And tbh, the only complexity complaints I've heard there are about the requirement to start/backup/stop to get it up and running. I've always told everybody to create a separate account to do it, and not heard a single comment about that. That said, how about a compromise in that we add the replication flag by default to the initial superuser when it's created? That way, it's at least possible to remove it if you want to. Would that address your complexity concern? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
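Magnus's compromise would leave the flag in the administrator's hands; with the attribute syntax that eventually shipped in 9.1 (assumed here for illustration), opting out of the default would look like:

```sql
-- If the bootstrap superuser gets the replication flag by default,
-- a stricter site can simply take it away again:
ALTER ROLE postgres WITH NOREPLICATION;

-- ...and delegate replication to a dedicated non-superuser role:
CREATE ROLE replicator WITH LOGIN REPLICATION;
```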
Re: [HACKERS] Streaming replication as a separate permissions
On Thu, Dec 23, 2010 at 16:15, Tom Lane wrote: > Magnus Hagander writes: >> Here's a patch that changes walsender to require a special privilege >> for replication instead of relying on superuser permissions. We >> discussed this back before 9.0 was finalized, but IIRC we ran out of >> time. The motivation being that you really want to use superuser as >> little as possible - and since being a replication slave is a read >> only role, it shouldn't require the maximum permission available in >> the system. > > Maybe it needn't require "max" permissions, but one of the motivations > for requiring superusernesss was to prevent Joe User from sucking every > last byte of data out of your database (and into someplace he could > examine it at leisure). This patch opens that barn door wide, because > so far as I can see, it allows anybody at all to grant the replication > privilege ... or revoke it, thereby breaking your replication setup. > I think only superusers should be allowed to change the flag. That was certainly not intentional - and doesn't work that way for me at least, unless I broke it right before I submitted it. oh hang on.. Yeah, it's allowing anybody *that has CREATE ROLE* privilege to do it, I think. And I agree that's wrong and should be fixed. But I can't see it allowing anybody at all to do it - am I misreading the code? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
On 23.12.2010 20:09, Robert Haas wrote: > 2010/12/23 Tomas Vondra : >> On 20.12.2010 00:03, Tom Lane wrote: >>> I wrote: That is not the number of interest. The number of interest is that it's 8 bytes added onto a struct that currently contains 11 of 'em; in other words a 9% increase in the size of the stats file, and consequently about a 9% increase in the cost of updating it. >>> >>> Wups, sorry, I was looking at the wrong struct. It's actually an >>> addition of 1 doubleword to a struct of 21 of 'em, or about 5%. >>> That's still an awful lot in comparison to the prospective usefulness >>> of the information. >>> >>> regards, tom lane >>> >> >> OK, so here goes the simplified patch - it tracks one reset timestamp >> for a background writer and for each database. > > I think you forgot the attachment. Yes, I did. Thanks! Tomas From da2154954a547c143cd0d130597411911f339c07 Mon Sep 17 00:00:00 2001 From: vampire Date: Thu, 23 Dec 2010 18:40:48 +0100 Subject: [PATCH] Track timestamp of the last stats reset (for each database and a background writer). This is a simplified version of the previous patch, that kept a timestamp for each database, table, index and function and for the background writer. This does not keep the timestamp for individual objects, but if stats for a table/index/function are reset, the timestamp for the whole database is updated.
--- doc/src/sgml/monitoring.sgml | 18 ++ src/backend/catalog/system_views.sql |6 -- src/backend/postmaster/pgstat.c | 13 + src/backend/utils/adt/pgstatfuncs.c | 26 ++ src/include/catalog/pg_proc.h|4 src/include/pgstat.h |5 + 6 files changed, 70 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 5fd0213..99e091d 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -600,6 +600,15 @@ postgres: user database host + pg_stat_get_db_stat_reset_time(oid) + timestamptz + + Time of the last reset of statistics for this database (as a result of executing + pg_stat_reset function) or any object within it (table or index). + + + + pg_stat_get_numscans(oid) bigint @@ -1062,6 +1071,15 @@ postgres: user database host bgwriter_lru_maxpages parameter + + + pg_stat_get_bgwriter_stat_reset_time() + timestamptz + +Time of the last reset of statistics for the background writer (as a result of executing +pg_stat_reset_shared('bgwriter')) + + pg_stat_get_buf_written_backend() diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 346eaaf..da92438 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -502,7 +502,8 @@ CREATE VIEW pg_stat_database AS pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(D.oid) AS tup_updated, -pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted +pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted, + pg_stat_get_db_stat_reset_time(D.oid) AS stats_reset FROM pg_database D; CREATE VIEW pg_stat_user_functions AS @@ -538,7 +539,8 @@ CREATE VIEW pg_stat_bgwriter AS pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean, pg_stat_get_buf_written_backend() AS buffers_backend, pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync, -pg_stat_get_buf_alloc() AS buffers_alloc; +pg_stat_get_buf_alloc() AS 
buffers_alloc, + pg_stat_get_bgwriter_stat_reset_time() AS stats_reset; CREATE VIEW pg_user_mappings AS SELECT diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 856daa7..66e0b69 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -3130,6 +3130,8 @@ pgstat_get_db_entry(Oid databaseid, bool create) result->n_tuples_deleted = 0; result->last_autovac_time = 0; + result->stat_reset_timestamp = GetCurrentTimestamp(); + memset(&hash_ctl, 0, sizeof(hash_ctl)); hash_ctl.keysize = sizeof(Oid); hash_ctl.entrysize = sizeof(PgStat_StatTabEntry); @@ -3408,6 +3410,12 @@ pgstat_read_statsfile(Oid onlydb, bool permanent) * load an existing statsfile. */ memset(&globalStats, 0, sizeof(globalStats)); + + /* +* Set the current timestamp (will be kept only in case we can't load an +* existing statsfile. +*/ + globalStats.stat_reset_timestamp = GetCurrentTimestamp(); /* * Try to open the status file. I
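Judging from the view definitions the patch changes, the new timestamps would be queried like this once the patch is applied:

```sql
-- Per-database timestamp of the last statistics reset
-- (column added to pg_stat_database by the patch):
SELECT datname, stats_reset FROM pg_stat_database;

-- Background writer's last reset (added to pg_stat_bgwriter):
SELECT stats_reset FROM pg_stat_bgwriter;
```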
Re: [HACKERS] proposal : cross-column stats
On 21.12.2010 16:54, Florian Pflug wrote: > I think that maybe it'd be acceptable to scan a large portion of the > table to estimate dist(A,B), *if* we must only do so only once in a while. > But even with > a full scan, how would we arrive at an estimate for dist(A,B)? Keeping all > values in memory, > and spilling them into, say, an on-disk hash table adds even more overhead to > the already > expensive full scan. Maybe using a bloom filter instead of a hash table could > avoid > the spilling to disk, in exchange for a slightly less precise result... I've read some basics about Bloom filters, and I'm not sure we can use them in this case. I see two problems with this approach:

1) impossibility to predict the number of elements in advance

To build a Bloom filter with a limited error rate, you need to size it properly (number of hash functions and size of the bit array). But that's exactly the information we're looking for. I guess we could use the highest possible value (equal to the number of tuples) - according to wiki you need about 10 bits per element with 1% error, i.e. about 1.2 MB of memory for each million of elements. That's not much but this needs to be done for each column separately and for the combination - for N columns you need (at least) N+1 filters. Sure - increasing the error rate and using a different estimate to build the bloom filter would significantly decrease the memory requirements.

2) sampling the whole table

A naive approach would be to sample the whole table each time, but for large tables that might be a bit of a problem. So I'm thinking about what alternatives are there ... One possibility is to build the Bloom filter incrementally and store it on disk, or something like that. I guess this could be done during VACUUM or something like that. Anyway there's an issue how to set the filter size initially (estimate the number of elements), just as in the previous section.
Another possibility is to collect the data from just a small portion of a table and then use the result to estimate the number of distinct values for the whole table. But I'm not sure we can do this reliably, I see many traps in this. But maybe I'm just missing something important. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
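The sizing rule of thumb quoted above follows from the standard Bloom filter formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m/n) ln 2 hash functions. A quick sketch (not part of any patch, purely for reference):

```python
import math

def bloom_size(n, p):
    """Optimal Bloom filter parameters for n elements at false-positive rate p."""
    m = math.ceil(-n * math.log(p) / math.log(2) ** 2)  # bits in the array
    k = round(m / n * math.log(2))                      # number of hash functions
    return m, k

# One million distinct values at a 1% false-positive rate:
m, k = bloom_size(1_000_000, 0.01)
bits_per_element = m / 1_000_000  # ~9.6 bits/element, i.e. roughly 1.2 MB per million elements
```

For a 10% error rate the same formulas give roughly half the bits per element, which is the "increasing the error rate decreases the memory requirements" trade-off mentioned in the mail.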
Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
2010/12/23 Tomas Vondra : > Dne 20.12.2010 00:03, Tom Lane napsal(a): >> I wrote: >>> That is not the number of interest. The number of interest is that it's >>> 8 bytes added onto a struct that currently contains 11 of 'em; in other >>> words a 9% increase in the size of the stats file, and consequently >>> about a 9% increase in the cost of updating it. >> >> Wups, sorry, I was looking at the wrong struct. It's actually an >> addition of 1 doubleword to a struct of 21 of 'em, or about 5%. >> That's still an awful lot in comparison to the prospective usefulness >> of the information. >> >> regards, tom lane >> > > OK, so here goes the simplified patch - it tracks one reset timestamp > for a background writer and for each database. I think you forgot the attachment. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
Dne 20.12.2010 00:03, Tom Lane napsal(a): > I wrote: >> That is not the number of interest. The number of interest is that it's >> 8 bytes added onto a struct that currently contains 11 of 'em; in other >> words a 9% increase in the size of the stats file, and consequently >> about a 9% increase in the cost of updating it. > > Wups, sorry, I was looking at the wrong struct. It's actually an > addition of 1 doubleword to a struct of 21 of 'em, or about 5%. > That's still an awful lot in comparison to the prospective usefulness > of the information. > > regards, tom lane > OK, so here goes the simplified patch - it tracks one reset timestamp for a background writer and for each database. regards Tomas PS: I've noticed Magnus posted a patch to track recovery conflicts, adding a new view pg_stat_database_conflicts. I have not checked it yet but it should not influence this patch. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Database file copy
On Wed, Dec 22, 2010 at 7:35 PM, Srini Raghavan wrote:
> I have tested this and it works, and I am continuing to test it more. I
> would like for validation of this idea from the experts and the community to
> make sure I haven't overlooked something obvious that might cause issues.

Interesting idea. It seems like it might be possible to make this work. One obvious thing to watch out for is object ownership information. Roles are stored in pg_authid, which is a shared catalog, so if you're unlucky you could manage to create a database containing one or more objects that are owned by a role ID that doesn't exist in pg_authid, which will probably break things all over the place. There could be other pitfalls as well, but that's the only one that's obvious to me off the top of my head... I would strongly recommend basing this on the latest minor release of PostgreSQL 9.0 rather than an outdated minor release of PostgreSQL 8.4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Streaming replication as a separate permissions
On Thu, Dec 23, 2010 at 10:54 AM, Tom Lane wrote: > Robert Haas writes: >> I haven't looked at the patch yet, but I think we should continue to >> allow superuser-ness to be *sufficient* for replication - i.e. >> superusers will automatically have the replication privilege just as >> they do any other - and merely allow this as an option for when you >> want to avoid doing it that way. > > I don't particularly mind breaking that. If we leave it as-is, we'll > be encouraging people to use superuser accounts for things that don't > need that, which can't be good from a security standpoint. And if we break it, we'll be adding an additional, mandatory step to make replication work that isn't required today. You might think that's OK, but I think the majority opinion is that it's already excessively complex. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Streaming replication as a separate permissions
Robert Haas writes: > I haven't looked at the patch yet, but I think we should continue to > allow superuser-ness to be *sufficient* for replication - i.e. > superusers will automatically have the replication privilege just as > they do any other - and merely allow this as an option for when you > want to avoid doing it that way. I don't particularly mind breaking that. If we leave it as-is, we'll be encouraging people to use superuser accounts for things that don't need that, which can't be good from a security standpoint. BTW, is it possible to set things up so that a REPLICATION account can be NOLOGIN, thereby making it really hard to abuse for other purposes? Or does the login privilege check come too soon? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming replication as a separate permissions
On Thu, Dec 23, 2010 at 10:15 AM, Tom Lane wrote: > Magnus Hagander writes: >> Here's a patch that changes walsender to require a special privilege >> for replication instead of relying on superuser permissions. We >> discussed this back before 9.0 was finalized, but IIRC we ran out of >> time. The motivation being that you really want to use superuser as >> little as possible - and since being a replication slave is a read >> only role, it shouldn't require the maximum permission available in >> the system. > > Maybe it needn't require "max" permissions, but one of the motivations > for requiring superusernesss was to prevent Joe User from sucking every > last byte of data out of your database (and into someplace he could > examine it at leisure). This patch opens that barn door wide, because > so far as I can see, it allows anybody at all to grant the replication > privilege ... or revoke it, thereby breaking your replication setup. > I think only superusers should be allowed to change the flag. I haven't looked at the patch yet, but I think we should continue to allow superuser-ness to be *sufficient* for replication - i.e. superusers will automatically have the replication privilege just as they do any other - and merely allow this as an option for when you want to avoid doing it that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
Kenneth Marshall writes: > On Thu, Dec 23, 2010 at 10:42:26PM +0800, Li Jie wrote: >> But in the last query that sorts on "id", since the query selects all the >> columns for output, the actual sorted size is the same, and the only >> difference is the comparison cost. The query sorting on two columns needs to >> do twice the comparison. Am I right? > I think you are right. Sorry for the confusion. I doubt the cost of comparing two integers is the issue here; rather it's more likely one of how many merge passes were needed. You could find out instead of just speculating by turning on trace_sort and comparing the log outputs. regards, tom lane
Re: [HACKERS] Patch BUG #5103: "pg_ctl -w (re)start" fails with custom unix_socket_directory
On Wed, 22 Dec 2010 21:02:35 -0500 (EST) Bruce Momjian wrote: > Alvaro Herrera wrote: > > Excerpts from Quan Zongliang's message of mar dic 21 18:36:11 -0300 2010: > > > On Mon, 29 Nov 2010 10:29:17 -0300 > > > Alvaro Herrera wrote: > > > > > > > > > I think the way this should work is that you call postmaster with a new > > > > switch and it prints out its configuration, after reading the > > > > appropriate config file(s). That way it handles all the little details > > > > such as figuring out the correct config file, hadle include files, etc. > > > > This output would be presumably easier to parse and more trustworthy. > > > > > > Sorry for my late reply. > > > > > > I will check the source of postmaster. > > > > Actually Bruce Momjian is now working on a different fix: > > unix_socket_directory would be added to postmaster.pid, allowing pg_ctl > > to find it. > > Yes, I will apply this patch tomorrow and it will be in 9.1. Thanks for > the report. > Nice work. -- Quan Zongliang -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming replication as a separate permissions
Magnus Hagander writes:
> Here's a patch that changes walsender to require a special privilege
> for replication instead of relying on superuser permissions. We
> discussed this back before 9.0 was finalized, but IIRC we ran out of
> time. The motivation being that you really want to use superuser as
> little as possible - and since being a replication slave is a read
> only role, it shouldn't require the maximum permission available in
> the system.

Maybe it needn't require "max" permissions, but one of the motivations for requiring superuserness was to prevent Joe User from sucking every last byte of data out of your database (and into someplace he could examine it at leisure). This patch opens that barn door wide, because so far as I can see, it allows anybody at all to grant the replication privilege ... or revoke it, thereby breaking your replication setup. I think only superusers should be allowed to change the flag. regards, tom lane
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
On Thu, Dec 23, 2010 at 10:42:26PM +0800, Li Jie wrote: > - Original Message - > From: "Kenneth Marshall" > To: "Li Jie" > Cc: "pgsql-hackers" > Sent: Thursday, December 23, 2010 10:30 PM > Subject: Re: [HACKERS] Why is sorting on two columns so slower thansortingon > one column? > > > > On Thu, Dec 23, 2010 at 10:19:46PM +0800, Li Jie wrote: > >> Hi Ken, > >> > >> Thanks for your tips! Yes it is the case, and I run another query sorting > >> on the second column whose values are random. > >> > >> postgres=# explain analyze select * from big_wf order by id; > >> QUERY PLAN > >> > >> - > >> Sort (cost=565525.45..575775.45 rows=410 width=8) (actual > >> time=25681.875..36458.824 rows=410 loops=1) > >> Sort Key: id > >> Sort Method: external merge Disk: 72048kB > >> -> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) > >> (actual time=8.595..5569.500 rows=410 loops=1) > >> > >> Now the sorting takes about 20 seconds, so it seems reasonable compared to > >> 30 seconds, right? But one thing I'm confused is that, why is additional > >> comparison really so expensive? Does it incur additional I/O? From the > >> cost model, it seems not, all the "cost" are the same (575775.45). > >> > >> Thanks, > >> Li Jie > > > > In the first query, the cost is basically the I/O cost to read the > > table from disk. The actual sort does not do anything since the > > sort values are the same. In the second query, the sort has to > > swap things in memory/disk to get them in the correct order for > > the result. This actually takes CPU and possibly additional I/O > > which is why it is slower. In the case of sorting by just the "id" > > column, the size of the sorted values is smaller which would need > > fewer batches to complete the sort since the sort is bigger than > > the work_mem. > > > > Cheers, > > Ken > > Hi Ken, > > Thanks for your analysis. 
> > But in the last query that sorts on "id", since the query selects all the > columns for output, the actual sorted size is the same, and the only > difference is the comparison cost. The query sorting on two columns needs to > do twice the comparison. Am I right? > > Thanks, > Li Jie I think you are right. Sorry for the confusion. Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pl/python custom exceptions for SPI
Here's a patch implementing custom Python exceptions for SPI errors mentioned in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's an incremental patch on top of the explicit-subxacts patch sent earlier. Git branch for this patch: https://github.com/wulczer/postgres/tree/custom-spi-exceptions. What the patch does is provide a Python exception for each error defined in utils/errcodes.h, and then raise the corresponding exception when a SPI call fails. The parsing of errcodes.h in the Makefile is a little grotty and would probably have to be ported to the Windows build system, which I have no idea about. With this patch you can do:

from plpy import spiexceptions
try:
    plpy.execute("insert into t values (4)")
except spiexceptions.UniqueViolation:
    plpy.notice("unique constraint violation")
except spiexceptions.NotNullViolation:
    plpy.notice("not null constraint violation")

All exceptions inherit from plpy.SPIError, so code that just catches a blanket SPIError will continue to work. The patch lacks user-facing docs; if it gets accepted I'll write some. Not sure if we should provide a table similar to http://www.postgresql.org/docs/current/static/errcodes-appendix.html, or just refer to that table and state that the rule is changing underscores to camel case... Also, I just realised that this patch does not really depend on the explicit-subxacts change, but rather only on spi-in-subxacts, so if need be I can generate it as an incremental changeset over the latter and not the former. Cheers, Jan diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile index 33dddc6..dd5b445 100644 *** a/src/pl/plpython/Makefile --- b/src/pl/plpython/Makefile *** rpathdir = $(python_libdir) *** 38,44 NAME = plpython$(python_majorversion) OBJS = plpython.o ! # Python on win32 ships with import libraries only for Microsoft Visual C++, # which are not compatible with mingw gcc. 
Therefore we need to build a --- 38,44 NAME = plpython$(python_majorversion) OBJS = plpython.o ! SPIEXCEPTIONS = spiexceptions.h # Python on win32 ships with import libraries only for Microsoft Visual C++, # which are not compatible with mingw gcc. Therefore we need to build a *** PSQLDIR = $(bindir) *** 86,93 include $(top_srcdir)/src/Makefile.shlib ! all: all-lib install: all installdirs install-lib ifeq ($(python_majorversion),2) --- 86,113 include $(top_srcdir)/src/Makefile.shlib + # A quite horrendous sed, but does the job. The steps are, in order: + # 1. Remove everything up to the line with Class 03. We only generate + #exceptions for errors, not for warnings or notices + # 2. Remove lines that don't define an error code + # 3. Change ERRCODE_XXX into { "spiexceptions.ERRCODE_XY_Z, "XY_Z", ERRCODE_XY_Z }, + # 4. Leave an uppercase letter after a dot or a quote, change the rest + #into lowercase thus giving us + #{ "spiexceptions.Errcode_xy_z, "Xy_z", ERRCODE_XY_Z }, + # 5. change lowercase letters after an underscore into uppercase, giving us + #{ "spiexceptions.ErrcodeXyZ, "XyZ", ERRCODE_XY_Z }, + gen-spiexceptions: + echo "/* autogenerated from utils/errcodes.h, do not edit */" > $(SPIEXCEPTIONS) + sed -e '1,/Class 03/ d' \ + -e '/^#define ERRCODE_.*MAKE_SQLSTATE/! d' \ + -e 's|#define ERRCODE_\([^\t ]*\).*|{ "spiexceptions.\1", "\1", ERRCODE_\1 },|' \ + -e 's|\(["\.]\)\([A-Z]\)\([^"]*\)|\1\2\L\3|g' \ + -e 's|_\([a-z]\)|\u\1|g' \ + $(top_srcdir)/src/include/utils/errcodes.h >> $(SPIEXCEPTIONS) ! .PHONY: gen-spiexceptions ! ! 
all: gen-spiexceptions all-lib install: all installdirs install-lib ifeq ($(python_majorversion),2) *** clean distclean maintainer-clean: clean- *** 138,143 --- 158,164 rm -f $(OBJS) rm -rf results rm -f regression.diffs regression.out + rm -f $(SPIEXCEPTIONS) ifeq ($(PORTNAME), win32) rm -f python${pytverstr}.def endif diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out index 7fc8337..718ebce 100644 *** a/src/pl/plpython/expected/plpython_error.out --- b/src/pl/plpython/expected/plpython_error.out *** CREATE FUNCTION sql_syntax_error() RETUR *** 8,14 'plpy.execute("syntax error")' LANGUAGE plpythonu; SELECT sql_syntax_error(); ! ERROR: plpy.SPIError: syntax error at or near "syntax" CONTEXT: PL/Python function "sql_syntax_error" /* check the handling of uncaught python exceptions */ --- 8,14 'plpy.execute("syntax error")' LANGUAGE plpythonu; SELECT sql_syntax_error(); ! ERROR: spiexceptions.SyntaxError: syntax error at or near "syntax" CONTEXT: PL/Python function "sql_syntax_error" /* check the handling of uncaught python exceptions */ *** CREATE FUNCTION exception_index_invalid_ *** 27,33 return rv[0]' LANGUAGE plpythonu; SEL
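The Makefile's sed pipeline turns each ERRCODE_XY_Z macro into a camel-cased exception name at build time. The same underscore-to-camel-case rule can be sketched in a few lines of Python (illustrative only; the patch does this with sed, not Python):

```python
def errcode_to_exception(errcode):
    """Map an errcodes.h macro name to its spiexceptions class name,
    e.g. ERRCODE_NOT_NULL_VIOLATION -> 'NotNullViolation'."""
    assert errcode.startswith("ERRCODE_")
    words = errcode[len("ERRCODE_"):].split("_")
    # capitalize() uppercases the first letter and lowercases the rest,
    # mirroring the sed \L / \u case-conversion steps
    return "".join(w.capitalize() for w in words)

name = errcode_to_exception("ERRCODE_NOT_NULL_VIOLATION")  # 'NotNullViolation'
```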
[HACKERS] pl/python explicit subtransactions
Here's a patch implementing explicitly starting subtransactions mentioned in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's an incremental patch on top of the spi-in-subxacts patch sent earlier. Git branch for this patch: https://github.com/wulczer/postgres/tree/explicit-subxacts. The idea has been proposed in http://archives.postgresql.org/pgsql-hackers/2010-11/msg00122.php. This patch provides a subtransaction context manager, in the vein of http://www.python.org/dev/peps/pep-0343/. When inside an explicit subtransaction, SPI calls do not start another one, so you pay the subxact start overhead only once, and you get atomic behaviour. For instance this:

with plpy.subxact():
    plpy.execute("insert into t values (1)")
    plpy.execute("insert into t values (2)")
    plpy.execute("ooops")

will not insert any rows into t. Just so you realise it, it *will* raise the exception from the last execute; if you want to continue execution you need to put your with block in a try/except. If the code starts a subtransaction but fails to close it, PL/Python will forcibly roll it back to leave the backend in a clean state. The patch lacks user-facing documentation; I'll add that later if it gets accepted. For more usage examples refer to the unit tests that the patch adds. 
Cheers, Jan diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile index 16d78ae..33dddc6 100644 *** a/src/pl/plpython/Makefile --- b/src/pl/plpython/Makefile *** REGRESS = \ *** 79,84 --- 79,85 plpython_types \ plpython_error \ plpython_unicode \ + plpython_subxact \ plpython_drop # where to find psql for running the tests PSQLDIR = $(bindir) diff --git a/src/pl/plpython/expected/README b/src/pl/plpython/expected/README index f011528..362cc0d 100644 *** a/src/pl/plpython/expected/README --- b/src/pl/plpython/expected/README *** plpython_unicode_2.out Python 2.2 *** 6,8 --- 6,11 plpython_unicode_3.out Python 2.3 through 3.1 plpython_types_3.out Python 3.1 + + plpython_subxact.out Python 2.6 through 3.1 + plpython_subxact_0.out older Pythons that don't have the with statement diff --git a/src/pl/plpython/expected/plpython_subxact.out b/src/pl/plpython/expected/plpython_subxact.out index ...25a5a4b . *** a/src/pl/plpython/expected/plpython_subxact.out --- b/src/pl/plpython/expected/plpython_subxact.out *** *** 0 --- 1,321 + -- test explicit subtransaction starting + /* Test table to see if transactions get properly rolled back + */ + CREATE TABLE subxact_tbl ( + i integer + ); + /* Explicit case for Python <2.6 + */ + CREATE FUNCTION subxact_test(what_error text = NULL) RETURNS text + AS $$ + import sys + subxact = plpy.subxact() + subxact.__enter__() + exc = True + try: + try: + plpy.execute("insert into subxact_tbl values(1)") + plpy.execute("insert into subxact_tbl values(2)") + if what_error == "SPI": + plpy.execute("insert into subxact_tbl values('oops')") + elif what_error == "Python": + plpy.attribute_error + except: + exc = False + subxact.__exit__(*sys.exc_info()) + raise + finally: + if exc: + subxact.__exit__(None, None, None) + $$ LANGUAGE plpythonu; + SELECT subxact_test(); + subxact_test + -- + + (1 row) + + SELECT * FROM subxact_tbl; + i + --- + 1 + 2 + (2 rows) + + TRUNCATE subxact_tbl; + SELECT subxact_test('SPI'); + ERROR: plpy.SPIError: 
invalid input syntax for integer: "oops" + CONTEXT: PL/Python function "subxact_test" + SELECT * FROM subxact_tbl; + i + --- + (0 rows) + + TRUNCATE subxact_tbl; + SELECT subxact_test('Python'); + ERROR: AttributeError: 'module' object has no attribute 'attribute_error' + CONTEXT: PL/Python function "subxact_test" + SELECT * FROM subxact_tbl; + i + --- + (0 rows) + + TRUNCATE subxact_tbl; + /* Context manager case for Python >=2.6 + */ + CREATE FUNCTION subxact_ctx_test(what_error text = NULL) RETURNS text + AS $$ + with plpy.subxact(): + plpy.execute("insert into subxact_tbl values(1)") + plpy.execute("insert into subxact_tbl values(2)") + if what_error == "SPI": + plpy.execute("insert into subxact_tbl values('oops')") + elif what_error == "Python": + plpy.attribute_error + $$ LANGUAGE plpythonu; + SELECT subxact_ctx_test(); + subxact_ctx_test + -- + + (1 row) + + SELECT * FROM subxact_tbl; + i + --- + 1 + 2 + (2 rows) + + TRUNCATE subxact_tbl; + SELECT subxact_ctx_test('SPI'); + ERROR: plpy.SPIError: invalid input syntax for integer: "oops" + CONTEXT: PL/Python function "subxact_ctx_test" + SELECT * FROM subxact_tbl; + i + --- + (0 rows) + + TRUNCATE subxact_tbl; + SELECT subxact_ctx_test('Python'); + ERROR: AttributeError: 'module' object has no attribute 'attribute_error' + CONTEXT: PL/Python function "subxact_ctx_test" + SELECT * FROM subxact_tbl; + i + ---
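The explicit __enter__/__exit__ dance in the pre-2.6 test function above is exactly what the with statement does under the hood (PEP 343). A toy stand-in for plpy.subxact(), with a plain list playing the role of the table, shows the rollback-on-exception shape; everything here is illustrative, not PL/Python's actual implementation:

```python
class Subxact:
    """Toy subtransaction: keep changes on clean exit, roll back on error."""
    def __init__(self, log):
        self.log = log
    def __enter__(self):
        self.mark = len(self.log)  # remember the rollback point
        return self
    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            del self.log[self.mark:]  # discard everything since __enter__
        return False                  # never swallow the exception

log = []
try:
    with Subxact(log):
        log.append(1)
        log.append(2)
        raise ValueError("ooops")   # the whole block is rolled back
except ValueError:
    pass                            # caller continues after the failed subxact
```

As in the patch, the exception still propagates out of the with block; the context manager only guarantees that the partial work is undone.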
Re: [HACKERS] Why is sorting on two columns so slower than sortingon one column?
On Thu, Dec 23, 2010 at 10:19:46PM +0800, Li Jie wrote: > Hi Ken, > > Thanks for your tips! Yes it is the case, and I run another query sorting on > the second column whose values are random. > > postgres=# explain analyze select * from big_wf order by id; > QUERY PLAN > > - > Sort (cost=565525.45..575775.45 rows=410 width=8) (actual > time=25681.875..36458.824 rows=410 loops=1) > Sort Key: id > Sort Method: external merge Disk: 72048kB > -> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) (actual > time=8.595..5569.500 rows=410 loops=1) > > Now the sorting takes about 20 seconds, so it seems reasonable compared to 30 > seconds, right? But one thing I'm confused is that, why is additional > comparison really so expensive? Does it incur additional I/O? From the cost > model, it seems not, all the "cost" are the same (575775.45). > > Thanks, > Li Jie In the first query, the cost is basically the I/O cost to read the table from disk. The actual sort does not do anything since the sort values are the same. In the second query, the sort has to swap things in memory/disk to get them in the correct order for the result. This actually takes CPU and possibly additional I/O which is why it is slower. In the case of sorting by just the "id" column, the size of the sorted values is smaller which would need fewer batches to complete the sort since the sort is bigger than the work_mem. Cheers, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GiST insert algorithm rewrite
On 21.12.2010 20:00, Heikki Linnakangas wrote:
> One final version, with a bug fix wrt. root page split and some cleanup.
> I'm planning to commit this before Christmas. It's a big patch, so review
> would be much appreciated.

Committed. Phew! Review & testing is of course still appreciated, given how big and complicated this was. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
On Thu, Dec 23, 2010 at 09:33, Jie Li wrote: > While the first sorting takes > about only 6 seconds, the second one takes over 30 seconds, Is this too > much than expected? Is there any possible optimization ? If you're doing these queries often, you should: CREATE INDEX ix_big_wf_age_id ON big_wf (age, id) If that's still not fast enough, you can physically sort rows in the table using the newly created index: CLUSTER big_wf USING ix_big_wf_age_id; Please post back your results. :) Regards, Marti
[HACKERS] pl/python custom datatype parsers
Here's a patch implementing custom parsers for data types mentioned in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's an incremental patch on top of the plpython-refactor patch sent earlier. Git branch for this patch: https://github.com/wulczer/postgres/tree/custom-parsers. The idea has been discussed in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01307.php. With that patch, when built with --with-python, the hstore module includes code that adds a GUC called plpython.hstore. This GUC should be set to the full name of the hstore datatype, for instance plpython.hstore = 'public.hstore'. If it is set, the datatype's OID is looked up and hstore sets up a rendezvous variable called PLPYTHON__PARSERS that points to two functions that can convert a hstore Datum to a PyObject and back. PL/Python, on the other hand, when it sees an argument with an unknown type, tries to look up a rendezvous variable using the type's OID, and if it finds one, it uses the parser functions pointed at by that variable. Long story short, it works like so:

LOAD 'hstore';
SET plpython.hstore = 'public.hstore';
CREATE FUNCTION pick_one(h hstore, key text) RETURNS hstore AS $$
return {key: h[key]}
$$ LANGUAGE plpythonu;
SELECT pick_one('a=>3,b=>4', 'b'); -- gives back a hstore 'b=>4'

There's some ugliness with how hstore's Makefile handles building it, and I'm not sure what's needed to make it work with the Windows build system. Also, documentation is missing. It's already usable, but if we decide to commit that, I'll probably need some help with Windows and docs. I first tried to make hstore generate a separate .so with that functionality if --with-python was specified, but couldn't convince the Makefile to do that. So if you configure the tree with --with-python, hstore will link to libpython; maybe that's OK? Cheers, Jan PS: of course, once committed we can add custom parsers for isbn, citext, uuids, cubes, and other weird things. 
J diff --git a/contrib/hstore/Makefile b/contrib/hstore/Makefile index e466b6f..dbeeb89 100644 *** a/contrib/hstore/Makefile --- b/contrib/hstore/Makefile *** top_builddir = ../.. *** 5,12 include $(top_builddir)/src/Makefile.global MODULE_big = hstore OBJS = hstore_io.o hstore_op.o hstore_gist.o hstore_gin.o hstore_compat.o \ ! crc32.o DATA_built = hstore.sql DATA = uninstall_hstore.sql --- 5,21 include $(top_builddir)/src/Makefile.global MODULE_big = hstore + OBJS = hstore_io.o hstore_op.o hstore_gist.o hstore_gin.o hstore_compat.o \ ! hstore_plpython.o crc32.o ! ! ifeq ($(with_python),yes) ! ! PG_CPPFLAGS := -I$(srcdir) -I$(top_builddir)/src/pl/plpython \ ! $(python_includespec) -DHSTORE_PLPYTHON_SUPPORT ! SHLIB_LINK = $(python_libspec) $(python_additional_libs) \ ! $(filter -lintl,$(LIBS)) $(CPPFLAGS) ! endif DATA_built = hstore.sql DATA = uninstall_hstore.sql diff --git a/contrib/hstore/hstore.h b/contrib/hstore/hstore.h index 8906397..6edfc70 100644 *** a/contrib/hstore/hstore.h --- b/contrib/hstore/hstore.h *** extern Pairs *hstoreArrayToPairs(ArrayTy *** 174,179 --- 174,182 #define HStoreExistsAllStrategyNumber 11 #define HStoreOldContainsStrategyNumber 13 /* backwards compatibility */ + /* PL/Python support */ + extern void hstore_plpython_init(void); + /* * defining HSTORE_POLLUTE_NAMESPACE=0 will prevent use of old function names; * for now, we default to on for the benefit of people restoring old dumps diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c index 0d6f0b6..92c8db9 100644 *** a/contrib/hstore/hstore_io.c --- b/contrib/hstore/hstore_io.c *** PG_MODULE_MAGIC; *** 20,25 --- 20,26 /* old names for C functions */ HSTORE_POLLUTE(hstore_from_text, tconvert); + void _PG_init(void); typedef struct { *** hstore_send(PG_FUNCTION_ARGS) *** 1211,1213 --- 1212,1220 PG_RETURN_BYTEA_P(pq_endtypsend(&buf)); } + + void + _PG_init(void) + { + hstore_plpython_init(); + } diff --git a/contrib/hstore/hstore_plpython.c 
b/contrib/hstore/hstore_plpython.c index ...081a33e . *** a/contrib/hstore/hstore_plpython.c --- b/contrib/hstore/hstore_plpython.c *** *** 0 --- 1,249 + /* + * contrib/src/hstore_plpython.c + * + * bidirectional transformation between hstores and Python dictionary objects + */ + + /* Only build if PL/Python support is needed */ + #if defined(HSTORE_PLPYTHON_SUPPORT) + + #if defined(_MSC_VER) && defined(_DEBUG) + /* Python uses #pragma to bring in a non-default libpython on VC++ if + * _DEBUG is defined */ + #undef _DEBUG + /* Also hide away errcode, since we load Python.h before postgres.h */ + #define errcode __msvc_errcode + #include + #undef errcode + #define _DEBUG + #elif defined (_MSC_VER) + #define errcode __msvc_errcode + #include + #undef errcode + #else + #include + #endif + + #include "postgres.h" + #include "utils/guc.h" + #inc
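The rendezvous-variable scheme described above can be modelled as a shared registry keyed by type OID: the extension registers a (to_python, from_python) pair, and the language handler looks the pair up when it meets an argument of a type it doesn't know. This toy sketch is purely illustrative — the names, the string-based hstore parsing, and the OID 16385 are all made up, not the actual C API:

```python
PARSERS = {}  # stands in for the PLPYTHON__PARSERS rendezvous variable

def register_parser(type_oid, to_python, from_python):
    """The extension's side: publish converters for its type OID."""
    PARSERS[type_oid] = (to_python, from_python)

def convert_argument(type_oid, raw):
    """The language handler's side: use a registered parser if one exists."""
    if type_oid in PARSERS:
        return PARSERS[type_oid][0](raw)
    return raw  # fall back to the default (string) conversion

# "hstore" registers converters under a hypothetical OID 16385
register_parser(
    16385,
    lambda s: dict(kv.split("=>") for kv in s.split(",")),
    lambda d: ",".join(f"{k}=>{v}" for k, v in d.items()),
)

h = convert_argument(16385, "a=>3,b=>4")  # a real dict, not a string
```

The point of the indirection is that neither side links against the other: they only have to agree on the registry key, just as hstore and PL/Python only agree on the rendezvous variable's name.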
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
On Thu, Dec 23, 2010 at 02:33:12AM -0500, Jie Li wrote: > Hi, > > Here is the test table, > > postgres=# \d big_wf > Table "public.big_wf" > Column | Type | Modifiers > +-+--- > age| integer | > id | integer | > > postgres=# \dt+ big_wf > List of relations > Schema | Name | Type | Owner | Size | Description > ++---+--++- > public | big_wf | table | workshop | 142 MB | > > > The first query sorting on one column: > postgres=# explain analyze select * from big_wf order by age; >QUERY > PLAN > - > Sort (cost=565525.45..575775.45 rows=410 width=8) (actual > time=11228.155..16427.149 rows=410 loops=1) >Sort Key: age >Sort Method: external sort Disk: 72112kB >-> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) > (actual time=6.196..4797.620 rows=410 loops=1) > Total runtime: 19530.452 ms > (5 rows) > > The second query sorting on two columns: > postgres=# explain analyze select * from big_wf order by age,id; >QUERY > PLAN > - > Sort (cost=565525.45..575775.45 rows=410 width=8) (actual > time=37544.779..48206.702 rows=410 loops=1) >Sort Key: age, id >Sort Method: external merge Disk: 72048kB >-> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) > (actual time=6.796..5518.663 rows=410 loops=1) > Total runtime: 51258.000 ms > (5 rows) > > The verision is 9.0.1 and the work_mem is 20MB. One special thing is, the > first column(age) of all the tuples are of the same value, so the second > column(id) is always needed for comparison. While the first sorting takes > about only 6 seconds, the second one takes over 30 seconds, Is this too > much than expected? Is there any possible optimization ? > > Thanks, > Li Jie Hi Li, If I understand your description, in the first query the sort does not actually have to do anything because the column values for "age" are all degenerate. In the second query, you actually need to sort the values which is why it takes longer. If the first column values are the same, then simply sorting by "id" alone would be faster. 
You could also bump up work_mem for the query to perform the sort in memory. Regards, Ken
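Ken's suggestion can be checked outside the database: when every value of the leading sort key is identical, the second key fully determines the order, so sorting by "id" alone produces the same result while skipping the useless first-field comparison. A quick plain-Python illustration (not PostgreSQL internals):

```python
# Rows of (age, id); every age is the same, mirroring the test table.
rows = [(30, i) for i in (5, 1, 4, 2, 3)]

# ORDER BY age, id
by_both = sorted(rows)

# ORDER BY id alone -- equivalent here, because age never discriminates.
by_id = sorted(rows, key=lambda r: r[1])

assert by_both == by_id
```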
[HACKERS] pl/python table functions
Here's a patch implementing table functions mentioned in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's an incremental patch on top of the plpython-refactor patch sent earlier. Git branch for this patch: https://github.com/wulczer/postgres/tree/table-functions. This allows functions with multiple OUT parameters returning either one or multiple records (RECORD or SETOF RECORD). There's one inconvenience, which is that if you return a record that has fields of composite types, the I/O functions for these types will be looked up on each execution. Changing that would require some juggling of the PL/Python structures, so I just left it at that. Note that returning just the composite type (or a set of them) does cache the I/O funcs. You get the repeated lookups only if the function returns an unnamed record that has a composite field among others, so something like CREATE FUNCTION x(OUT x table_type, OUT y integer) RETURNS RECORD which I think is fairly uncommon. Cheers, Jan diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile index 16d78ae..167393e 100644 *** a/src/pl/plpython/Makefile --- b/src/pl/plpython/Makefile *** REGRESS = \ *** 79,84 --- 79,85 plpython_types \ plpython_error \ plpython_unicode \ + plpython_composite \ plpython_drop # where to find psql for running the tests PSQLDIR = $(bindir) diff --git a/src/pl/plpython/expected/plpython_composite.out b/src/pl/plpython/expected/plpython_composite.out index ...70a4571 . *** a/src/pl/plpython/expected/plpython_composite.out --- b/src/pl/plpython/expected/plpython_composite.out *** *** 0 --- 1,275 + CREATE FUNCTION multiout_simple(OUT i integer, OUT j integer) AS $$ + return (1, 2) + $$ LANGUAGE plpythonu; + SELECT multiout_simple(); + multiout_simple + - + (1,2) + (1 row) + + SELECT * FROM multiout_simple(); + i | j + ---+--- + 1 | 2 + (1 row) + + SELECT i, j + 2 FROM multiout_simple(); + i | ?column? + ---+-- + 1 |4 + (1 row) + + SELECT (multiout_simple()).j + 3; + ?column? 
+ -- + 5 + (1 row) + + CREATE FUNCTION multiout_simple_setof(n integer = 1, OUT integer, OUT integer) RETURNS SETOF record AS $$ + return [(1, 2)] * n + $$ LANGUAGE plpythonu; + SELECT multiout_simple_setof(); + multiout_simple_setof + --- + (1,2) + (1 row) + + SELECT * FROM multiout_simple_setof(); + column1 | column2 + -+- +1 | 2 + (1 row) + + SELECT * FROM multiout_simple_setof(3); + column1 | column2 + -+- +1 | 2 +1 | 2 +1 | 2 + (3 rows) + + CREATE FUNCTION multiout_record_as(typ text, +first text, OUT first text, +second integer, OUT second integer, +retnull boolean) RETURNS record AS $$ + if retnull: + return None + if typ == 'dict': + return { 'first': first, 'second': second, 'additionalfield': 'must not cause trouble' } + elif typ == 'tuple': + return ( first, second ) + elif typ == 'list': + return [ first, second ] + elif typ == 'obj': + class type_record: pass + type_record.first = first + type_record.second = second + return type_record + $$ LANGUAGE plpythonu; + SELECT * from multiout_record_as('dict', 'foo', 1, 'f'); + first | second + ---+ + foo | 1 + (1 row) + + SELECT multiout_record_as('dict', 'foo', 1, 'f'); + multiout_record_as + + (foo,1) + (1 row) + + SELECT *, s IS NULL as snull from multiout_record_as('tuple', 'xxx', NULL, 'f') AS f(f, s); + f | s | snull + -+---+--- + xxx | | t + (1 row) + + SELECT *, f IS NULL as fnull, s IS NULL as snull from multiout_record_as('tuple', 'xxx', 1, 't') AS f(f, s); + f | s | fnull | snull + ---+---+---+--- +| | t | t + (1 row) + + SELECT * from multiout_record_as('obj', NULL, 10, 'f'); + first | second + ---+ +| 10 + (1 row) + + CREATE FUNCTION multiout_setof(n integer, +OUT power_of_2 integer, +OUT length integer) RETURNS SETOF record AS $$ + for i in range(n): + power = 2 ** i + length = plpy.execute("select length('%d')" % power)[0]['length'] + yield power, length + $$ LANGUAGE plpythonu; + SELECT * FROM multiout_setof(3); + power_of_2 | length + + + 1 | 1 + 2 | 1 + 4 | 1 + (3 rows) + + SELECT 
multiout_setof(5); + multiout_setof + + (1,1) + (2,1) + (4,1) + (8,1) + (16,2) + (5 rows) + + CREATE FUNCTION multiout_return_table() RETURNS TABLE (x integer, y text) AS $$ + return [{'x': 4, 'y' :'four'}, + {'x': 7, 'y' :'seven'}, + {'x': 0, 'y' :'zero'}] + $$ LANGUAGE plpythonu; + SELECT * FROM multiout_return_table(); +
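For readers following the regression output, the four accepted return styles ('dict', 'tuple', 'list', 'obj') all have to be flattened into one row of the declared output columns. A rough Python sketch of that normalization follows — the function name and structure are illustrative, not the actual C handler:

```python
def composite_to_tuple(value, colnames):
    """Flatten one of the return styles PL/Python accepts for a
    composite result into a tuple ordered by output column name."""
    if value is None:
        # A NULL composite: every field comes back NULL.
        return tuple(None for _ in colnames)
    if isinstance(value, dict):
        # Extra keys (like 'additionalfield' above) must not cause trouble.
        return tuple(value.get(name) for name in colnames)
    if isinstance(value, (tuple, list)):
        # Positional styles are taken as-is.
        return tuple(value)
    # Fall back to attribute lookup for the 'obj' style.
    return tuple(getattr(value, name, None) for name in colnames)
```

For example, `composite_to_tuple({'first': 'foo', 'second': 1, 'additionalfield': 'x'}, ['first', 'second'])` gives `('foo', 1)`, matching the 'dict' case in the expected output.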
[HACKERS] pl/python tracebacks
Here's a patch implementing traceback support for PL/Python mentioned in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's an incremental patch on top of the plpython-refactor patch sent earlier. Git branch for this patch: https://github.com/wulczer/postgres/tree/tracebacks. It's a variant of http://archives.postgresql.org/pgsql-patches/2006-02/msg00288.php with a few more twists. For errors originating from Python exceptions, the traceback is added as the message detail. The patch tries to mimic Python's traceback.py module behaviour as closely as possible, including interleaving stack frames with source code lines in the detail message. Any Python developer should instantly recognize this kind of error reporting; it looks almost the same as an error in the interactive Python shell. A future optimisation might be not splitting the procedure source each time a traceback is generated, but for now it's probably not the most important scenario to optimise for. Cheers, Jan diff --git a/src/pl/plpython/expected/plpython_do.out b/src/pl/plpython/expected/plpython_do.out index a21b088..fb0f0e5 100644 *** a/src/pl/plpython/expected/plpython_do.out --- b/src/pl/plpython/expected/plpython_do.out *** NOTICE: This is plpythonu. 
*** 3,6 --- 3,9 CONTEXT: PL/Python anonymous code block DO $$ nonsense $$ LANGUAGE plpythonu; ERROR: NameError: global name 'nonsense' is not defined + DETAIL: Traceback (most recent call last): + PL/Python anonymous code block, line 1, in + nonsense CONTEXT: PL/Python anonymous code block diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out index 70890a8..fe8a91f 100644 *** a/src/pl/plpython/expected/plpython_error.out --- b/src/pl/plpython/expected/plpython_error.out *** SELECT sql_syntax_error(); *** 11,16 --- 11,19 WARNING: plpy.SPIError: unrecognized error in PLy_spi_execute_query CONTEXT: PL/Python function "sql_syntax_error" ERROR: plpy.SPIError: syntax error at or near "syntax" + DETAIL: Traceback (most recent call last): + PL/Python function "sql_syntax_error", line 1, in + plpy.execute("syntax error") CONTEXT: PL/Python function "sql_syntax_error" /* check the handling of uncaught python exceptions */ *** CREATE FUNCTION exception_index_invalid( *** 20,25 --- 23,31 LANGUAGE plpythonu; SELECT exception_index_invalid('test'); ERROR: IndexError: list index out of range + DETAIL: Traceback (most recent call last): + PL/Python function "exception_index_invalid", line 1, in + return args[1] CONTEXT: PL/Python function "exception_index_invalid" /* check handling of nested exceptions */ *** SELECT exception_index_invalid_nested(); *** 32,37 --- 38,46 WARNING: plpy.SPIError: unrecognized error in PLy_spi_execute_query CONTEXT: PL/Python function "exception_index_invalid_nested" ERROR: plpy.SPIError: function test5(unknown) does not exist + DETAIL: Traceback (most recent call last): + PL/Python function "exception_index_invalid_nested", line 1, in + rv = plpy.execute("SELECT test5('foo')") CONTEXT: PL/Python function "exception_index_invalid_nested" /* a typo */ *** SELECT invalid_type_uncaught('rick'); *** 50,55 --- 59,67 WARNING: plpy.SPIError: unrecognized error in PLy_spi_prepare CONTEXT: PL/Python function 
"invalid_type_uncaught" ERROR: plpy.SPIError: type "test" does not exist + DETAIL: Traceback (most recent call last): + PL/Python function "invalid_type_uncaught", line 3, in + SD["plan"] = plpy.prepare(q, [ "test" ]) CONTEXT: PL/Python function "invalid_type_uncaught" /* for what it's worth catch the exception generated by * the typo, and return None *** SELECT invalid_type_reraised('rick'); *** 100,105 --- 112,120 WARNING: plpy.SPIError: unrecognized error in PLy_spi_prepare CONTEXT: PL/Python function "invalid_type_reraised" ERROR: plpy.Error: type "test" does not exist + DETAIL: Traceback (most recent call last): + PL/Python function "invalid_type_reraised", line 6, in + plpy.error(str(ex)) CONTEXT: PL/Python function "invalid_type_reraised" /* no typo no messing about */ *** SELECT valid_type('rick'); *** 119,121 --- 134,231 (1 row) + /* error in nested functions to get a traceback + */ + CREATE FUNCTION nested_error() RETURNS text + AS + 'def fun1(): + plpy.error("boom") + + def fun2(): + fun1() + + def fun3(): + fun2() + + fun3() + return "not reached" + ' + LANGUAGE plpythonu; + SELECT nested_error(); + ERROR: plpy.Error: boom + DETAIL: Traceback (most recent call last): + PL/Python function "nested_error", line 10, in + fun3() + PL/Python function "nested_error", line 8, in fun3 + fun2() + PL/Python function "nested_error", line 5, in fun2 + fun1() + PL/Python function "nested_
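The DETAIL lines above can be approximated with Python's own traceback module; this sketch (a hypothetical helper, not the patch's C code) shows the frame-plus-source interleaving being mimicked:

```python
import sys
import traceback

def plpython_detail(procname):
    """Format the active exception's stack roughly like the DETAIL
    lines above: a frame list interleaved with source lines."""
    _, _, tb = sys.exc_info()
    lines = ["Traceback (most recent call last):"]
    for frame in traceback.extract_tb(tb):
        lines.append('  PL/Python function "%s", line %d, in %s'
                     % (procname, frame.lineno, frame.name))
        if frame.line:
            lines.append("    " + frame.line)
    return "\n".join(lines)
```

Call it from inside an `except` block, where `sys.exc_info()` still refers to the caught exception.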
[HACKERS] pl/python invalidate functions with composite arguments
Here's a patch implementing properly invalidating functions that have composite type arguments after the type changes, as mentioned in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's an incremental patch on top of the plpython-refactor patch sent earlier. Git branch for this patch: https://github.com/wulczer/postgres/tree/invalidate-composites. The idea in general is for this to work (taken straight from the unit tests, btw) CREATE TABLE employee ( name text, basesalary integer, bonus integer ); INSERT INTO employee VALUES ('John', 100, 10), ('Mary', 200, 10); CREATE OR REPLACE FUNCTION test_composite_table_input(e employee) RETURNS integer AS $$ return e['basesalary'] + e['bonus'] $$ LANGUAGE plpythonu; SELECT name, test_composite_table_input(employee.*) FROM employee; ALTER TABLE employee DROP bonus; -- this fails SELECT name, test_composite_table_input(employee.*) FROM employee; ALTER TABLE employee ADD bonus integer; UPDATE employee SET bonus = 10; -- this works again SELECT name, test_composite_table_input(employee.*) FROM employee; It's a long-standing TODO item, and a generally good thing to do. 
Cheers, Jan diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out index 982005b..cf7b7de 100644 *** a/src/pl/plpython/expected/plpython_types.out --- b/src/pl/plpython/expected/plpython_types.out *** SELECT * FROM test_type_conversion_array *** 599,604 --- 599,656 ERROR: return value of function with array return type is not a Python sequence CONTEXT: while creating return value PL/Python function "test_type_conversion_array_error" + --- + --- Composite types + --- + CREATE TABLE employee ( + name text, + basesalary integer, + bonus integer + ); + INSERT INTO employee VALUES ('John', 100, 10), ('Mary', 200, 10); + CREATE OR REPLACE FUNCTION test_composite_table_input(e employee) RETURNS integer AS $$ + return e['basesalary'] + e['bonus'] + $$ LANGUAGE plpythonu; + SELECT name, test_composite_table_input(employee.*) FROM employee; + name | test_composite_table_input + --+ + John |110 + Mary |210 + (2 rows) + + ALTER TABLE employee DROP bonus; + SELECT name, test_composite_table_input(employee.*) FROM employee; + ERROR: KeyError: 'bonus' + CONTEXT: PL/Python function "test_composite_table_input" + ALTER TABLE employee ADD bonus integer; + UPDATE employee SET bonus = 10; + SELECT name, test_composite_table_input(employee.*) FROM employee; + name | test_composite_table_input + --+ + John |110 + Mary |210 + (2 rows) + + CREATE TYPE named_pair AS ( + i integer, + j integer + ); + CREATE OR REPLACE FUNCTION test_composite_type_input(p named_pair) RETURNS integer AS $$ + return sum(p.values()) + $$ LANGUAGE plpythonu; + SELECT test_composite_type_input(row(1, 2)); + test_composite_type_input + --- + 3 + (1 row) + + ALTER TYPE named_pair RENAME TO named_pair_2; + SELECT test_composite_type_input(row(1, 2)); + test_composite_type_input + --- + 3 + (1 row) + -- -- Prepared statements -- diff --git a/src/pl/plpython/expected/plpython_types_3.out b/src/pl/plpython/expected/plpython_types_3.out index eeb23b7..e10b060 100644 *** 
a/src/pl/plpython/expected/plpython_types_3.out --- b/src/pl/plpython/expected/plpython_types_3.out *** SELECT * FROM test_type_conversion_array *** 599,604 --- 599,656 ERROR: return value of function with array return type is not a Python sequence CONTEXT: while creating return value PL/Python function "test_type_conversion_array_error" + --- + --- Composite types + --- + CREATE TABLE employee ( + name text, + basesalary integer, + bonus integer + ); + INSERT INTO employee VALUES ('John', 100, 10), ('Mary', 200, 10); + CREATE OR REPLACE FUNCTION test_composite_table_input(e employee) RETURNS integer AS $$ + return e['basesalary'] + e['bonus'] + $$ LANGUAGE plpython3u; + SELECT name, test_composite_table_input(employee.*) FROM employee; + name | test_composite_table_input + --+ + John |110 + Mary |210 + (2 rows) + + ALTER TABLE employee DROP bonus; + SELECT name, test_composite_table_input(employee.*) FROM employee; + ERROR: KeyError: 'bonus' + CONTEXT: PL/Python function "test_composite_table_input" + ALTER TABLE employee ADD bonus integer; + UPDATE employee SET bonus = 10; + SELECT name, test_composite_table_input(employee.*) FROM employee; + name | test_composite_table_input + --+ + John |110 + Mary |210 + (2 rows) + + CREATE TYPE named_pair AS ( + i integer, +
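The invalidation logic can be pictured as a cache keyed on the composite argument's column list; this toy Python model (illustrative only — the patch really compares tuple descriptors in C) shows why dropping and re-adding a column forces a recompile:

```python
class CompositeArgCache:
    """Toy model of the invalidation: a function's compiled form is
    reused only while its composite argument keeps the same column
    list; dropping or adding a column forces a fresh compile."""

    def __init__(self):
        self._procs = {}   # name -> (column snapshot, compiled code)
        self.compiles = 0

    def lookup(self, name, columns, source):
        entry = self._procs.get(name)
        if entry is None or entry[0] != tuple(columns):
            self.compiles += 1
            entry = (tuple(columns), compile(source, name, "exec"))
            self._procs[name] = entry
        return entry[1]
```

With the employee example above, two calls with the original columns hit the cache once; the ALTER TABLE ... DROP changes the snapshot and triggers a second compile.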
[HACKERS] Why is sorting on two columns so slower than sorting on one column?
Hi, Here is the test table, postgres=# \d big_wf Table "public.big_wf" Column | Type | Modifiers +-+--- age| integer | id | integer | postgres=# \dt+ big_wf List of relations Schema | Name | Type | Owner | Size | Description ++---+--++- public | big_wf | table | workshop | 142 MB | The first query sorting on one column: postgres=# explain analyze select * from big_wf order by age; QUERY PLAN - Sort (cost=565525.45..575775.45 rows=410 width=8) (actual time=11228.155..16427.149 rows=410 loops=1) Sort Key: age Sort Method: external sort Disk: 72112kB -> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) (actual time=6.196..4797.620 rows=410 loops=1) Total runtime: 19530.452 ms (5 rows) The second query sorting on two columns: postgres=# explain analyze select * from big_wf order by age,id; QUERY PLAN - Sort (cost=565525.45..575775.45 rows=410 width=8) (actual time=37544.779..48206.702 rows=410 loops=1) Sort Key: age, id Sort Method: external merge Disk: 72048kB -> Seq Scan on big_wf (cost=0.00..59142.00 rows=410 width=8) (actual time=6.796..5518.663 rows=410 loops=1) Total runtime: 51258.000 ms (5 rows) The version is 9.0.1 and the work_mem is 20MB. One special thing is, the first column (age) of all the tuples is of the same value, so the second column (id) is always needed for comparison. While the first sorting takes about only 6 seconds, the second one takes over 30 seconds. Is this much slower than expected? Is there any possible optimization? Thanks, Li Jie
[HACKERS] pl/python SPI in subtransactions
Here's a patch implementing executing SPI calls in a subtransaction, as mentioned in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's an incremental patch on top of the plpython-refactor patch sent earlier. Git branch for this patch: https://github.com/wulczer/postgres/tree/spi-in-subxacts. Without it the error handling in PL/Python is really broken, as we jump from a caught longjmp back into Python without any cleanup. As an additional bonus you no longer get all the ugly "unrecognized error in PLy_spi_execute_query" errors. Cheers, Jan diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out index 70890a8..7fc8337 100644 *** a/src/pl/plpython/expected/plpython_error.out --- b/src/pl/plpython/expected/plpython_error.out *** CREATE FUNCTION sql_syntax_error() RETUR *** 8,15 'plpy.execute("syntax error")' LANGUAGE plpythonu; SELECT sql_syntax_error(); - WARNING: plpy.SPIError: unrecognized error in PLy_spi_execute_query - CONTEXT: PL/Python function "sql_syntax_error" ERROR: plpy.SPIError: syntax error at or near "syntax" CONTEXT: PL/Python function "sql_syntax_error" /* check the handling of uncaught python exceptions --- 8,13 *** CREATE FUNCTION exception_index_invalid_ *** 29,36 return rv[0]' LANGUAGE plpythonu; SELECT exception_index_invalid_nested(); - WARNING: plpy.SPIError: unrecognized error in PLy_spi_execute_query - CONTEXT: PL/Python function "exception_index_invalid_nested" ERROR: plpy.SPIError: function test5(unknown) does not exist CONTEXT: PL/Python function "exception_index_invalid_nested" /* a typo --- 27,32 *** return None *** 47,54 ' LANGUAGE plpythonu; SELECT invalid_type_uncaught('rick'); - WARNING: plpy.SPIError: unrecognized error in PLy_spi_prepare - CONTEXT: PL/Python function "invalid_type_uncaught" ERROR: plpy.SPIError: type "test" does not exist CONTEXT: PL/Python function "invalid_type_uncaught" /* for what it's worth catch the exception generated by --- 43,48 *** 
return None *** 70,77 ' LANGUAGE plpythonu; SELECT invalid_type_caught('rick'); - WARNING: plpy.SPIError: unrecognized error in PLy_spi_prepare - CONTEXT: PL/Python function "invalid_type_caught" NOTICE: type "test" does not exist CONTEXT: PL/Python function "invalid_type_caught" invalid_type_caught --- 64,69 *** return None *** 97,104 ' LANGUAGE plpythonu; SELECT invalid_type_reraised('rick'); - WARNING: plpy.SPIError: unrecognized error in PLy_spi_prepare - CONTEXT: PL/Python function "invalid_type_reraised" ERROR: plpy.Error: type "test" does not exist CONTEXT: PL/Python function "invalid_type_reraised" /* no typo no messing about --- 89,94 diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c index 67eb0f3..5b216b2 100644 *** a/src/pl/plpython/plpython.c --- b/src/pl/plpython/plpython.c *** typedef int Py_ssize_t; *** 101,106 --- 101,107 #include "nodes/makefuncs.h" #include "parser/parse_type.h" #include "tcop/tcopprot.h" + #include "access/xact.h" #include "utils/builtins.h" #include "utils/lsyscache.h" #include "utils/memutils.h" *** PLy_spi_prepare(PyObject *self, PyObject *** 2829,2834 --- 2830,2836 int nargs; MemoryContext volatile oldcontext = CurrentMemoryContext; + ResourceOwner volatile oldowner = CurrentResourceOwner; if (!PyArg_ParseTuple(args, "s|O", &query, &list)) { *** PLy_spi_prepare(PyObject *self, PyObject *** 2852,2857 --- 2854,2862 plan->values = PLy_malloc(sizeof(Datum) * nargs); plan->args = PLy_malloc(sizeof(PLyTypeInfo) * nargs); + BeginInternalSubTransaction(NULL); + MemoryContextSwitchTo(oldcontext); + PG_TRY(); { int i; *** PLy_spi_prepare(PyObject *self, PyObject *** 2931,2949 if (plan->plan == NULL) elog(ERROR, "SPI_saveplan failed: %s", SPI_result_code_string(SPI_result)); } PG_CATCH(); { ErrorData *edata; - MemoryContextSwitchTo(oldcontext); - edata = CopyErrorData(); Py_DECREF(plan); Py_XDECREF(optr); ! if (!PyErr_Occurred()) ! PLy_exception_set(PLy_exc_spi_error, ! "unrecognized error in PLy_spi_prepare"); ! 
PLy_elog(WARNING, NULL); PLy_exception_set(PLy_exc_spi_error, edata->message); return NULL; } --- 2936,2978 if (plan->plan == NULL) elog(ERROR, "SPI_saveplan failed: %s", SPI_result_code_string(SPI_result)); + + /* Commit the inner transaction, return to outer xact context */ + ReleaseCurrentSubTransaction(); + MemoryContextSwitchTo(oldcontext); + CurrentResourceOwner = oldowner; + + /* + * AtEOSubXact_SPI() should not have popped any SPI context, but just + * in case it did, make sure we remain connected. + */ + SPI_restore_connection(); }
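The control flow the patch introduces around each SPI call resembles a context manager: begin a subtransaction, release it on success, roll it back on error so the longjmp never leaves dangling state before control returns to Python as a single SPIError. A Python sketch of that pattern (the backend calls named in comments are the ones visible in the diff; the rollback branch is inferred):

```python
class SPIError(Exception):
    pass

class Subtransaction:
    """Sketch of wrapping one SPI operation in a subtransaction."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        # BeginInternalSubTransaction(NULL) in the patch
        self.log.append("begin")
        return self

    def __exit__(self, etype, evalue, tb):
        if etype is None:
            # ReleaseCurrentSubTransaction(): commit the inner xact
            self.log.append("release")
            return False
        # On error: roll the subtransaction back, then surface a
        # single SPIError instead of the raw backend error state.
        self.log.append("rollback")
        raise SPIError(str(evalue))
```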
Re: [HACKERS] pl/python validator function
On 23/12/10 14:41, Jan Urbański wrote: > Here's a patch implementing a validator functoin mentioned in > http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's > an incremental patch on top of the plpython-refactor patch sent eariler. > > Git branch for this patch: > https://github.com/wulczer/postgres/tree/validator. > > Cheers, > Jan Yes, here's a patch. Jan diff --git a/src/include/catalog/pg_pltemplate.h b/src/include/catalog/pg_pltemplate.h index 6befcfa..2f1d8e1 100644 *** a/src/include/catalog/pg_pltemplate.h --- b/src/include/catalog/pg_pltemplate.h *** DATA(insert ( "pltcl" t t "pltcl_call_h *** 72,79 DATA(insert ( "pltclu" f f "pltclu_call_handler" _null_ _null_ "$libdir/pltcl" _null_ )); DATA(insert ( "plperl" t t "plperl_call_handler" "plperl_inline_handler" "plperl_validator" "$libdir/plperl" _null_ )); DATA(insert ( "plperlu" f f "plperl_call_handler" "plperl_inline_handler" "plperl_validator" "$libdir/plperl" _null_ )); ! DATA(insert ( "plpythonu" f f "plpython_call_handler" "plpython_inline_handler" _null_ "$libdir/plpython" _null_ )); ! DATA(insert ( "plpython2u" f f "plpython_call_handler" "plpython_inline_handler" _null_ "$libdir/plpython2" _null_ )); ! DATA(insert ( "plpython3u" f f "plpython3_call_handler" "plpython3_inline_handler" _null_ "$libdir/plpython3" _null_ )); #endif /* PG_PLTEMPLATE_H */ --- 72,79 DATA(insert ( "pltclu" f f "pltclu_call_handler" _null_ _null_ "$libdir/pltcl" _null_ )); DATA(insert ( "plperl" t t "plperl_call_handler" "plperl_inline_handler" "plperl_validator" "$libdir/plperl" _null_ )); DATA(insert ( "plperlu" f f "plperl_call_handler" "plperl_inline_handler" "plperl_validator" "$libdir/plperl" _null_ )); ! DATA(insert ( "plpythonu" f f "plpython_call_handler" "plpython_inline_handler" "plpython_validator" "$libdir/plpython" _null_ )); ! DATA(insert ( "plpython2u" f f "plpython_call_handler" "plpython_inline_handler" "plpython_validator" "$libdir/plpython2" _null_ )); ! 
DATA(insert ( "plpython3u" f f "plpython3_call_handler" "plpython3_inline_handler" "plpython3_validator" "$libdir/plpython3" _null_ )); #endif /* PG_PLTEMPLATE_H */ diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out index 70890a8..a9e1f15 100644 *** a/src/pl/plpython/expected/plpython_error.out --- b/src/pl/plpython/expected/plpython_error.out *** *** 1,6 --- 1,30 -- test error handling, i forgot to restore Warn_restart in -- the trigger handler once. the errors and subsequent core dump were -- interesting. + /* Flat out Python syntax error + */ + CREATE FUNCTION python_syntax_error() RETURNS text + AS + '.syntaxerror' + LANGUAGE plpythonu; + ERROR: could not compile PL/Python function "python_syntax_error" + DETAIL: SyntaxError: invalid syntax (, line 2) + /* With check_function_bodies = false the function should get defined + * and the error reported when called + */ + SET check_function_bodies = false; + CREATE FUNCTION python_syntax_error() RETURNS text + AS + '.syntaxerror' + LANGUAGE plpythonu; + SELECT python_syntax_error(); + ERROR: could not compile PL/Python function "python_syntax_error" + DETAIL: SyntaxError: invalid syntax (, line 2) + /* Run the function twice to check if the hashtable entry gets cleaned up */ + SELECT python_syntax_error(); + ERROR: could not compile PL/Python function "python_syntax_error" + DETAIL: SyntaxError: invalid syntax (, line 2) + RESET check_function_bodies; /* Flat out syntax error */ CREATE FUNCTION sql_syntax_error() RETURNS text diff --git a/src/pl/plpython/expected/plpython_record.out b/src/pl/plpython/expected/plpython_record.out index c8c4f9d..770f764 100644 *** a/src/pl/plpython/expected/plpython_record.out --- b/src/pl/plpython/expected/plpython_record.out *** CREATE FUNCTION test_in_out_params_multi *** 47,52 --- 47,53 second out text, third out text) AS $$ return first + '_record_in_to_out'; $$ LANGUAGE plpythonu; + ERROR: PL/Python functions cannot return type 
record CREATE FUNCTION test_inout_params(first inout text) AS $$ return first + '_inout'; $$ LANGUAGE plpythonu; *** SELECT * FROM test_in_out_params('test_i *** 299,305 -- this doesn't work yet :-( SELECT * FROM test_in_out_params_multi('test_in'); ! ERROR: PL/Python functions cannot return type record SELECT * FROM test_inout_params('test_in'); first --- --- 300,309 -- this doesn't work yet :-( SELECT * FROM test_in_out_params_multi('test_in'); ! ERROR: function test_in_out_params_multi(unknown) does not exist ! LINE 1: SELECT * FROM test_in_out_params_multi('test_in'); ! ^ ! HINT: No function matches the given name and argument types. You might need to add explicit type casts. SELECT * FROM test_inout_params('test_
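The expected output above implies the validator's contract: compile at CREATE time unless check_function_bodies is off, in which case definition succeeds and the same compile error appears on first call. A minimal Python sketch of that contract (hypothetical helper; the real validator wraps the body in a function definition first):

```python
def validate_body(source, check_function_bodies=True):
    """Compile the function body at CREATE time; with
    check_function_bodies off, definition succeeds and the error is
    deferred until the function is first called."""
    if not check_function_bodies:
        return None
    try:
        return compile(source, "<plpython function body>", "exec")
    except SyntaxError as exc:
        raise ValueError(
            "could not compile PL/Python function: SyntaxError: %s" % exc.msg)
```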
[HACKERS] pl/python validator function
Here's a patch implementing a validator function mentioned in http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's an incremental patch on top of the plpython-refactor patch sent earlier. Git branch for this patch: https://github.com/wulczer/postgres/tree/validator. Cheers, Jan
[HACKERS] Recovery conflict monitoring
This patch adds counters and views to monitor hot standby generated recovery conflicts. It extends the pg_stat_database view with one column with the total number of conflicts, and also creates a new view pg_stat_database_conflicts that contains a breakdown of exactly what caused the conflicts. Documentation still pending, but comments meanwhile are of course appreciated ;) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ *** a/src/backend/catalog/system_views.sql --- b/src/backend/catalog/system_views.sql *** *** 502,508 CREATE VIEW pg_stat_database AS pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(D.oid) AS tup_updated, ! pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted FROM pg_database D; CREATE VIEW pg_stat_user_functions AS --- 502,521 pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(D.oid) AS tup_updated, ! pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted, ! pg_stat_get_db_conflict_all(D.oid) AS conflicts ! FROM pg_database D; ! ! CREATE VIEW pg_stat_database_conflicts AS ! SELECT ! D.oid AS datid, ! D.datname AS datname, ! pg_stat_get_db_conflict_database(D.oid) AS confl_database, ! pg_stat_get_db_conflict_tablespace(D.oid) AS confl_tablespace, ! pg_stat_get_db_conflict_lock(D.oid) AS confl_lock, ! pg_stat_get_db_conflict_snapshot(D.oid) AS confl_snapshot, ! pg_stat_get_db_conflict_bufferpin(D.oid) AS confl_bufferpin, ! 
pg_stat_get_db_conflict_startup_deadlock(D.oid) AS confl_deadlock FROM pg_database D; CREATE VIEW pg_stat_user_functions AS *** a/src/backend/postmaster/pgstat.c --- b/src/backend/postmaster/pgstat.c *** *** 57,62 --- 57,63 #include "storage/ipc.h" #include "storage/pg_shmem.h" #include "storage/pmsignal.h" + #include "storage/procsignal.h" #include "utils/guc.h" #include "utils/memutils.h" #include "utils/ps_status.h" *** *** 278,283 static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len); --- 279,285 static void pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len); static void pgstat_recv_funcstat(PgStat_MsgFuncstat *msg, int len); static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len); + static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len); /* *** *** 1314,1319 pgstat_report_analyze(Relation rel, bool adopt_counts, --- 1316,1340 pgstat_send(&msg, sizeof(msg)); } + /* + * pgstat_report_recovery_conflict() - + * + * Tell the collector about a Hot Standby recovery conflict. 
+ * + */ + void + pgstat_report_recovery_conflict(int reason) + { + PgStat_MsgRecoveryConflict msg; + + if (pgStatSock == PGINVALID_SOCKET || !pgstat_track_counts) + return; + + pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RECOVERYCONFLICT); + msg.m_databaseid = MyDatabaseId; + msg.m_reason = reason; + pgstat_send(&msg, sizeof(msg)); + } /* -- * pgstat_ping() - *** *** 3053,3058 PgstatCollectorMain(int argc, char *argv[]) --- 3074,3083 pgstat_recv_funcpurge((PgStat_MsgFuncpurge *) &msg, len); break; + case PGSTAT_MTYPE_RECOVERYCONFLICT: + pgstat_recv_recoveryconflict((PgStat_MsgRecoveryConflict *) &msg, len); + break; + default: break; } *** *** 3129,3134 pgstat_get_db_entry(Oid databaseid, bool create) --- 3154,3165 result->n_tuples_updated = 0; result->n_tuples_deleted = 0; result->last_autovac_time = 0; + result->n_conflict_database = 0; + result->n_conflict_tablespace = 0; + result->n_conflict_lock = 0; + result->n_conflict_snapshot = 0; + result->n_conflict_bufferpin = 0; + result->n_conflict_startup_deadlock = 0; memset(&hash_ctl, 0, sizeof(hash_ctl)); hash_ctl.keysize = sizeof(Oid); *** *** 4204,4209 pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len) --- 4235,4275 } /* -- + * pgstat_recv_recoveryconflict() - + * + * Process as RECOVERYCONFLICT message. + * -- + */ + static void + pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len) + { + PgStat_StatDBEntry *dbentry; + dbentry = pgstat_get_db_entry(msg->m_databaseid, true); + + switch (msg->m_reason) + { + case PROCSIG_RECOVERY_CONFLICT_DATABASE: + dbentry->n_conflict_database++; + break; + case PROCSIG_RECOVERY_CONFLICT_TABLESPACE: + dbentry->n_conflict_tablespace++; + break; + case PROCSIG_RECOVERY
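The collector-side bookkeeping amounts to a per-database counter per conflict reason, plus a sum for the new pg_stat_database column. A toy Python model of that accounting (string reason tags stand in for the PROCSIG_RECOVERY_CONFLICT_* constants):

```python
from collections import Counter

# Reason tags; illustrative stand-ins for the backend's enum values.
CONFLICT_REASONS = (
    "database", "tablespace", "lock", "snapshot", "bufferpin", "deadlock",
)

class StatsCollector:
    """Toy model of the new counters: one Counter of conflict reasons
    per database OID, summed for pg_stat_database.conflicts."""

    def __init__(self):
        self._db = {}

    def recv_recoveryconflict(self, dboid, reason):
        assert reason in CONFLICT_REASONS
        self._db.setdefault(dboid, Counter())[reason] += 1

    def conflict(self, dboid, reason):
        # Corresponds to one pg_stat_get_db_conflict_* function.
        return self._db.get(dboid, Counter())[reason]

    def conflict_all(self, dboid):
        # Corresponds to pg_stat_get_db_conflict_all.
        return sum(self._db.get(dboid, Counter()).values())
```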
Re: [HACKERS] pl/python improvements
On 23/12/10 12:16, Marti Raudsepp wrote: > On Thu, Dec 23, 2010 at 04:08, Jan Urbański wrote: >> * providing custom exceptions for SPI errors, so you can catch only >> UniqueViolations and not have to muck around with SQLCODE > > py-postgresql already has a mapping from error codes to Python > exceptions. I think it makes sense to re-use that, instead of > inventing new names. > https://github.com/jwp/py-postgresql/blob/v1.1/postgresql/exceptions.py > > It also follows the Python convention of ending exception classes with > "Error", so instead of UniqueViolation they have UniqueError, instead > of InvalidTextRepresentation, they have TextRepresentationError Oh, didn't know that. I see that it does some more fancy things, like defining an inheritance hierarchy for these exceptions and adding some more into the mix. The names I used are not really invented, they're just plpgsql condition names from http://www.postgresql.org/docs/current/static/errcodes-appendix.html with underscores changed to camel case. Also, since they're autogenerated from utils/errcodes.h they don't have any hierarchy, they just all inherit from SPIError. Sticking "Error" to every one of them will result in things like SubstringErrorError, so I'm not really sold on that. Basically I think more PL/Python users will be familiar with condition names as you use them in pl/pgsql than with the names from py-postgresql.
-I../../../src/include -D_GNU_SOURCE -c -o > bootparse.o bootparse.c > bootparse.y: In function ‘boot_yyparse’: > bootparse.y:224:16: error: too few arguments to function ‘heap_create’ > ../../../src/include/catalog/heap.h:37:17: note: declared here > bootparse.y:249:16: error: too few arguments to function > ‘heap_create_with_catalog’ > ../../../src/include/catalog/heap.h:48:12: note: declared here > make[3]: *** [bootparse.o] Error 1 I'm pretty sure it'll go away if you do make maintainer-clean or git clean -dfx after switching to the plpython branch. Cheers, Jan
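The scheme Jan describes — condition names taken from the error codes list, camel-cased, all flat subclasses of SPIError — can be sketched in a few lines of Python (illustrative; the real classes would be generated from utils/errcodes.h at module initialization):

```python
class SPIError(Exception):
    pass

def camel_case(condition):
    """'unique_violation' -> 'UniqueViolation', as described above."""
    return "".join(word.capitalize() for word in condition.split("_"))

def build_spi_exceptions(conditions):
    """Generate one flat subclass of SPIError per condition name --
    no deeper hierarchy, just a common SPIError base."""
    return {name: type(camel_case(name), (SPIError,), {})
            for name in conditions}
```

With this, `except UniqueViolation:` catches only that condition while `except SPIError:` still catches any SPI failure.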
Re: [HACKERS] pl/python improvements
On Thu, Dec 23, 2010 at 04:08, Jan Urbański wrote:
> * providing custom exceptions for SPI errors, so you can catch only
> UniqueViolations and not have to muck around with SQLCODE

py-postgresql already has a mapping from error codes to Python
exceptions. I think it makes sense to re-use that, instead of
inventing new names.
https://github.com/jwp/py-postgresql/blob/v1.1/postgresql/exceptions.py

It also follows the Python convention of ending exception classes with
"Error", so instead of UniqueViolation they have UniqueError, and instead
of InvalidTextRepresentation, they have TextRepresentationError.

> Meanwhile the code
> is available at https://github.com/wulczer/postgres. You will find 10
> branches there, 9 correspond to these features, and the "plpython"
> branch is the sum of them all.

I tried building the plpython branch, but got an unrelated error. I
didn't investigate further for now...

make[3]: Entering directory
`/home/marti/src/postgresql-py/src/backend/bootstrap'
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -I. -I. -I../../../src/include -D_GNU_SOURCE -c -o
bootparse.o bootparse.c
bootparse.y: In function ‘boot_yyparse’:
bootparse.y:224:16: error: too few arguments to function ‘heap_create’
../../../src/include/catalog/heap.h:37:17: note: declared here
bootparse.y:249:16: error: too few arguments to function
‘heap_create_with_catalog’
../../../src/include/catalog/heap.h:48:12: note: declared here
make[3]: *** [bootparse.o] Error 1

Regards,
Marti
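[Editor's note] The py-postgresql approach Marti points at maps five-character SQLSTATE codes to exception classes that follow the "...Error" naming convention and form a hierarchy. A minimal sketch of that idea, with class names and the lookup helper invented for illustration (this is not py-postgresql's actual API):

```python
class PostgresError(Exception):
    """Hypothetical root of the exception hierarchy."""

class IntegrityConstraintViolationError(PostgresError):
    """Groups the class 23 (integrity constraint violation) errors."""

class UniqueError(IntegrityConstraintViolationError):
    # SQLSTATE 23505, which PL/pgSQL calls unique_violation
    code = "23505"

class TextRepresentationError(PostgresError):
    # SQLSTATE 22P02, invalid_text_representation in PL/pgSQL terms
    code = "22P02"

# Lookup table from SQLSTATE to exception class, as a driver might build it.
ERRORS_BY_SQLSTATE = {
    cls.code: cls for cls in (UniqueError, TextRepresentationError)
}

def exception_for(sqlstate):
    """Return the most specific class for a SQLSTATE, falling back to the root."""
    return ERRORS_BY_SQLSTATE.get(sqlstate, PostgresError)
```

The hierarchy is what lets callers catch a whole family at once (`except IntegrityConstraintViolationError:`) rather than enumerating individual codes - the main structural difference from the flat SPIError scheme.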
[HACKERS] Streaming replication as a separate permissions
Here's a patch that changes walsender to require a special privilege for
replication instead of relying on superuser permissions. We discussed
this back before 9.0 was finalized, but IIRC we ran out of time. The
motivation is that you really want to use superuser as little as
possible - and since being a replication slave is a read-only role, it
shouldn't require the maximum permission available in the system.

Obviously the patch needs docs and some system view updates, which I
will add later. But I wanted to post what I have so far for a quick
review to confirm whether I'm on the right track or not...

How it works should be rather obvious - it adds WITH
REPLICATION/NOREPLICATION to the CREATE ROLE and ALTER ROLE commands,
and then checks this when a connection attempts to start the walsender.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

*** a/src/backend/commands/user.c
--- b/src/backend/commands/user.c
***************
*** 94,99 **** CreateRole(CreateRoleStmt *stmt)
--- 94,100 ----
  	bool		createrole = false;		/* Can this user create roles? */
  	bool		createdb = false;		/* Can the user create databases? */
  	bool		canlogin = false;		/* Can this user login? */
+ 	bool		isreplication = false;	/* Is this a replication role? */
  	int			connlimit = -1;			/* maximum connections allowed */
  	List	   *addroleto = NIL;		/* roles to make this a member of */
  	List	   *rolemembers = NIL;		/* roles to be members of this role */
***************
*** 107,112 **** CreateRole(CreateRoleStmt *stmt)
--- 108,114 ----
  	DefElem    *dcreaterole = NULL;
  	DefElem    *dcreatedb = NULL;
  	DefElem    *dcanlogin = NULL;
+ 	DefElem    *disreplication = NULL;
  	DefElem    *dconnlimit = NULL;
  	DefElem    *daddroleto = NULL;
  	DefElem    *drolemembers = NULL;
***************
*** 190,195 **** CreateRole(CreateRoleStmt *stmt)
--- 192,205 ----
  						 errmsg("conflicting or redundant options")));
  			dcanlogin = defel;
  		}
+ 		else if (strcmp(defel->defname, "isreplication") == 0)
+ 		{
+ 			if (disreplication)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("conflicting or redundant options")));
+ 			disreplication = defel;
+ 		}
  		else if (strcmp(defel->defname, "connectionlimit") == 0)
  		{
  			if (dconnlimit)
***************
*** 247,252 **** CreateRole(CreateRoleStmt *stmt)
--- 257,264 ----
  		createdb = intVal(dcreatedb->arg) != 0;
  	if (dcanlogin)
  		canlogin = intVal(dcanlogin->arg) != 0;
+ 	if (disreplication)
+ 		isreplication = intVal(disreplication->arg) != 0;
  	if (dconnlimit)
  	{
  		connlimit = intVal(dconnlimit->arg);
***************
*** 341,346 **** CreateRole(CreateRoleStmt *stmt)
--- 353,359 ----
  	/* superuser gets catupdate right by default */
  	new_record[Anum_pg_authid_rolcatupdate - 1] = BoolGetDatum(issuper);
  	new_record[Anum_pg_authid_rolcanlogin - 1] = BoolGetDatum(canlogin);
+ 	new_record[Anum_pg_authid_rolreplication - 1] = BoolGetDatum(isreplication);
  	new_record[Anum_pg_authid_rolconnlimit - 1] = Int32GetDatum(connlimit);

  	if (password)
***************
*** 439,444 **** AlterRole(AlterRoleStmt *stmt)
--- 452,458 ----
  	int			createrole = -1;	/* Can this user create roles? */
  	int			createdb = -1;		/* Can the user create databases? */
  	int			canlogin = -1;		/* Can this user login? */
+ 	int			isreplication = -1;	/* Is this a replication role? */
  	int			connlimit = -1;		/* maximum connections allowed */
  	List	   *rolemembers = NIL;	/* roles to be added/removed */
  	char	   *validUntil = NULL;	/* time the login is valid until */
***************
*** 450,455 **** AlterRole(AlterRoleStmt *stmt)
--- 464,470 ----
  	DefElem    *dcreaterole = NULL;
  	DefElem    *dcreatedb = NULL;
  	DefElem    *dcanlogin = NULL;
+ 	DefElem    *disreplication = NULL;
  	DefElem    *dconnlimit = NULL;
  	DefElem    *drolemembers = NULL;
  	DefElem    *dvalidUntil = NULL;
***************
*** 514,519 **** AlterRole(AlterRoleStmt *stmt)
--- 529,542 ----
  						 errmsg("conflicting or redundant options")));
  			dcanlogin = defel;
  		}
+ 		else if (strcmp(defel->defname, "isreplication") == 0)
+ 		{
+ 			if (disreplication)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("conflicting or redundant options")));
+ 			disreplication = defel;
+ 		}
  		else if (strcmp(defel->defname, "connectionlimit") == 0)
  		{
  			if (dconnlimit)
***************
*** 556,561 **** AlterRole(AlterRoleStmt *stmt)
--- 579,586 ----
  		createdb = intVal(dcreatedb->arg);
  	if (dcanlogin)
  		canlogin = intVal(dcanlogin->arg);
+ 	if (disreplication)
+ 		isreplication = intVal(disreplication->arg);
  	if (dconnlimit)
  	{
  		connlimit = intVal(dconnlimit->arg);
***************
*** 600,605 **** AlterRole(AlterRoleStmt *stmt)
--- 625,631 ----
  		createrole < 0 &&
  		createdb < 0 &&
  		canlogin < 0 &&
+ 		isreplication < 0 &&
  		!dconnlimit &&
  		!rolemembers &&
  		!validUntil &&
***
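[Editor's note] The patch repeats one pattern in both CreateRole and AlterRole: walk the DefElem option list, reject an option that appears twice with "conflicting or redundant options", and default the new replication flag to off when the option is absent. A small Python model of that pattern (names and the helper are illustrative, not PostgreSQL code):

```python
class RedundantOptionError(Exception):
    """Models the ereport(ERRCODE_SYNTAX_ERROR, ...) path in the patch."""

def parse_role_options(defelems):
    """defelems: list of (defname, value) pairs, like the DefElem list.

    Each option may appear at most once; duplicates are an error, and a
    missing boolean option defaults to false - mirroring how the patch
    initializes 'bool isreplication = false;' before scanning the list.
    """
    seen = {}
    for name, value in defelems:
        if name in seen:
            raise RedundantOptionError("conflicting or redundant options")
        seen[name] = value
    return {
        "isreplication": bool(seen.get("isreplication", 0)),
        "canlogin": bool(seen.get("canlogin", 0)),
    }

opts = parse_role_options([("isreplication", 1)])
# opts["isreplication"] is True; opts["canlogin"] defaults to False
```

The duplicate check is why the patch keeps a separate `disreplication` pointer alongside the parsed value: seeing the DefElem twice is distinguishable from seeing it once with a false value.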
[HACKERS] recapitulation: FOREACH-IN-ARRAY
Hello

I reread the discussion on this topic. It looks like we can find
agreement on the following syntax:

FOREACH var [, var [..]] [SLICE number] IN expr LOOP
END LOOP;

In the default mode, without the SLICE keyword, it iterates over every
element, like "unnest" does. With the SLICE keyword, it iterates over
nested arrays, and the depth can be chosen. I don't see a reason why
the depth should be dynamic; it can be a constant. The variable list is
used only when the item is a record or row type.

I am not sure which keyword is better - SLICE or SLICING?

Regards
Pavel Stehule
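[Editor's note] The proposed semantics can be sketched in Python (the helper and its name are hypothetical, invented for illustration - this is not PL/pgSQL): without SLICE, yield every scalar element as unnest does; with SLICE n, yield whole sub-arrays of dimension n.

```python
def foreach(arr, slice_depth=0):
    """Yield iteration items from a nested list as the proposed FOREACH would."""
    def dims(a):
        # Number of dimensions, assuming a rectangular nested list.
        d = 0
        while isinstance(a, list):
            d += 1
            a = a[0]
        return d

    def walk(a, depth):
        # Once we are down to the requested slice depth, yield the whole
        # sub-array; otherwise keep descending.
        if depth == slice_depth:
            yield a
        else:
            for item in a:
                yield from walk(item, depth - 1)

    yield from walk(arr, dims(arr))

matrix = [[1, 2], [3, 4]]
list(foreach(matrix))                  # default: every element, like unnest
list(foreach(matrix, slice_depth=1))   # SLICE 1: iterate over the rows
```

Note that `slice_depth` is a plain parameter here only for demonstration; the proposal deliberately restricts SLICE to a constant, so a PL/pgSQL implementation could resolve it at compile time.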