Re: [HACKERS] Patch for fail-back without fresh backup
On Wednesday, June 19, 2013 10:45 PM Sawada Masahiko wrote:
On Tuesday, June 18, 2013, Amit Kapila wrote:
On Tuesday, June 18, 2013 12:18 AM Sawada Masahiko wrote:
On Sun, Jun 16, 2013 at 2:00 PM, Amit kapila amit.kap...@huawei.com wrote:
On Saturday, June 15, 2013 8:29 PM Sawada Masahiko wrote:
On Sat, Jun 15, 2013 at 10:34 PM, Amit kapila amit.kap...@huawei.com wrote:
On Saturday, June 15, 2013 1:19 PM Sawada Masahiko wrote:
On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:

Hello,

I think we can dump the data before all the WAL files are deleted. Deleting all WAL files is done when the old master starts as the new standby.

Can we dump data without starting the server?

Sorry, I made a mistake: we can't do that. The proposed patch needs to be able to handle such a scenario in the future as well.

I am not sure the proposed patch can handle it so easily, but if others also feel it is important, then a method should be provided to the user for extracting his last committed data.

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add visibility map information to pg_freespace.
On 20 June 2013 04:26, Satoshi Nagayasu sn...@uptime.jp wrote:

I'm looking into this patch as a reviewer.

(2013/06/19 18:03), Simon Riggs wrote:
On 19 June 2013 09:19, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:

It should be useful in other aspects, but it seems a bit complicated just to know about visibility bits for certain blocks.

With your current patch you can only see the visibility info for blocks in cache, not for all blocks. So while you may think it is useful, it is also unnecessarily limited in scope. Let's just have something that is easy to use that lets us see the visibility state for a block, not just blocks in freespace.

I think we can have these visibility map statistics in the pgstattuple function as well (as a new column) for this purpose. IMHO, we have several modules for different purposes:

- pageinspect provides several functions for debugging purposes.
- pg_freespace provides a view for monitoring purposes.
- pgstattuple provides several functions for collecting specific table/index statistics.

So we can have similar features in different modules. Any comments?

+1

--
Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
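For readers following the thread, the shape of the proposal can be sketched against what pg_freespacemap already provides; the all_visible column below does not exist in any shipped module and is only an illustration of the idea being discussed:

    -- what pg_freespace() already returns today
    SELECT * FROM pg_freespace('pgbench_accounts');   -- blkno, avail

    -- the proposal, sketched: expose the visibility map bit alongside it
    -- (all_visible is a hypothetical column, table name is just an example)
    SELECT blkno, avail, all_visible FROM pg_freespace('pgbench_accounts');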
Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote:

I think a good starting point would be to use the Intel and IBM libraries to implement basic DECIMAL32/64/128 to see if they perform better than the gcc builtins tested by Pavel by adapting his extension. If the performance isn't interesting it may still be worth adding for compliance reasons, but if we can only add IEEE-compliant decimal FP by using non-SQL-standard type names I don't think that's super useful.

I think we should be adding a datatype that is IEEE compliant, even if that doesn't have space and/or performance advantages. We might hope it does, but if not, then it may do in the future. It seems almost certain that the SQL standard will adopt the IEEE datatypes in the future.

If there are significant performance/space gains to be had, we could consider introducing DECIMAL32/64/128 types with the same names used by DB2, so people could explicitly choose to use them where appropriate.

Type names are easily set up if compatibility is required, so that's not a problem. We'd want to use the name the SQL standard people assign.

--
Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change authentication error message (patch)
On 06/20/2013 12:51 AM, Jeff Janes wrote: I think we need to keep the first password. Password authentication is a single thing, it is the authentication method attempted. It is the password method (which includes MD5) which failed, as opposed to the LDAP method or the Peer method or one of the other methods. That's against the rule of not revealing any more knowledge than a potential attacker already has, no? For that reason, I'd rather go with just authentication failed. Without this level of explicitness, it might be hard to figure out which row in pg_hba.conf was the one that PostgreSQL glommed onto to use for authentication. As argued before, that should go into the logs for diagnosis by the sysadmin, but should not be revealed to an attacker. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote: I think a good starting point would be to use the Intel and IBM libraries to implement basic DECIMAL32/64/128 to see if they perform better than the gcc builtins tested by Pavel by adapting his extension. Just a few notes: Not sure if this has already been mentioned, but GCC is using the IBM decNumber library to implement those built-ins so the performance should be nearly identical. Unfortunately, many GCC builds shipped by Linux distributions don't actually seem to have those built-ins configured anyway! Also, the IBM 'xlc' compiler supports those built-ins (IBM being behind all of this stuff...), and generates code using hardware instructions for POWER6/POWER7, or software otherwise (quite possibly the same code again). One further (undeveloped) thought: the IBM decNumber library doesn't just support the 754-2008 types, it also supports a more general decNumber type with arbitrary precision (well, up to 999,999,999 significant figures), so if it were to finish up being used by core PG then it could also have other uses. I have no idea how decNumber (which encodes significant figures in an integer coefficient, so one decimal digit per 3.2(?) bits) compares to PG's DECIMAL (which encodes each digit in 4 bits, BCD style), in terms of arithmetic performance and other trade-offs. If the performance isn't interesting it may still be worth adding for compliance reasons, but if we can only add IEEE-compliant decimal FP by using non-SQL-standard type names I don't think that's super useful. If there are significant performance/space gains to be had, we could consider introducing DECIMAL32/64/128 types with the same names used by DB2, so people could explicitly choose to use them where appropriate. +1 for using the DB2 names. I am interested in this topic as a user of both Postgres and DB2, and an early adopter of 754-2008 in various software. Actually I had started working on my own DECFLOAT types for Postgres using decNumber in 2010 as I mentioned on one of the lists, but life got in the way. I had a very basic extension sort of working though, and core support didn't seem necessary, although I hadn't started on what I considered to be the difficult bit, interactions with the other numerical types (ie deciding which conversions and promotions would make sense and be safe). Finally, I recently ran into a 3rd software implementation of 754-2008: libmpdec (the other two being IBM decNumber and Intel's library), but I haven't looked into it yet. Thomas Munro
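For anyone wanting to reproduce the kind of comparison discussed above, a minimal sketch using the GCC built-in decimal types; it only compiles where GCC was configured with decimal-float support (which, as noted, many distribution builds lack), and the cast to double is only for printing, since printf has no standard conversion for _Decimal64:

    /* decimal64_demo.c -- requires a GCC build with decimal float enabled */
    #include <stdio.h>

    int main(void)
    {
        _Decimal64 a = 0.1DD;      /* exactly representable in decimal FP */
        _Decimal64 b = 0.2DD;
        _Decimal64 sum = a + b;    /* via libgcc soft-DFP, or hardware on POWER6/7 */

        printf("sum ~= %.20f\n", (double) sum);   /* cast for display only */
        return 0;
    }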
Re: [HACKERS] Patch for removng unused targets
From: Hitoshi Harada [mailto:umi.tan...@gmail.com]

I guess the patch works fine, but what I'm saying is it might be limited to small use cases. Another instance of this that I can think of is the ORDER BY clause of window specifications, which you may want to remove from the target list as well, in addition to the ORDER BY of the query. It will just not be removed by this approach, simply because it looks only at parse->sortClause. Certainly you can add more rules to the new function to look at the window specification, but then I'm not sure what we are missing.

Yeah, I thought about the extension to the window ORDER BY case, too. But I'm not sure it's worth complicating the code, considering that the objective of this optimization is to improve full-text-search-related things, if I understand correctly, though general solutions would be desirable as you mentioned.

So, as it stands it doesn't have a critical issue, but a more generalized approach would be desirable. That said, I don't have a strong objection to the current patch, and am just posting one thought to see if others may have the same opinion.

OK. I'll also wait for others' comments. For review, an updated version of the patch is attached, which fixes the bug using the approach that directly uses the clause information in the parse tree.

Thanks,

Best regards,
Etsuro Fujit

unused-targets-20130620.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
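To illustrate the general shape of query this optimization targets (table and column names here are hypothetical): a ranking expression that appears only in ORDER BY and never in the output list, so computing it in the target list just to throw it away is wasted work whenever the ordering can be produced some other way:

    -- ts_rank() is needed only for ordering, not for output
    SELECT id, title
    FROM docs
    WHERE tsv @@ to_tsquery('english', 'postgres & hackers')
    ORDER BY ts_rank(tsv, to_tsquery('english', 'postgres & hackers')) DESC
    LIMIT 10;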
Re: [HACKERS] LEFT JOIN LATERAL can remove rows from LHS
On 06/18/2013 01:52 AM, Jeremy Evans wrote: Maybe I am misunderstanding how LATERAL is supposed to work, but my expectation is that doing a LEFT JOIN should not remove rows from the LHS. I have added this to the list of 9.3 blockers. https://wiki.postgresql.org/wiki/PostgreSQL_9.3_Open_Items -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
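A minimal example of the expectation being reported (hypothetical tables): every row of a should appear in the output, with NULL for s.val wherever the lateral subquery produces no rows:

    SELECT a.id, s.val
    FROM a
    LEFT JOIN LATERAL (
        SELECT b.val FROM b WHERE b.a_id = a.id
    ) s ON true;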
Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
On 20 June 2013 08:05, Thomas Munro mu...@ip9.org wrote: On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote: If the performance isn't interesting it may still be worth adding for compliance reasons, but if we can only add IEEE-compliant decimal FP by using non-SQL-standard type names I don't think that's super useful. If there are significant performance/space gains to be had, we could consider introducing DECIMAL32/64/128 types with the same names used by DB2, so people could explicitly choose to use them where appropriate. +1 for using the DB2 names. On reflection, I should offer more than +1. I think that the IBM name DECFLOAT(16) is better than DECIMAL64 because: 1) The number of significant decimal digits is probably of greater importance to a typical end user than the number of binary digits used to store it. 2) Other SQL types are parameterised with this notation, such as VARCHAR(6) and DECIMAL(6, 2). 3) IEEE 754 has rather different semantics to SQL DECIMAL, I'm thinking mainly of the behaviour of special values, so using a name like DECFLOAT(n) instead of DECIMAL64 would draw greater attention to that fact (ie it's not just a fixed sized DECIMAL). Also, IBM was here first, and I *guess* they will propose DECFLOAT for standardisation (they are behind proposals to add support to many other languages), though I have no information on that.
Re: [HACKERS] Implementing incremental backup
On Thu, Jun 20, 2013 at 12:18 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
Claudio Freire wrote:
On Wed, Jun 19, 2013 at 6:20 PM, Stephen Frost sfr...@snowman.net wrote:
* Claudio Freire (klaussfre...@gmail.com) wrote:

I don't see how this is better than snapshotting at the filesystem level. I have no experience with TB-scale databases (I've been limited to only hundreds of GB), but from my limited mid-size db experience, filesystem snapshotting is pretty much the same thing you propose there (xfs_freeze), and it works pretty well. There are even automated tools to do that, like bacula, and they can handle incremental snapshots.

Large databases tend to have multiple filesystems, and getting a single, consistent snapshot across all of them while under load is.. 'challenging'. It's fine if you use pg_start/stop_backup() and you're saving the XLOGs off, but if you can't do that..

Good point there. I still don't like the idea of having to mark each modified page. The WAL compressor idea sounds a lot more workable. As in scalable.

There was a project that removed useless WAL records from the stream, to make it smaller and useful for long-term archiving. It only removed FPIs as far as I recall. It's dead now, and didn't compile on recent (9.1?) Postgres because of changes in the WAL structs, IIRC.

This doesn't help if you have a large lot of UPDATEs that touch the same set of rows over and over, though. Tatsuo-san's proposal would allow this use case to work nicely because you only keep one copy of such data, not one for each modification.

If you have the two technologies, you could teach them to work in conjunction: you set up WAL replication, and tell the WAL compressor to prune updates for high-update tables (avoiding useless traffic), then use incremental backup to back these up. This seems like it would have a lot of moving parts and be rather bug-prone, though.

Just as a datapoint, I think this is basically what at least some other database engine (SQL Server) calls incremental vs. differential backup. Differential backup keeps track of which blocks have changed (by one way or another - maybe as simple as the LSN, but it doesn't matter how, really) and backs up just those blocks (diffed back to the base backup). Incremental does the transaction log, which is basically what we do with log archiving, except it's not done in realtime - it's all saved on the master until the backup command runs. Of course, it's been quite a few years since I set one of those up in anger, so disclaimer for that info being out of date :)

Didn't pg_rman try to do something based on the page LSN to achieve something similar to this?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
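Purely as an illustration of the page-LSN idea behind pg_rman-style differential backups (a real tool would scan the data files directly rather than go through SQL; this assumes a pageinspect version that returns the page LSN as pg_lsn, the default 8 kB block size, and an example table and base-backup LSN):

    CREATE EXTENSION IF NOT EXISTS pageinspect;

    -- blocks whose page LSN is newer than the LSN noted at base-backup time
    SELECT blkno
    FROM generate_series(0, pg_relation_size('pgbench_accounts') / 8192 - 1) AS blkno
    WHERE (page_header(get_raw_page('pgbench_accounts', blkno::int))).lsn
          > '0/3000000'::pg_lsn;   -- hypothetical base-backup LSN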
Re: [HACKERS] event trigger API documentation?
Peter Eisentraut pete...@gmx.net writes:

Random observation in this general area: Regular triggers provide the field Trigger *tg_trigger in the trigger data, which allows you to get the trigger name, OID, and such. Event triggers don't expose anything comparable. That should perhaps be added.

Agreed. Basically we ran out of time to add in any sort of useful information, so that's all 9.4 material.

Also, as I'm maybe about the fourth person ever to write an actual event trigger, I have a usability report of sorts. I found it confusing that the trigger timing is encoded into the event name. So instead of ON ddl_command_start, I was expecting something more like BEFORE ddl_command. There might be a reason for this design choice, but I found it a confusing departure from known trigger concepts.

Your proposal here matches what I proposed in the 9.2 cycle. I even wanted to go as far as having BEFORE, AFTER and INSTEAD OF command triggers. The problem was to find the right place in the code for each different command, and while I did work on that and proposed command-specific integration points, Robert had the idea of having something more flexible and not tied too closely to commands, hence events. The idea is to be able to provide events such as table_rewrite or insert_into_unknown_table, etc.

How we got from that decision to preferring ddl_command_start over BEFORE ddl_command is still unclear to me. We could have kept the grammar and turned it internally into the before_ddl_command event. But then certainly you want to be able to say BEFORE CREATE TABLE, ALTER TABLE, or BEFORE ANY EVENT, things like that. In the patches I sent containing that kind of implementation, it was said to be too much grammar to maintain, as the patch I had needed to list all supported commands in the grammar, and each time support for a new one was added you needed to edit that grammar list…

Regards,

--
Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
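For reference, the syntax that shipped in 9.3 folds the timing into the event name, as Peter describes:

    CREATE FUNCTION log_ddl() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        RAISE NOTICE 'ddl command start: %', tg_tag;
    END;
    $$;

    CREATE EVENT TRIGGER log_ddl_start
        ON ddl_command_start
        EXECUTE PROCEDURE log_ddl();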
Re: [HACKERS] Change authentication error message (patch)
On 20/06/2013 08:47, Markus Wanner wrote: On 06/20/2013 12:51 AM, Jeff Janes wrote: I think we need to keep the first password. Password authentication is a single thing, it is the authentication method attempted. It is the password method (which includes MD5) which failed, as opposed to the LDAP method or the Peer method or one of the other methods. That's against the rule of not revealing any more knowledge than a potential attacker already has, no? For that reason, I'd rather go with just authentication failed. My understanding is that the attacker would already have that information since the server would have sent an AuthenticationMD5Password message to get to the error in the first place. And we still reveal the authentication method to the frontend in all other cases (peer authentication failed, for example). Without this level of explicitness, it might be hard to figure out which row in pg_hba.conf was the one that PostgreSQL glommed onto to use for authentication. As argued before, that should go into the logs for diagnosis by the sysadmin, but should not be revealed to an attacker. Isn't the point of this patch exactly that we didn't want to go down that road? I.e. password authentication failed didn't say that the password might've expired, but some people thought just logging a WARNING/LOG wasn't enough. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Config reload/restart preview
Hi, I've noticed that there's no easy way of checking which settings will change if the config is reloaded, and I think not being able to do this can cause some unfortunate problems. For example, a customer went to change their configuration, just setting log_autovacuum_min_duration to about 20 seconds, and reloaded the server. However, the log file swelled to over 5GB in size before they realised something was wrong, and then reverted the change. It transpired that the reload also pulled in a log_statements change from 'ddl' to 'all' that someone must have changed at some point without applying it. Should we have a way of previewing changes that would be applied if we reloaded/restarted the server? For example: pg_ctl previewconfig SIGHUP: log_statements will change from 'ddl' to 'all' SIGHUP: log_vacuum_min_duration will change from -1 to 2 POSTMASTER: fsync will change from 'on' to 'off' I'm not proposing this specifically, but something that would provide such information. -- Thom
Re: [HACKERS] Bugfix and new feature for PGXS
On Thursday, June 20, 2013 05:26:21, Peter Eisentraut wrote:
On Wed, 2013-06-19 at 20:58 +0200, Cédric Villemain wrote:

I believe he answered the proposal to put all headers in the same flat directory, instead of a tree.

The headers are used as #include "hstore.h", #include "ltree.h", etc. in the existing source code. If you want to install them for use by others, you can do one of three things:

1) Install them into $(pg_config --includedir-server), so other users will just include them in the same way as shown above.

I don't like this one because an extension can overwrite server headers.

2) Install them in a different directory, but keep the same #include lines. That would require PGXS changes, perhaps a new pg_config option, or something that produces the right -I option to find them.

A PGXS patch is not a problem. A pg_config patch is a requirement only if users set their own $(includedir_contrib) variable. I hadn't thought about it, but it is probably better to add that. This looks trivial too.

To include the hstore header we write «#include "hstore.h"», but we add -I$(includedir_contrib)/hstore to the FLAGS in the Makefile of the extension which requires hstore. It changes nothing in the hstore Makefile itself. The main difference is that before we had to use -I$(top_srcdir)/../contrib/hstore *iff* we are not building with PGXS (else we need to have the hstore source available somewhere); now we can use -I$(includedir_contrib)/hstore with PGXS (hstore needs to be installed first, as we USE_PGXS).

Then PGXS offers to handle both cases transparently, so we can write -I$(include_contrib_header) in the Makefile and pgxs.mk takes care of adjusting include_contrib_header (or whatever its name) according to the USE_PGXS value.

3) Install them in a different directory and require a different #include line. But then you have to change the existing uses as well, which would probably require moving files around.

Having to change existing source code to keep the same behavior is not attractive at all.

Both 2 and 3 require a lot of work, possibly compatibility breaks, for no obvious reason.

2 is a good solution and not a lot of work, IMO. Removing the need to set -I$(include...) in the contrib Makefile can be done later by adding a PGXS variable to define the contrib requirements; this is something different from the current feature (building a contrib module that depends on another(s) without the full source tree).

--
Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

signature.asc
Description: This is a digitally signed message part.
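A sketch of how an out-of-tree extension's Makefile might look under option 2; note that includedir_contrib is the variable proposed in this thread, not something pg_config provides today, and the module and file names are made up:

    # Makefile for a hypothetical extension that needs hstore's header
    MODULES = my_hstore_thing
    EXTENSION = my_hstore_thing
    DATA = my_hstore_thing--1.0.sql

    # proposed: point the compiler at the installed contrib headers
    PG_CPPFLAGS = -I$(includedir_contrib)/hstore

    PG_CONFIG = pg_config
    PGXS := $(shell $(PG_CONFIG) --pgxs)
    include $(PGXS)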
Re: [HACKERS] Config reload/restart preview
On Thu, Jun 20, 2013 at 1:04 PM, Thom Brown t...@linux.com wrote: Hi, I've noticed that there's no easy way of checking which settings will change if the config is reloaded, and I think not being able to do this can cause some unfortunate problems. For example, a customer went to change their configuration, just setting log_autovacuum_min_duration to about 20 seconds, and reloaded the server. However, the log file swelled to over 5GB in size before they realised something was wrong, and then reverted the change. It transpired that the reload also pulled in a log_statements change from 'ddl' to 'all' that someone must have changed at some point without applying it. Should we have a way of previewing changes that would be applied if we reloaded/restarted the server? For example: pg_ctl previewconfig SIGHUP: log_statements will change from 'ddl' to 'all' SIGHUP: log_vacuum_min_duration will change from -1 to 2 POSTMASTER: fsync will change from 'on' to 'off' I'm not proposing this specifically, but something that would provide such information. Yes, we should. This would go well with something I started working on some time ago (but haven't actually gotten far on at all), which is the ability for pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload should also be able to tell you which parameters were changed, without having to go to the log. Obviously that's almost exactly the same feature. The problem today is that pg_ctl just sends off a SIGHUP when it does a reload. We'd have to give it an actual interface that could return data back as well, such as a socket of some kind. So it does take some work to come up with. But I definitely think we should have something like this. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change authentication error message (patch)
On 06/20/2013 12:27 PM, Marko Tiikkaja wrote: My understanding is that the attacker would already have that information since the server would have sent an AuthenticationMD5Password message to get to the error in the first place. And we still reveal the authentication method to the frontend in all other cases (peer authentication failed, for example). Oh, right, I wasn't aware of that. Never mind, then. +1 for keeping it mention password authentication explicitly. However, thinking about this a bit more: Other authentication methods may also provide password (or even account) expiration times. And may fail to authenticate a user for entirely different reasons. Given that, I wonder if password expired is such a special case worth mentioning in case of the password auth method. If we go down that path, don't we also have to include auth server unreachable as a possible cause for authentication failure for methods that use an external server? Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Config reload/restart preview
On Thu, Jun 20, 2013 at 4:34 PM, Thom Brown t...@linux.com wrote: Hi, I've noticed that there's no easy way of checking which settings will change if the config is reloaded, and I think not being able to do this can cause some unfortunate problems. For example, a customer went to change their configuration, just setting log_autovacuum_min_duration to about 20 seconds, and reloaded the server. However, the log file swelled to over 5GB in size before they realised something was wrong, and then reverted the change. It transpired that the reload also pulled in a log_statements change from 'ddl' to 'all' that someone must have changed at some point without applying it. Should we have a way of previewing changes that would be applied if we reloaded/restarted the server? For example: pg_ctl previewconfig SIGHUP: log_statements will change from 'ddl' to 'all' SIGHUP: log_vacuum_min_duration will change from -1 to 2 POSTMASTER: fsync will change from 'on' to 'off' I'm not proposing this specifically, but something that would provide such information. May be we can have a nice little utility which can show configuration diff between two running servers, or a running server and its modified conf file ? Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee
Re: [HACKERS] Bugfix and new feature for PGXS
Opinions all?

* Give up on being able to use one extension's header from another entirely, except in the special case of in-tree builds.

There are already a good number of use cases with only hstore and intarray; I'm in favor of this feature.

* Install install-enabled contrib headers into an include/contrib/ that's always on the pgxs include path, so you can still just #include "hstore.h". For in-tree builds, explicitly add other modules' contrib dirs as Peter has done in the proposed plperl_hstore and plpython_hstore. (Use unqualified names.)

I don't like the idea of sharing a flat directory between different header files; filenames can overlap.

* Install install-enabled contrib headers to include/contrib/[modulename]/ . Within the module, use the unqualified header name. When referring to it from outside the module, use #include "contrib/modulename/header.h". Add $(top_srcdir) to the include path when building extensions in-tree.

Not this one either; see my other mail. We still #include "hstore.h" when we want, we just add the header so it is available for PGXS builds. The contrib Makefile still needs -I$(includedir_contrib)/hstore. What's new is that the header is installed, not that we don't have to set FLAGS.

Personally, I'm all for just using the local path when building the module, and the qualified path elsewhere. It requires only a relatively simple change, and I don't think that using "hstore.h" within hstore, and "contrib/hstore/hstore.h" elsewhere, is of great concern.

--
Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

signature.asc
Description: This is a digitally signed message part.
Re: [HACKERS] Config reload/restart preview
On 20 June 2013 13:16, Pavan Deolasee pavan.deola...@gmail.com wrote: On Thu, Jun 20, 2013 at 4:34 PM, Thom Brown t...@linux.com wrote: Hi, I've noticed that there's no easy way of checking which settings will change if the config is reloaded, and I think not being able to do this can cause some unfortunate problems. For example, a customer went to change their configuration, just setting log_autovacuum_min_duration to about 20 seconds, and reloaded the server. However, the log file swelled to over 5GB in size before they realised something was wrong, and then reverted the change. It transpired that the reload also pulled in a log_statements change from 'ddl' to 'all' that someone must have changed at some point without applying it. Should we have a way of previewing changes that would be applied if we reloaded/restarted the server? For example: pg_ctl previewconfig SIGHUP: log_statements will change from 'ddl' to 'all' SIGHUP: log_vacuum_min_duration will change from -1 to 2 POSTMASTER: fsync will change from 'on' to 'off' I'm not proposing this specifically, but something that would provide such information. May be we can have a nice little utility which can show configuration diff between two running servers, or a running server and its modified conf file ? Well checking for configuration differences between 2 running servers is kind of a separate feature, and I'm not sure it's going to be needed by that many DBAs. If they wanted to see differences, they could just create a foreign table to the other and do an anti join. However, comparing the difference between a running server and its modified conf file is the case I'm describing. I'd personally prefer not to have a whole new utility if it can be avoided. Magnus mentioned making the necessary changes to pg_ctl so that it can provide unlogged feedback, and pg_ctl does feel more like the go-to tool for such a feature. Apache has apache2ctl which is responsible for starting, stopping and restarting the server, but also provides a configtest option to check that a restart won't bring the service down. This seems to be a feature in the same kind of area. In fact that's pretty much the only action apache2ctl does that pg_ctl doesn't do. -- Thom
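For the two-running-servers case, the foreign-table approach Thom mentions can be sketched with postgres_fdw (host, database, and user below are placeholders):

    CREATE EXTENSION postgres_fdw;

    CREATE SERVER other_pg FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'otherhost', dbname 'postgres');
    CREATE USER MAPPING FOR CURRENT_USER SERVER other_pg
        OPTIONS (user 'postgres');

    CREATE FOREIGN TABLE other_settings (name text, setting text)
        SERVER other_pg
        OPTIONS (schema_name 'pg_catalog', table_name 'pg_settings');

    -- settings that differ between the local and the remote server
    SELECT l.name, l.setting AS local_value, r.setting AS remote_value
    FROM pg_settings l
    JOIN other_settings r USING (name)
    WHERE l.setting IS DISTINCT FROM r.setting;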
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
First, thank you for the review.

From: Alvaro Herrera alvhe...@2ndquadrant.com

This seems reasonable. Why 10 seconds? We could wait 5 seconds, or 15. Is there a rationale behind the 10? If we said 60, that would fit perfectly well within the already existing 60-second loop in postmaster, but that seems way too long.

There is no good rationale. I arbitrarily chose a short period because this is immediate shutdown. I felt more than 10 seconds was too long. I think 5 seconds may be better. Although not directly related to this fix, these influenced my choice:

1. According to the man page of init, init sends SIGKILL to all remaining processes 5 seconds after it sends SIGTERM to them.

2. At computer shutdown, Windows forcibly proceeds with shutdown after waiting 20 seconds for services to terminate.

I have only one concern about this patch, which is visible in the proposed documentation change:

     <para>
      This is the <firstterm>Immediate Shutdown</firstterm> mode.
      The master <command>postgres</command> process will send a
-     <systemitem>SIGQUIT</systemitem> to all child processes and exit
-     immediately, without properly shutting itself down. The child processes
-     likewise exit immediately upon receiving
-     <systemitem>SIGQUIT</systemitem>. This will lead to recovery (by
+     <systemitem>SIGQUIT</systemitem> to all child processes, wait for
+     them to terminate, and exit. The child processes
+     exit immediately upon receiving
+     <systemitem>SIGQUIT</systemitem>. If any of the child processes
+     does not terminate within 10 seconds for some unexpected reason,
+     the master postgres process will send a <systemitem>SIGKILL</systemitem>
+     to all remaining ones, wait for their termination
+     again, and exit. This will lead to recovery (by
      replaying the WAL log) upon next start-up. This is recommended
      only in emergencies.
     </para>

Note that the previous text said that postmaster will send SIGQUIT, then terminate without checking anything. In the new code, postmaster sends SIGQUIT, then waits, then SIGKILL, then waits again. If there's an unkillable process (say because it's stuck in a noninterruptible sleep) postmaster might never exit. I think it should send SIGQUIT, then wait, then SIGKILL, then exit without checking.

At first I thought the same, but decided not to do that. The purpose of this patch is to make immediate shutdown reliable. Here, reliable means that the database server is certainly shut down when pg_ctl returns, not telling the lie that "I shut down the server processes for you, so you do not have to be worried that some postgres process might still remain and write to disk." I suppose reliable shutdown is crucial, especially in an HA cluster.

If pg_ctl stop -mi gets stuck forever when there is an unkillable process (in what situations does this happen? OS bug, or NFS hard mount?), I think the DBA has to notice this situation from the unfinished pg_ctl, investigate the cause, and take corrective action. Anyway, in an HA cluster, the clusterware will terminate the node with STONITH, not leaving pg_ctl running forever.

I have tweaked the patch a bit and I'm about to commit as soon as we resolve the above two items.

I appreciate your tweaking, especially in the documentation and source code comments, as English is not my mother tongue.

Regards
MauMau

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements
On Thu, Jun 20, 2013 at 1:52 AM, Amit Langote amitlangot...@gmail.com wrote: Is it possible to: CREATE [ OR REPLACE | IF NOT EXISTS ] OPERATOR CLASS I am in a situation where I need to conditionally create an operator class (that is, create only if already does not exist). [...] The intention is cover all CREATE OPERATOR variants. See my planning [1]. Regards, [1] https://docs.google.com/spreadsheet/ccc?key=0Ai7oCVcVQiKFdEctQUxNNlR1R2xRTUpJNFNDcFo4MUEusp=sharing -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
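Until such syntax exists, one workaround is to test the catalogs first. A sketch with entirely hypothetical type, operator, and function names (which must already exist for the CREATE to succeed):

    DO $$
    BEGIN
        IF NOT EXISTS (
            SELECT 1
            FROM pg_opclass c
            JOIN pg_am a ON a.oid = c.opcmethod
            WHERE c.opcname = 'mytype_ops' AND a.amname = 'btree'
        ) THEN
            CREATE OPERATOR CLASS mytype_ops
                DEFAULT FOR TYPE mytype USING btree AS
                    OPERATOR 1 <,
                    OPERATOR 2 <=,
                    OPERATOR 3 =,
                    OPERATOR 4 >=,
                    OPERATOR 5 >,
                    FUNCTION 1 mytype_cmp(mytype, mytype);
        END IF;
    END
    $$;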
Re: [HACKERS] Bugfix and new feature for PGXS
On 6/20/13 1:21 AM, Craig Ringer wrote: As you note, the other option is to just refer to extension headers by their unqualified name. I'm *really* not a fan of that idea due to the obvious clash possibilities with Pg's own headers or system headers, especially given that the whole idea of extensions is that they're user supplied. Not having any kind of namespacing is worrying. We already install all PostgreSQL server header files into a separate directory, so the only clashes you have to worry about are with other extensions and with the backend. And you have to do that anyway, because you will have namespace issues for the shared libraries, the symbols in those libraries, and the extension names. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bugfix and new feature for PGXS
On 6/20/13 1:21 AM, Craig Ringer wrote: Personally, I'm all for just using the local path when building the module, and the qualified path elsewhere. It requires only a relatively simple change, and I don't think that using hstore.h within hstore, and contrib/hstore/hstore.h elsewhere is of great concern. It doesn't work if those header files include other header files (as in the case of plpython, for example). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Config reload/restart preview
Magnus Hagander mag...@hagander.net writes:

Should we have a way of previewing changes that would be applied if we reloaded/restarted the server?

Yes, we should.

+1

This would go well with something I started working on some time ago (but haven't actually gotten far on at all), which is the ability for pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload should also be able to tell you which parameters were changed, without having to go to the log. Obviously that's almost exactly the same feature.

It could probably connect to the server and issue the SQL command to reload, and that one could probably be enhanced to return the modified variables as NOTICEs, or be run with the right client_min_messages: SELECT pg_reload_conf(); The pg_ctl client would then have to know to display the messages sent back by the server.

Then, back to what Thom is actually asking: I guess that implementing an SRF that parses the configuration and returns something very much like what we get from pg_settings should be possible, and then it's a matter of doing an anti-join, as already proposed.

Regards,

--
Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
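Assuming such an SRF existed — call it pg_file_settings(), returning the name/setting pairs currently in the configuration files (purely hypothetical here) — the comparison could be as simple as:

    SELECT f.name,
           s.setting AS currently_active,
           f.setting AS after_reload
    FROM pg_file_settings() AS f
    JOIN pg_settings s USING (name)
    WHERE s.setting IS DISTINCT FROM f.setting;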
Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements
On Thu, Jun 20, 2013 at 9:48 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Jun 20, 2013 at 1:52 AM, Amit Langote amitlangot...@gmail.com wrote: Is it possible to: CREATE [ OR REPLACE | IF NOT EXISTS ] OPERATOR CLASS I am in a situation where I need to conditionally create an operator class (that is, create only if already does not exist). [...] The intention is cover all CREATE OPERATOR variants. See my planning [1]. Regards, [1] https://docs.google.com/spreadsheet/ccc?key=0Ai7oCVcVQiKFdEctQUxNNlR1R2xRTUpJNFNDcFo4MUEusp=sharing Hmm, okay. Last time I checked, the CREATE OPERATOR CLASS row was empty, so asked. -- Amit Langote -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] add long options to pgbench (submission 1)
Please fix that and re-send the patch. Find attached diff wrt current master. -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 1303217..f39fed3 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -2109,6 +2109,30 @@ int main(int argc, char **argv) { static struct option long_options[] = { + /* systematic long/short named options*/ + {client, required_argument, NULL, 'c'}, + {connect, no_argument, NULL, 'C'}, + {debug, no_argument, NULL, 'd'}, + {define, required_argument, NULL, 'D'}, + {file, required_argument, NULL, 'f'}, + {fill-factor, required_argument, NULL, 'F'}, + {host, required_argument, NULL, 'h'}, + {initialize, no_argument, NULL, 'i'}, + {jobs, required_argument, NULL, 'j'}, + {log, no_argument, NULL, 'l'}, + {no-vacuum, no_argument, NULL, 'n'}, + {port, required_argument, NULL, 'p'}, + {query-mode, required_argument, NULL, 'M'}, + {quiet-log, no_argument, NULL, 'q'}, + {report-latencies, no_argument, NULL, 'r'}, + {scale, required_argument, NULL, 's'}, + {select-only, no_argument, NULL, 'S'}, + {skip-some-update, no_argument, NULL, 'N'}, + {time, required_argument, NULL, 'T'}, + {transactions, required_argument, NULL, 't'}, + {username, required_argument, NULL, 'U'}, + {vacuum-all, no_argument, NULL, 'v'}, + /* long-named only options */ {foreign-keys, no_argument, foreign_keys, 1}, {index-tablespace, required_argument, NULL, 3}, {tablespace, required_argument, NULL, 2}, diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml index 8775606..3394b93 100644 --- a/doc/src/sgml/pgbench.sgml +++ b/doc/src/sgml/pgbench.sgml @@ -150,6 +150,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-i/option/term + termoption--initialize/option/term listitem para Required to invoke initialization mode. @@ -159,6 +160,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-F/option replaceablefillfactor//term + termoption--fill-factor/option replaceablefillfactor//term listitem para Create the structnamepgbench_accounts/, @@ -171,6 +173,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-n/option/term + termoption--no-vacuum/option/term listitem para Perform no vacuuming after initialization. @@ -180,6 +183,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-q/option/term + termoption--quiet-log/option/term listitem para Switch logging to quiet mode, producing only one progress message per 5 @@ -191,6 +195,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-s/option replaceablescale_factor//term + termoption--scale/option replaceablescale_factor//term listitem para Multiply the number of rows generated by the scale factor. @@ -259,6 +264,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-c/option replaceableclients//term + termoption--client/option replaceableclients//term listitem para Number of clients simulated, that is, number of concurrent database @@ -269,6 +275,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-C/option/term + termoption--connect/option/term listitem para Establish a new connection for each transaction, rather than @@ -280,6 +287,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-d/option/term + termoption--debug/option/term listitem para Print debugging output. 
@@ -289,6 +297,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-D/option replaceablevarname/literal=/replaceablevalue//term + termoption--define/option replaceablevarname/literal=/replaceablevalue//term listitem para Define a variable for use by a custom script (see below). @@ -299,6 +308,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-f/option replaceablefilename//term + termoption--file/option replaceablefilename//term listitem para Read transaction script from replaceablefilename/. @@ -311,6 +321,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ varlistentry termoption-j/option replaceablethreads//term + termoption--jobs/option replaceablethreads//term listitem para Number of worker threads within applicationpgbench/application. @@ -324,6 +335,7 @@ pgbench optional replaceableoptions/ /optional
Re: [HACKERS] [PATCH] add long options to pgbench (submission 1)
On Thu, Jun 20, 2013 at 9:59 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: Please fix that and re-send the patch. Find attached diff wrt current master. Thanks. -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] Config reload/restart preview
On Thu, Jun 20, 2013 at 2:54 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: Should we have a way of previewing changes that would be applied if we reloaded/restarted the server? Yes, we should. +1 This would go well with something I started working on some time ago (but haven't actually gotten far on at all), which is the ability for pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload should also be able to tell you which parameters were changed, without having to go to the log. Obviously that's almost exactly the same feature. It could probably connect to the server and issue the SQL command to reload, and that one could probably get enhanced to return modified variable as NOTICE, or be run with the right client_min_message: SELECT pg_reload_conf(); The pg_ctl client would then have to know to display the messages sent back by the server. The problem with that is that now you must *always* have your system set up to allow the postgres user to log in in pg_hba.conf or it fails. But yes, one option would be to use SQL instead of opening a socket. Maybe that's a better idea - have pg_ctl try to use that if available, and if not send a regular signal and not try to collect the output. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MVCC catalog access
On Mon, Jun 17, 2013 at 8:12 AM, Andres Freund and...@2ndquadrant.com wrote: So, the biggest issue with the patch seems to be performance worries. I tried to create a worst case scenario: postgres (patched and HEAD) running with: -c shared_buffers=4GB \ -c max_connections=2000 \ -c maintenance_work_mem=2GB \ -c checkpoint_segments=300 \ -c wal_buffers=64MB \ -c synchronous_commit=off \ -c autovacuum=off \ -p 5440 With one background pgbench running: pgbench -p 5440 -h /tmp -f /tmp/readonly-busy.sql -c 1000 -j 10 -T 100 postgres readonly-busy.sql: BEGIN; SELECT txid_current(); SELECT pg_sleep(0.0001); COMMIT; I measured the performance of one other pgbench: pgbench -h /tmp -p 5440 postgres -T 10 -c 100 -j 100 -n -f /tmp/simplequery.sql -C simplequery.sql: SELECT * FROM af1, af2 WHERE af1.x = af2.x; tables: create table af1 (x) as select g from generate_series(1,4) g; create table af2 (x) as select g from generate_series(4,7) g; With that setup one can create quite a noticeable overhead for the mvcc patch (best of 5): master-optimize: tps = 1261.629474 (including connections establishing) tps = 15121.648834 (excluding connections establishing) dev-optimize: tps = 773.719637 (including connections establishing) tps = 2804.239979 (excluding connections establishing) Most of the time in both, patched and unpatched is by far spent in GetSnapshotData. I think the reason this shows a far higher overhead than what you previously measured is that a) in your test the other backends were idle, in mine they actually modify PGXACT which causes noticeable cacheline bouncing b) I have higher numer of connections #max_connections A quick test shows that even with max_connection=600, 400 background, and 100 foreground pgbenches there's noticeable overhead: master-optimize: tps = 2221.226711 (including connections establishing) tps = 31203.259472 (excluding connections establishing) dev-optimize: tps = 1629.734352 (including connections establishing) tps = 4754.449726 (excluding connections establishing) Now I grant that's a somewhat harsh test for postgres, but I don't think it's entirely unreasonable and the performance impact is quite stark. It's not entirely unreasonable, but it *is* mostly unreasonable. I mean, nobody is going to run 1000 connections in the background that do nothing but thrash PGXACT on a real system. I just can't get concerned about that. What I am concerned about is that there may be other, more realistic workloads that show similar regressions. But I don't know how to find out whether that's actually the case. On the IBM POWER box where I tested this, it's not even GetSnapshotData() that kills you; it's the system CPU scheduler. The thing about this particular test is that it's artificial - normally, any operation that wants to modify PGXACT will spend a lot more time fighting of WALInsertLock and maybe waiting for disk I/O than is the case here. Of course, with Heikki's WAL scaling patch and perhaps other optimizations we expect that other overhead to go down, which might make the problems here more visible; and some of Heikki's existing testing has shown significant contention around ProcArrayLock as things stand. But I'm still on the fence about whether this is really a valid test. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic background workers
On Mon, Jun 17, 2013 at 10:45 PM, Peter Eisentraut pete...@gmx.net wrote: On Fri, 2013-06-14 at 17:00 -0400, Robert Haas wrote: Alvaro's work on 9.3, we now have the ability to configure background workers via shared_preload_libraries. But if you don't have the right library loaded at startup time, and subsequently wish to add a background worker while the server is running, you are out of luck. We could tweak shared_preload_libraries so that it reacts sensibly to reloads. I basically gave up on that by writing session_preload_libraries, but if there is more general use for that, we could try. (That doesn't invalidate your work, but it's a thought.) Yeah, I thought about that. But it doesn't seem possible to do anything all that sane. You can't unload libraries if they've been removed; you can potentially load new ones if they've been added. But that's a bit confusing, if the config file says that's what's loaded is bar, and what's actually loaded is foo, bar, baz, bletch, and quux. Some variant of this might still be worth doing, but figuring out the details sounded like more than I wanted to get into, so I punted. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic background workers
Robert, On 06/14/2013 11:00 PM, Robert Haas wrote: Parallel query, or any subset of that project such as parallel sort, will require a way to start background workers on demand. thanks for continuing this, very much appreciated. Postgres-R and thus TransLattice successfully use a similar approach for years, now. I only had a quick glance over the patch, yet. Some comments on the design: This requires some communication channel from ordinary workers to the postmaster, because it is the postmaster that must ultimately start the newly-registered workers. However, that communication channel has to be designed pretty carefully, lest a shared memory corruption take out the postmaster and lead to inadvertent failure to restart after a crash. Here's how I implemented that: there's an array in shared memory of a size equal to max_worker_processes. This array is separate from the backend-private list of workers maintained by the postmaster, but the two are kept in sync. When a new background worker registration is added to the shared data structure, the backend adding it uses the existing pmsignal mechanism to kick the postmaster, which then scans the array for new registrations. That sounds like a good simplification. Even if it's an O(n) operation, the n in question here has relatively low practical limits. It's unlikely to be much of a concern, I guess. Back then, I solved it by having a fork request slot. After starting, the bgworker then had to clear that slot and register with a coordinator process (i.e. the original requestor), so that one learned its fork request was successful. At some point I expanded that to multiple request slots to better handle multiple concurrent fork requests. However, it was difficult to get right and requires more IPC than your approach. On the pro side: The shared memory area used by the postmaster was very small in size and read-only to the postmaster. These were my main goals, which I'm not sure were the best ones, now that I read your concept. I have attempted to make the code that transfers the shared_memory state into the postmaster's private state as paranoid as humanly possible. The precautions taken are documented in the comments. Conversely, when a background worker flagged as BGW_NEVER_RESTART is considered for restart (and we decide against it), the corresponding slot in the shared memory array is marked as no longer in use, allowing it to be reused for a new registration. Sounds like the postmaster is writing to shared memory. Not sure why I've been trying so hard to avoid that, though. After all, it can hardly hurt itself *writing* to shared memory. Since the postmaster cannot take locks, synchronization between the postmaster and other backends using the shared memory segment has to be lockless. This mechanism is also documented in the comments. An lwlock is used to prevent two backends that are both registering a new worker at about the same time from stomping on each other, but the postmaster need not care about that lwlock. This patch also extends worker_spi as a demonstration of the new interface. With this patch, you can CREATE EXTENSION worker_spi and then call worker_spi_launch(int4) to launch a new background worker, or combine it with generate_series() to launch a bunch at once. Then you can kill them off with pg_terminate_backend() and start some new ones. That, in my humble opinion, is pretty cool. It definitely is. Thanks again. 
Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
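For context, launching a worker at runtime from a normal backend would look roughly like the worker_spi example; the sketch below follows the API as proposed in this thread, so the field names and the exact signature of RegisterDynamicBackgroundWorker may differ from what is finally committed:

    #include "postgres.h"
    #include "postmaster/bgworker.h"

    /* assumed to be defined elsewhere in the module */
    extern void my_worker_main(Datum main_arg);

    static bool
    launch_my_worker(int index)
    {
        BackgroundWorker worker;

        memset(&worker, 0, sizeof(worker));
        snprintf(worker.bgw_name, BGW_MAXLEN, "my worker %d", index);
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
        worker.bgw_start_time = BgWorkerStart_ConsistentState;
        worker.bgw_restart_time = BGW_NEVER_RESTART;
        worker.bgw_main = my_worker_main;
        worker.bgw_main_arg = Int32GetDatum(index);

        /* copies the registration into the shared array and signals the postmaster */
        return RegisterDynamicBackgroundWorker(&worker);
    }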
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Tue, Jun 18, 2013 at 6:27 PM, Nicholas White n.j.wh...@gmail.com wrote: Thanks for the reviews. I've attached a revised patch that has the lexer refactoring Alvaro mentions (arbitarily using a goto rather than a bool flag) and uses Jeff's idea of just storing the index of the last non-null value (if there is one) in the window function's context (and not the Datum itself). However, Robert's right that SELECT ... ORDER BY respect NULLS LAST will now fail. An earlier iteration of the patch had RESPECT and IGNORE as reserved, but that would have broken tables with columns called respect (etc.), which the current version allows. Is this backwards incompatibility acceptable? I think it's better to add new partially reserved keywords than to have this kind of random breakage. When you make something a partially-reserved keyword, then things break, but in a fairly well-defined way. Lexer hacks can break things in ways that are much subtler, which we may not even realize for a long time, and which in that case would mean that the word respect needs to be quoted in some contexts but not others. That's going to cause a lot of headaches, because pg_dump etc. know about quoting reserved keywords, but they don't know anything about quoting unreserved keywords in contexts where they might happen to be followed by the wrong next word. If not, maybe I should try doing a two-token lookahead in the token-aggregating code between the lexer and the parser (i.e. make a RESPECT_NULLS token out of a sequence of RESPECT NULLS OVER tokens, remembering to replace the OVER token)? Or what about adding an %expect statement to the Bison grammar, confirming that the shift / reduce conflicts caused by using the RESPECT, IGNORE NULL_P tokens the in out_clause rule are OK? These lines of inquiry don't seem promising to me. It's going to be complicated and unmaintainable and may just move the failure scenarios to cases that are too obscure for us to reason about. I think the question is whether this feature is really worth adding new reserved keywords for. I have a hard time saying we shouldn't support something that's part of the SQL standard, but personally, it's not something I've seen come up prior to this thread. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
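For readers unfamiliar with the feature, the SQL-standard syntax under discussion looks like this (hypothetical table; stock PostgreSQL rejects the IGNORE NULLS clause):

    -- for each row, the most recent preceding non-null price
    SELECT ts,
           price,
           lag(price) IGNORE NULLS OVER (ORDER BY ts) AS prev_known_price
    FROM ticks;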
Re: [HACKERS] MVCC catalog access
On 2013-06-20 09:45:26 -0400, Robert Haas wrote: With that setup one can create quite a noticeable overhead for the mvcc patch (best of 5): master-optimize: tps = 1261.629474 (including connections establishing) tps = 15121.648834 (excluding connections establishing) dev-optimize: tps = 773.719637 (including connections establishing) tps = 2804.239979 (excluding connections establishing) Most of the time in both, patched and unpatched is by far spent in GetSnapshotData. I think the reason this shows a far higher overhead than what you previously measured is that a) in your test the other backends were idle, in mine they actually modify PGXACT which causes noticeable cacheline bouncing b) I have higher numer of connections #max_connections A quick test shows that even with max_connection=600, 400 background, and 100 foreground pgbenches there's noticeable overhead: master-optimize: tps = 2221.226711 (including connections establishing) tps = 31203.259472 (excluding connections establishing) dev-optimize: tps = 1629.734352 (including connections establishing) tps = 4754.449726 (excluding connections establishing) Now I grant that's a somewhat harsh test for postgres, but I don't think it's entirely unreasonable and the performance impact is quite stark. It's not entirely unreasonable, but it *is* mostly unreasonable. Well, sure. So are the tests that you ran. But that's *completely* fine. In contrast to evaluating whether a performance improvement is worth its complexity we're not trying to measure real world improvements. We're trying to test the worst cases we can think of, even if they aren't really interesting by stressing potential pain points. If we can't find a relevant regression for those using something akin to microbenchmarks it's less likely that there are performance regressions. The not entirely unreasonable point is just about making sure you're not testing something entirely irrelevant. Say, performance of a 1TB database when shared_buffers is set to 64k. Or testing DDL performance while locking pg_class exclusively. The test was specifically chosen to: * do uncached syscache lookups (-C) to mesure the impact of the added GetSnapshotData() calls * make individual GetSnapshotData() calls slower. (all processes have an xid) * contend on ProcArrayLock but not much else (high number of clients in the background) I mean, nobody is going to run 1000 connections in the background that do nothing but thrash PGXACT on a real system. I just can't get concerned about that. In the original mail I did retry it with 400 and the regression is still pretty big. And the background things could also be doing something that's not that likely to be blocked by global locks. Say, operate on temporary or unlogged tables. Or just acquire a single row level lock and then continue to do readonly work in a read committed transaction. I think we both can come up with scenarios where at least part of the above scenario is present. But imo that doesn't really matter. What I am concerned about is that there may be other, more realistic workloads that show similar regressions. But I don't know how to find out whether that's actually the case. So, given the results from that test and the profile I got where GetSnapshotData was by far the most expensive thing a more representative test might be something like a readonly pgbench with a moderately high number of short lived connections. I wouldn't be surprised if that still showed performance problems. 
If that's not enough something like: BEGIN; SELECT * FROM my_client WHERE client_id = :id FOR UPDATE; SELECT * FROM key_table WHERE key = :random ... SELECT * FROM key_table WHERE key = :random COMMIT; will surely still show the problem. On the IBM POWER box where I tested this, it's not even GetSnapshotData() that kills you; it's the system CPU scheduler. I haven't tried yet, but I'd guess the above setup shows the difference with less than 400 clients. Might make it more reasonable to run there. But I'm still on the fence about whether this is really a valid test. I think it shows that we need to be careful and do further performance evaluations and/or alleviate the pain by making things cheaper (say, a ddl counter in shared mem, allowing us to cache snapshots for the syscache). If that artificial test hadn't shown problems I'd have voted for just going ahead and not worrying further. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic background workers
On Thu, Jun 20, 2013 at 9:57 AM, Markus Wanner mar...@bluegap.ch wrote: That sounds like a good simplification. Even if it's an O(n) operation, the n in question here has relatively low practical limits. It's unlikely to be much of a concern, I guess. The constant factor is also very small. Generally, I would expect num_worker_processes ~ # CPUs, and scanning a 32, 64, or even 128 element array is not a terribly time-consuming operation. We might need to re-think this when systems with 4096 processors become commonplace, but considering how many other things would also need to be fixed to work well in that universe, I'm not too concerned about it just yet. One thing I think we probably want to explore in the future, for both worker backends and regular backends, is pre-forking. We could avoid some of the latency associated with starting up a new backend or opening a new connection in that way. However, there are quite a few details to be thought through there, so I'm not eager to pursue that just yet. Once we have enough infrastructure to implement meaningful parallelism, we can benchmark it and find out where the bottlenecks are, and which solutions actually help most. Back then, I solved it by having a fork request slot. After starting, the bgworker then had to clear that slot and register with a coordinator process (i.e. the original requestor), so that one learned its fork request was successful. At some point I expanded that to multiple request slots to better handle multiple concurrent fork requests. However, it was difficult to get right and requires more IPC than your approach. I do think we need a mechanism to allow the backend that requested the bgworker to know whether or not the bgworker got started, and whether it unexpectedly died. Once we get to the point of calling user code within the bgworker process, it can use any number of existing mechanisms to make sure that it won't die without notifying the backend that started it (short of a PANIC, in which case it won't matter anyway). But we need a way to report failures that happen before that point. I have some ideas about that, but decided to leave them for future passes. The remit of this patch is just to make it possible to dynamically register bgworkers. Allowing a bgworker to be tied to the session that requested it via some sort of feedback loop is a separate project - which I intend to tackle before CF2, assuming this gets committed (and so far nobody is objecting to that). I have attempted to make the code that transfers the shared_memory state into the postmaster's private state as paranoid as humanly possible. The precautions taken are documented in the comments. Conversely, when a background worker flagged as BGW_NEVER_RESTART is considered for restart (and we decide against it), the corresponding slot in the shared memory array is marked as no longer in use, allowing it to be reused for a new registration. Sounds like the postmaster is writing to shared memory. Not sure why I've been trying so hard to avoid that, though. After all, it can hardly hurt itself *writing* to shared memory. I think there's ample room for paranoia about postmaster interaction with shared memory, but all it's doing is setting a flag, which is no different from what CheckPostmasterSignal() already does. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vacuum/visibility is busted
On 2013-06-19 15:01:44 -0700, Jeff Janes wrote: On Thu, Feb 7, 2013 at 12:01 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-02-07 11:15:46 -0800, Jeff Janes wrote: Does anyone have suggestions on how to hack the system to make it fast-forward the current transaction id? It would certainly make testing this kind of thing faster if I could make transaction id increment by 100 each time a new one is generated. Then wrap-around could be approached in minutes rather than hours. I had various plpgsql functions to do that, but those still took quite some time. As I needed it before I just spent some minutes hacking up a contrib module to do the job. Hi Andres, Your patch needs the file xidfuncs--1.0.sql, but does not include it. I could probably guess what needs to be in that file, but do you still have a copy of it? Hm. Sorry. Not sure how that happened. The commit in my local branch for that seems to have it ;). Attached. I doubt it really makes sense as a contrib module on its own though? Maybe PGXN? Hm. As of now I am not yet that convinced of PGXN for C-containing extensions. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

From b32cc0dca23f0477b26cb9d733b1d7ef391a77cc Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 20 Feb 2013 17:20:00 +0100
Subject: [PATCH] add xidfuncs
---
 contrib/Makefile                   |  3 ++-
 contrib/xidfuncs/Makefile          | 18 ++++++++++++
 contrib/xidfuncs/xidfuncs--1.0.sql |  5 ++++
 contrib/xidfuncs/xidfuncs.c        | 50 ++++++++++++++++++++
 contrib/xidfuncs/xidfuncs.control  |  5 ++++
 5 files changed, 80 insertions(+), 1 deletion(-)
 create mode 100644 contrib/xidfuncs/Makefile
 create mode 100644 contrib/xidfuncs/xidfuncs--1.0.sql
 create mode 100644 contrib/xidfuncs/xidfuncs.c
 create mode 100644 contrib/xidfuncs/xidfuncs.control

diff --git a/contrib/Makefile b/contrib/Makefile
index fcd7c1e..64d10cc 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -51,7 +51,8 @@ SUBDIRS = \
 		tsearch2	\
 		unaccent	\
 		vacuumlo	\
-		worker_spi
+		worker_spi	\
+		xidfuncs

 ifeq ($(with_openssl),yes)
 SUBDIRS += sslinfo
diff --git a/contrib/xidfuncs/Makefile b/contrib/xidfuncs/Makefile
new file mode 100644
index 0000000..6977a3b
--- /dev/null
+++ b/contrib/xidfuncs/Makefile
@@ -0,0 +1,18 @@
+# contrib/xidfuncs/Makefile
+
+MODULE_big = xidfuncs
+OBJS = xidfuncs.o
+
+EXTENSION = xidfuncs
+DATA = xidfuncs--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/xidfuncs
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/xidfuncs/xidfuncs--1.0.sql b/contrib/xidfuncs/xidfuncs--1.0.sql
new file mode 100644
index 0000000..db7c9ed
--- /dev/null
+++ b/contrib/xidfuncs/xidfuncs--1.0.sql
@@ -0,0 +1,5 @@
+CREATE FUNCTION burnxids(numxids int4)
+RETURNS int8
+VOLATILE
+AS 'MODULE_PATHNAME', 'burnxids'
+LANGUAGE C;
diff --git a/contrib/xidfuncs/xidfuncs.c b/contrib/xidfuncs/xidfuncs.c
new file mode 100644
index 0000000..d24a48f
--- /dev/null
+++ b/contrib/xidfuncs/xidfuncs.c
@@ -0,0 +1,50 @@
+/*-------------------------------------------------------------------------
+ * contrib/xidfuncs/xidfuncs.c
+ *
+ * Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/xidfuncs/xidfuncs.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "funcapi.h"
+#include "storage/proc.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(burnxids);
+
+extern Datum burnxids(PG_FUNCTION_ARGS);
+
+Datum
+burnxids(PG_FUNCTION_ARGS)
+{
+	int			nxids = PG_GETARG_INT32(0);
+	int			i;
+	TransactionId last;
+
+	if (nxids <= 1)
+		elog(ERROR, "can't burn a negative amount of xids");
+
+	if (nxids > 1<<30)
+		elog(ERROR, "burning too many xids is dangerous");
+
+	if (GetTopTransactionIdIfAny() != InvalidTransactionId)
+		elog(ERROR, "can't burn xids in a transaction with xid");
+
+	for (i = 0; i < nxids; i++)
+	{
+		last = GetNewTransactionId(false);
+	}
+
+	/* don't keep xid as assigned */
+	MyPgXact->xid = InvalidTransactionId;
+
+	return Int64GetDatum((int64) last);
+}
diff --git a/contrib/xidfuncs/xidfuncs.control b/contrib/xidfuncs/xidfuncs.control
new file mode 100644
index 0000000..bca7194
--- /dev/null
+++ b/contrib/xidfuncs/xidfuncs.control
@@ -0,0 +1,5 @@
+# xidfuncs extension
+comment = 'xid debugging functions'
+default_version = '1.0'
+module_pathname = '$libdir/xidfuncs'
+relocatable = true
--
1.8.2.rc2.4.g7799588.dirty
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dump difference between 9.3 and master after upgrade
On Tue, Jun 18, 2013 at 12:18 PM, Andrew Dunstan and...@dunslane.net wrote: As I was updating my cross version upgrade tester to include support for the 9.3 branch, I noted this dump difference between the dump of the original 9.3 database and the dump of the converted database, which looks odd. Is it correct? It looks sketchy to me, but I'm not sure exactly what you're comparing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic background workers
On 06/20/2013 04:41 PM, Robert Haas wrote: The constant factor is also very small. Generally, I would expect num_worker_processes ~ # CPUs That assumption might hold for parallel querying, yes. In case of Postgres-R, it doesn't. In the worst case, i.e. with a 100% write load, a cluster of n nodes, each with m backends performing transactions, all of them replicated to all other (n-1) nodes, you end up with ((n-1) * m) bgworkers. Which is pretty likely to be way above the # CPUs on any single node. I can imagine other extensions or integral features like autonomous transactions that might possibly want many more bgworkers as well. and scanning a 32, 64, or even 128 element array is not a terribly time-consuming operation. I'd extend that to say scanning an array with a few thousand elements is not terribly time-consuming, either. IMO the simplicity is worth it, ATM. It's all relative to your definition of ... eh ... terribly. .oO( ... premature optimization ... all evil ... ) We might need to re-think this when systems with 4096 processors become commonplace, but considering how many other things would also need to be fixed to work well in that universe, I'm not too concerned about it just yet. Agreed. One thing I think we probably want to explore in the future, for both worker backends and regular backends, is pre-forking. We could avoid some of the latency associated with starting up a new backend or opening a new connection in that way. However, there are quite a few details to be thought through there, so I'm not eager to pursue that just yet. Once we have enough infrastructure to implement meaningful parallelism, we can benchmark it and find out where the bottlenecks are, and which solutions actually help most. Do you mean pre-forking and connecting to a specific database? Or really just the forking? I do think we need a mechanism to allow the backend that requested the bgworker to know whether or not the bgworker got started, and whether it unexpectedly died. Once we get to the point of calling user code within the bgworker process, it can use any number of existing mechanisms to make sure that it won't die without notifying the backend that started it (short of a PANIC, in which case it won't matter anyway). But we need a way to report failures that happen before that point. I have some ideas about that, but decided to leave them for future passes. The remit of this patch is just to make it possible to dynamically register bgworkers. Allowing a bgworker to be tied to the session that requested it via some sort of feedback loop is a separate project - which I intend to tackle before CF2, assuming this gets committed (and so far nobody is objecting to that). Okay, sounds good. Given my background, I considered that a solved problem. Thanks for pointing it out. Sounds like the postmaster is writing to shared memory. Not sure why I've been trying so hard to avoid that, though. After all, it can hardly hurt itself *writing* to shared memory. I think there's ample room for paranoia about postmaster interaction with shared memory, but all it's doing is setting a flag, which is no different from what CheckPostmasterSignal() already does. Sounds good to me. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements
On Wed, Jun 12, 2013 at 3:00 PM, Peter Eisentraut pete...@gmx.net wrote: On 6/12/13 1:29 PM, Fabrízio de Royes Mello wrote: The attached patch add support to IF NOT EXISTS to CREATE statements listed below: - CREATE AGGREGATE [ IF NOT EXISTS ] ... - CREATE CAST [ IF NOT EXISTS ] ... - CREATE COLLATION [ IF NOT EXISTS ] ... - CREATE OPERATOR [ IF NOT EXISTS ] ... - CREATE TEXT SEARCH {PARSER | DICTIONARY | TEMPLATE | CONFIGURATION} [ IF NOT EXISTS ] ... - CREATE TYPE [ IF NOT EXISTS ] ... [AS [{ENUM | RANGE}] (...)] I'm wondering where IF NOT EXISTS and OR REPLACE will meet. I kind of don't see the point of having IF NOT EXISTS for things that have OR REPLACE, and am generally in favor of implementing OR REPLACE rather than IF NOT EXISTS where possible. The point is usually to get the object to a known state, and OR REPLACE will generally accomplish that better than IF NOT EXISTS. However, if the object has complex structure (like a table that contains data) then replacing it is a bad plan, so IF NOT EXISTS is really the best you can do - and it's still useful, even if it does require more care. Btw., I also want REPLACE BUT DO NOT CREATE. That's a mouthful. What's it good for? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
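(For illustration, not from the thread: a minimal sketch of the distinction Robert draws, with hypothetical object names. CREATE TABLE ... IF NOT EXISTS already exists in core; the patch extends the idea to further object types.)

-- an object with state: replacing it would throw away data, so IF NOT EXISTS is the best one can do
CREATE TABLE IF NOT EXISTS app_settings (key text PRIMARY KEY, value text);
-- a stateless object: OR REPLACE converges it to a known state
CREATE OR REPLACE FUNCTION app_version() RETURNS text LANGUAGE sql AS $$ SELECT '1.2'::text $$;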
Re: [HACKERS] MVCC catalog access
On Thu, Jun 20, 2013 at 10:35 AM, Andres Freund and...@2ndquadrant.com wrote: But I'm still on the fence about whether this is really a valid test. I think it shows that we need to be careful and do further performance evaluations and/or alleviate the pain by making things cheaper (say, a ddl counter in shared mem, allowing to cache snapshots for the syscache). If that artificial test hadn't shown problems I'd have voted for just going ahead and not worry further. I tried a general snapshot counter that rolls over every time any transaction commits, but that doesn't help much. It's a small improvement on general workloads, but it's not effective against this kind of hammering. A DDL counter would be a bit more expensive because we'd have to insert an additional branch into GetSnapshotData() while ProcArrayLock is held, but it might be tolerable. Do you have code for this (or some portion of it) already? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MVCC catalog access
On 2013-06-20 10:58:59 -0400, Robert Haas wrote: On Thu, Jun 20, 2013 at 10:35 AM, Andres Freund and...@2ndquadrant.com wrote: But I'm still on the fence about whether this is really a valid test. I think it shows that we need to be careful and do further performance evaluations and/or alleviate the pain by making things cheaper (say, a ddl counter in shared mem, allowing us to cache snapshots for the syscache). If that artificial test hadn't shown problems I'd have voted for just going ahead and not worrying further. I tried a general snapshot counter that rolls over every time any transaction commits, but that doesn't help much. It's a small improvement on general workloads, but it's not effective against this kind of hammering. A DDL counter would be a bit more expensive because we'd have to insert an additional branch into GetSnapshotData() while ProcArrayLock is held, but it might be tolerable. I actually wasn't thinking of adding it at that level. More like just not fetching a new snapshot in systable_* if a) the global ddl counter hasn't been increased and b) our transaction hasn't performed any DDL. Instead, use the one we fetched the last time for that purpose. The global ddl counter could be increased every time we commit a transaction that has invalidations queued. The local one every time we execute local cache invalidation messages (i.e. around CommandCounterIncrement). I think something roughly along those lines should be doable without adding new overhead to global paths. Except maybe some check in snapmgr.c for that new, longer-living snapshot. Do you have code for this (or some portion of it) already? Nope, sorry. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Frontend/backend protocol improvements proposal (request).
Hackers, While developing a C++ client library for Postgres I felt the lack of extra information in command tags in the CommandComplete (B) message for the following commands: PREPARE; DEALLOCATE; DECLARE; CLOSE; LISTEN; UNLISTEN; SET; RESET. Namely, for example, users of my library can prepare statements by using the protocol directly or via the PREPARE command. Since the protocol does not support prepared statement deallocation, I wrote a wrapper over the DEALLOCATE command. The library knows about all prepared statements and invalidates them automatically when the user performs the deallocate() wrapper. But users can go with the DEALLOCATE command directly, and in these cases I need to query the database to get the list of currently prepared statements whenever a CommandComplete message with the DEALLOCATE command tag is consumed. Moreover, I need to do it *synchronously*, and this breaks the asynchronous API. I propose to include the name of the object in the CommandComplete (B) message for the above commands. -- // Dmitriy.
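(For illustration, not from the thread: the extra round trip a client library is forced into today, since the CommandComplete tag for DEALLOCATE carries no statement name; the statement name is hypothetical.)

PREPARE stmt1 AS SELECT $1::int;
DEALLOCATE stmt1;                         -- server reports only the bare tag DEALLOCATE
SELECT name FROM pg_prepared_statements;  -- synchronous query needed to learn what is still prepared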
Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
On 2013-06-20 13:45:24 +0800, Craig Ringer wrote: On 06/12/2013 07:51 PM, Andres Freund wrote: On 2013-06-12 19:47:46 +0800, Craig Ringer wrote: On 06/12/2013 05:55 PM, Greg Stark wrote: On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer cr...@2ndquadrant.com wrote: The main thing I'm wondering is how/if to handle backward compatibility with the existing NUMERIC and its DECIMAL alias. If it were 100% functionally equivalent you could just hide the implementation internally. Have a bit that indicates which representation was stored and call the right function depending. That's what I was originally wondering about, but as Tom pointed out it won't work. We'd still need to handle scale and precision greater than that offered by _Decimal128 and wouldn't know in advance how much scale/precision they wanted to preserve. So we'd land up upcasting everything to NUMERIC whenever we did anything with it anyway, only to then convert it back into the appropriate fixed-size decimal type for storage. Well, you can limit the upcasting to the cases where we would exceed the precision. How do you determine that for, say, DECIMAL '4' / DECIMAL '3'? Or sqrt(DECIMAL '2')? Well, the suggestion above was not to actually implement them as separate types. If you only store the precision inside the Datum you can limit the upcasting to whatever you need. I think a good starting point would be to use the Intel and IBM libraries to implement basic DECIMAL32/64/128 to see if they perform better than the gcc builtins tested by Pavel by adapting his extension. Another good thing to investigate early on is whether there's actually a need for the feature outside of complying with standards. Pretty pointless, and made doubly so by the fact that if we're not using a nice fixed-width type and have to support VARLENA we miss out on a whole bunch of performance benefits. I rather doubt that using a 1-byte varlena - which it will be for reasonably sized Datums - will be a relevant bottleneck here. Maybe if you only have 'NOT NULL', fixed-width columns, but even then... That's good to know - if I've overestimated the cost of using VARLENA for this, that's really quite good news. From what I remember seeing in profiles, the biggest overhead is that the short varlenas (not long ones though) frequently need to be copied around so they are placed at an aligned address. I think with some care numeric.c could be made to avoid that for the most common cases, which should speed up things nicely. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dump difference between 9.3 and master after upgrade
On 06/20/2013 10:43 AM, Robert Haas wrote: On Tue, Jun 18, 2013 at 12:18 PM, Andrew Dunstan and...@dunslane.net wrote: As I was updating my cross version upgrade tester to include support for the 9.3 branch, I noted this dump difference between the dump of the original 9.3 database and the dump of the converted database, which looks odd. Is it correct? It looks sketchy to me, but I'm not sure exactly what you're comparing. Essentially, cross version upgrade testing runs pg_dumpall from the new version on the old cluster, runs pg_upgrade, and then runs pg_dumpall on the upgraded cluster, and compares the two outputs. This is what we get when the new version is HEAD and the old version is 9.3. The reason this hasn't been caught by the standard same-version upgrade tests is that this module uses a more extensive cluster, which has had not only the core regression tests run but also all the contrib and pl regression tests, and this problem seems to be exposed by the postgres_fdw tests. At first glance it looks to me like pg_dump in binary-upgrade mode is not suppressing something that it should be suppressing. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic background workers
On Thu, Jun 20, 2013 at 10:59 AM, Markus Wanner mar...@bluegap.ch wrote: On 06/20/2013 04:41 PM, Robert Haas wrote: The constant factor is also very small. Generally, I would expect num_worker_processes ~ # CPUs That assumption might hold for parallel querying, yes. In case of Postgres-R, it doesn't. In the worst case, i.e. with a 100% write load, a cluster of n nodes, each with m backends performing transactions, all of them replicated to all other (n-1) nodes, you end up with ((n-1) * m) bgworkers. Which is pretty likely to be way above the # CPUs on any single node. I can imagine other extensions or integral features like autonomous transactions that might possibly want many more bgworkers as well. Yeah, maybe. I think in general it's not going to work great to have zillions of backends floating around, because eventually the OS scheduler overhead - and the memory overhead - are going to become pain points. And I'm hopeful that autonomous transactions can be implemented without needing to start a new backend for each one, because that sounds pretty expensive. Some users of other database products will expect autonomous transactions to be cheap; aside from that, cheap is better than expensive. But we will see. At any rate I think your basic point is that people might end up creating a lot more background workers than I'm imagining, which is certainly a fair point. and scanning a 32, 64, or even 128 element array is not a terribly time-consuming operation. I'd extend that to say scanning an array with a few thousand elements is not terribly time-consuming, either. IMO the simplicity is worth it, ATM. It's all relative to your definition of ... eh ... terribly. .oO( ... premature optimization ... all evil ... ) Yeah, that thing. One thing I think we probably want to explore in the future, for both worker backends and regular backends, is pre-forking. We could avoid some of the latency associated with starting up a new backend or opening a new connection in that way. However, there are quite a few details to be thought through there, so I'm not eager to pursue that just yet. Once we have enough infrastructure to implement meaningful parallelism, we can benchmark it and find out where the bottlenecks are, and which solutions actually help most. Do you mean pre-forking and connecting to a specific database? Or really just the forking? I've considered both at various times, although in this context I was mostly thinking about just the forking. Pre-connecting to a specific database would save an unknown but possibly significant amount of additional latency. Against that, it's more complex (because we've got to track which preforked workers are associated with which databases) and there's some cost to guessing wrong (because then we're keeping workers around that we can't use, or maybe even having to turn around and kill them to make slots for the workers we actually need). I suspect we'll want to pursue the idea at some point but it's not near the top of my list. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic background workers
On 2013-06-20 11:29:27 -0400, Robert Haas wrote: Do you mean pre-forking and connecting to a specific database? Or really just the forking? I've considered both at various times, although in this context I was mostly thinking about just the forking. Pre-connecting to a specific database would save an unknown but possibly significant amount of additional latency. Against that, it's more complex (because we've got to track which preforked workers are associated with which databases) and there's some cost to guessing wrong (because then we're keeping workers around that we can't use, or maybe even having to turn around and kill them to make slots for the workers we actually need). I suspect we'll want to pursue the idea at some point but it's not near the top of my list. Just as a datapoint, if you benchmark the numbers of forks that can be performed by a single process (i.e. postmaster) the number is easily in the 10s of thousands. Now forking that much has some scalability implications inside the kernel, but still. I'd be surprised if the actual fork is more than 5-10% of the current cost of starting a new backend. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements
On 6/20/13 11:04 AM, Robert Haas wrote: I kind of don't see the point of having IF NOT EXISTS for things that have OR REPLACE, and am generally in favor of implementing OR REPLACE rather than IF NOT EXISTS where possible. I tend to agree. Btw., I also want REPLACE BUT DO NOT CREATE. That's a mouthful. What's it good for? If you run an upgrade SQL script that is supposed to replace, say, a bunch of functions with new versions, you'd want the behavior that it replaces the existing function if it exists, but errors out if it doesn't, because then you're perhaps connected to the wrong database. It's a marginal feature, and I'm not going to pursue it, but if someone wanted to make the CREATE commands fully featured, there is use for this. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
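(For illustration, not from the thread: the upgrade-script scenario Peter describes, with a hypothetical function. Today CREATE OR REPLACE silently creates the function if it is missing; the hypothetical replace-only form would instead raise an error, a hint that you are probably connected to the wrong database.)

-- upgrade script that is only supposed to swap in a new body for an existing function
CREATE OR REPLACE FUNCTION api_get_client(id int) RETURNS text
    LANGUAGE sql AS $$ SELECT 'v2 implementation'::text $$;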
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
MauMau escribió: First, thank you for the review. From: Alvaro Herrera alvhe...@2ndquadrant.com This seems reasonable. Why 10 seconds? We could wait 5 seconds, or 15. Is there a rationale behind the 10? If we said 60, that would fit perfectly well within the already existing 60-second loop in postmaster, but that seems way too long. There is no good rationale. I arbitrarily chose a short period because this is immediate shutdown. I felt more than 10 seconds was long. I think 5 seconds may be better. Although not directly related to this fix, these influenced my choice: 1. According to the man page of init, init sends SIGKILL to all remaining processes 5 seconds after it sends SIGTERM to them. 2. At computer shutdown, Windows proceeds with shutdown forcibly after waiting 20 seconds for services to terminate. Well, as for the first of these, I don't think it matters whether processes get their SIGKILL from postmaster or from init, when a shutdown or runlevel is changing. Now, one would think that this sets a precedent. I think 20 seconds might be too long; perhaps it's expected in an operating system that forcibly has a GUI. I feel safe to ignore that. I will go with 5 seconds, then. Note that the previous text said that postmaster will send SIGQUIT, then terminate without checking anything. In the new code, postmaster sends SIGQUIT, then waits, then SIGKILL, then waits again. If there's an unkillable process (say because it's stuck in a noninterruptible sleep) postmaster might never exit. I think it should send SIGQUIT, then wait, then SIGKILL, then exit without checking. At first I thought the same, but decided not to do that. The purpose of this patch is to make the immediate shutdown reliable. My point is that there is no difference. For one thing, once we enter immediate shutdown state, and sigkill has been sent, no further action is taken. Postmaster will just sit there indefinitely until processes are gone. If we were to make it repeat SIGKILL until they die, that would be different. However, repeating SIGKILL is pointless, because if they didn't die when they first received it, they will still not die when they receive it a second time. Also, if they're in uninterruptible sleep and don't die, then they will die as soon as they get out of that state; no further queries will get processed, no further memory access will be done. So there's no harm in them remaining there until underlying storage returns to life, ISTM. Here, reliable means that the database server is certainly shut down when pg_ctl returns, not telling a lie that I shut down the server processes for you, so you do not have to be worried that some postgres process might still remain and write to disk. I suppose reliable shutdown is crucial especially in an HA cluster. If pg_ctl stop -mi gets stuck forever when there is an unkillable process (in what situations does this happen? OS bug, or NFS hard mount?), I think the DBA has to notice this situation from the unfinished pg_ctl, investigate the cause, and take corrective action. So you're suggesting that keeping postmaster up is a useful sign that the shutdown is not going well? I'm not really sure about this. What do others think? I have tweaked the patch a bit and I'm about to commit as soon as we resolve the above two items. I appreciate your tweaking, especially in the documentation and source code comments, as English is not my mother tongue. IIRC the only other interesting tweak I did was rename the SignalAllChildren() function to TerminateChildren(). 
I did this because it doesn't really signal all children; syslogger and dead_end backends are kept around. So the original name was a bit misleading. And we couldn't really name it SignalAlmostAllChildren(), could we .. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench --startup option
On Sun, Jun 16, 2013 at 9:42 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Thu, May 2, 2013 at 11:25 AM, Fabien COELHO coe...@cri.ensmp.frwrote: Thanks for doing the review. I'm not sure what things to go change without further feedback/discussion, except point 4. I'll wait a day to see if I get more feedback on the other issues and submit a new patch. I've fixed a conflict, and I've removed extraneous semicolons from the C. I've left in the fixing of some existing bad indenting in the existing code, which is not strictly related to my change. I hope my defenses of the other points were persuasive. Cheers, Jeff pgbench_startup-v3.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix pgstattuple/pgstatindex to use regclass-type as the argument
On Thu, Jun 20, 2013 at 11:43 AM, Satoshi Nagayasu sn...@uptime.jp wrote: (2013/06/17 4:02), Fujii Masao wrote: On Sat, Mar 9, 2013 at 3:23 PM, Satoshi Nagayasu sn...@uptime.jp wrote: It is obviously easy to keep two types of function interfaces, one with regclass-type and another with text-type, in the next release for backward-compatibility like below: pgstattuple(regclass) -- safer interface. pgstattuple(text) -- will be deprecated in a future release. So you're thinking to remove pgstattuple(oid) soon? AFAIK, a regclass-type argument would accept an OID value, which means the regclass type is upward-compatible with the oid type. So, even if the declaration is changed, compatibility could be kept actually. This test case (in sql/pgstattuple.sql) confirms that:
select * from pgstatindex('myschema.test_pkey'::regclass::oid);
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          0 |       8192 |             1 |              0 |          1 |           0 |             0 |             0.79 |                  0
(1 row)
Having both interfaces for a while would allow users to have enough time to rewrite their applications. Then, we will be able to obsolete (or just drop) the old interfaces in a future release, maybe 9.4 or 9.5. I think this approach would minimize the impact of such an interface change. So, I think we can clean up the function arguments in the pgstattuple module, and also we can have two interfaces, both regclass and text, for the next release. Any comments? In the document, you should mark old functions as deprecated. I'm still considering changing the function name as Tom pointed out. How about pgstatbtindex? Or just adding pgstatindex(regclass)? In fact, pgstatindex supports only BTree indexes. So, pgstatbtindex seems to be more appropriate for this function. Can most ordinary users realize that bt means btree? We can keep having both (old) pgstatindex and (new) pgstatbtindex during the next 2-3 major releases, and the old one will be deprecated after that. Since pg_relpages(oid) doesn't exist, pg_relpages() is in the same situation as pgstatindex(), i.e., we cannot just replace pg_relpages(text) with pg_relpages(regclass) for backward-compatibility. How do you think we should solve the pg_relpages() problem? Rename? Just add pg_relpages(regclass)? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
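(For illustration, not from the thread: the two call forms that would coexist for a while, using a hypothetical table name; the regclass variant only exists with the patch applied.)

SELECT * FROM pgstattuple('myschema.test'::regclass);  -- proposed, safer interface
SELECT * FROM pgstattuple('myschema.test');             -- existing text interface, to be deprecated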
Re: [HACKERS] pgbench --startup option
I've fixed a conflict, and I've removed extraneous semicolons from the C. I've left in the fixing of some existing bad indenting in the existing code, which is not strictly related to my change. There are still unrelated changes: spacing on the -c and -t options' help. The pgindent command is passed on the sources from time to time, so there should be no reason to change this in this commit. The updated string for PQerrorMessage does not bring much, and the message does not seem an improvement. Command failed with ERROR, indeed. Command failed with ERROR: syntax error at or near ";" LINE 1: set synchronous_commit=on;set synchronous_; The preceding result seems both simpler and fine: ERROR: syntax error at or near ";" LINE 1: set synchronous_commit=on;set synchronous_; Otherwise I've tested the patch with one set, two sets and a syntax error, and it worked as expected. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)
On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote: But, couldn't that be solved by deprecating that function and providing a more sensible alternatively named version? And what would you name that function? array_dims2? I can't think of a name that makes the difference in behaviour apparent. Can you imagine the documentation for that? I don't know the answer to that, but I think it's hard to argue that deprecating and documenting a few functions is a heavier burden on your users than having to sift through older arcane code before upgrading to the latest version of the database. We're not the only ones stuck with lousy old functions (C finally ditched gets() in the 2011 standard). I also happen to think the current array_api function names are not particularly great (especially array_upper/array_lower) so I won't shed too many tears. Sorry to be late on this, but are you saying people have code that is testing: select array_dims('{}'::int[]) for a NULL return, and they would need to change that to test for zero? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
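(For illustration, not from the thread: the behaviour being debated. Both calls return NULL for an empty array today; code testing for that NULL is what a behaviour change would affect.)

SELECT array_dims('{}'::int[]);       -- NULL today
SELECT array_length('{}'::int[], 1);  -- NULL today as well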
Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]
On 17 June 2013 06:36, David Fetter da...@fetter.org wrote: Please find attached two versions of a patch which provides optional FILTER clause for aggregates (T612, Advanced OLAP operations). The first is intended to be applied on top of the previous patch, the second without it. The first is, I believe, clearer in what it's doing. Rather than simply mechanically visiting every place a function call might be constructed, it visits a central one to change the default, then goes only to the places where it's relevant. The patches are both early WIP as they contain no docs or regression tests yet. Docs and regression tests added, makeFuncArgs approached dropped for now, will re-visit later. Regression tests added to reflect bug fixes in COLLATE. Rebased vs. master. Hi, I've been looking at this patch, which adds support for the SQL standard feature of applying a filter to rows used in an aggregate. The syntax matches the spec: agg_fn ( args [ ORDER BY sort_clause ] ) [ FILTER ( WHERE qual ) ] and this patch makes FILTER a new reserved keyword, usable as a function or type, but not in other contexts, e.g., as a table name or alias. I'm not sure if that might be a problem for some people, but I can't see any way round it, since otherwise there would be no way for the parser to distinguish a filter clause from an alias expression. The feature appears to work per-spec, and the code and doc changes look reasonable. However, it looks a little light on regression tests, and so I think some necessary changes have been overlooked. In my testing of sub-queries in the FILTER clause (an extension to the spec), I was able to produce the following error: CREATE TABLE t1(a1 int); CREATE TABLE t2(a2 int); INSERT INTO t1 SELECT * FROM generate_series(1,10); INSERT INTO t2 SELECT * FROM generate_series(3,6); SELECT array_agg(a1) FILTER (WHERE a1 IN (SELECT a2 FROM t2)) FROM t1; ERROR: plan should not reference subplan's variable which looks to be related to subselect.c's handling of sub-queries in aggregates. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
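(For illustration, not from the thread: the FILTER syntax under review, reusing the t1 table from the test case above; this only runs with the patch applied, since FILTER is not in core yet.)

SELECT count(*) FILTER (WHERE a1 < 5) AS low_count,
       array_agg(a1) FILTER (WHERE a1 % 2 = 0) AS evens
FROM t1;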
Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)
On Thu, Jun 20, 2013 at 2:58 PM, Bruce Momjian br...@momjian.us wrote: On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote: But, couldn't that be solved by deprecating that function and providing a more sensible alternatively named version? And what would you name that function? array_dims2? I can't think of a name that makes the difference in behaviour apparent. Can you imagine the documentation for that? I don't know the answer to that, but I think it's hard to argue that deprecating and documenting a few functions is a heavier burden on your users than having to sift through older arcane code before upgrading to the latest version of the database. We're not the only ones stuck with lousy old functions (C finally ditched gets() in the 2011 standard). I also happen to think the current array_api function names are not particularly great (especially array_upper/array_lower) so I won't shed too many tears. Sorry to be late on this, but are you saying people have code that is testing: select array_dims('{}'::int[]) for a NULL return, and they would need to change that to test for zero? Kinda -- what I'm saying is you just don't go around changing function behaviors to make them 'better' unless the affected behavior was specifically reserved as undefined. The fact is nobody knows how many users will be affected and the extent of the ultimate damage (pro tip: it's always more and worse than expected); I'm astonished it's even being considered. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
From: Alvaro Herrera alvhe...@2ndquadrant.com I will go with 5 seconds, then. OK, I agree. My point is that there is no difference. For one thing, once we enter immediate shutdown state, and sigkill has been sent, no further action is taken. Postmaster will just sit there indefinitely until processes are gone. If we were to make it repeat SIGKILL until they die, that would be different. However, repeating SIGKILL is pointless, because if they didn't die when they first received it, they will still not die when they receive it a second time. Also, if they're in uninterruptible sleep and don't die, then they will die as soon as they get out of that state; no further queries will get processed, no further memory access will be done. So there's no harm in them remaining there until underlying storage returns to life, ISTM. Here, reliable means that the database server is certainly shut down when pg_ctl returns, not telling a lie that I shut down the server processes for you, so you do not have to be worried that some postgres process might still remain and write to disk. I suppose reliable shutdown is crucial especially in an HA cluster. If pg_ctl stop -mi gets stuck forever when there is an unkillable process (in what situations does this happen? OS bug, or NFS hard mount?), I think the DBA has to notice this situation from the unfinished pg_ctl, investigate the cause, and take corrective action. So you're suggesting that keeping postmaster up is a useful sign that the shutdown is not going well? I'm not really sure about this. What do others think? I think you are right, and there is no harm in leaving postgres processes in an unkillable state. I'd like to leave the decision to you and/or others. One concern is that umount would fail in such a situation because postgres has some open files on the filesystem, which is on the shared disk in case of a traditional HA cluster. However, STONITH should resolve the problem by terminating the stuck node... I just feel it is strange for umount to fail due to remaining postgres processes, because pg_ctl stop -mi reported success. IIRC the only other interesting tweak I did was rename the SignalAllChildren() function to TerminateChildren(). I did this because it doesn't really signal all children; syslogger and dead_end backends are kept around. So the original name was a bit misleading. And we couldn't really name it SignalAlmostAllChildren(), could we .. I see. thank you. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] changeset generation v5-01 - Patches git tree
Andres Freund and...@2ndquadrant.com wrote: 0007: Adjust Satisfies* interface: required, mechanical, Version v5-01 attached I'm still working on a review and hope to post something more substantive by this weekend, but when applying patches in numeric order, this one did not compile cleanly. gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o allpaths.o allpaths.c -MMD -MP -MF .deps/allpaths.Po vacuumlazy.c: In function ‘heap_page_is_all_visible’: vacuumlazy.c:1725:3: warning: passing argument 1 of ‘HeapTupleSatisfiesVacuum’ from incompatible pointer type [enabled by default] In file included from vacuumlazy.c:61:0: ../../../src/include/utils/tqual.h:84:20: note: expected ‘HeapTuple’ but argument is of type ‘HeapTupleHeader’ Could you post a new version of that? -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] isolationtester and 'specs' subdirectory
Hi all, I have a Debian machine with gcc 4.7.2-5 where make check-world fails in the isolation check, like so: ... make[2]: Leaving directory `/home/josh/src/postgresql/src/test/regress' make -C isolation check [snip] gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I. -I../../../src/interfaces/libpq -I./../regress -I../../../src/include -D_GNU_SOURCE -c -o isolation_main.o isolation_main.c gcc: error: ./specs: Is a directory make[2]: *** [isolation_main.o] Error 1 ... I eventually tracked down the cause of this failure to a trailing ':' in my $LIBRARY_PATH, which causes gcc to look inside the current directory for a 'specs' file [1] among other things. Although I probably don't need that trailing ':', it seems like we should avoid naming this directory 'specs' nonetheless to avoid confusion with gcc. Renaming the 'specs' directory to something like 'isolation_specs' and adjusting isolation_main.c accordingly lets me pass `make check-world`. Proposed patch attached. Josh [1] http://gcc.gnu.org/ml/gcc-help/2010-05/msg00292.html rename_specs.diff.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote: MauMau escribió: Here, reliable means that the database server is certainly shut down when pg_ctl returns, not telling a lie that I shut down the server processes for you, so you do not have to be worried that some postgres process might still remain and write to disk. I suppose reliable shutdown is crucial especially in an HA cluster. If pg_ctl stop -mi gets stuck forever when there is an unkillable process (in what situations does this happen? OS bug, or NFS hard mount?), I think the DBA has to notice this situation from the unfinished pg_ctl, investigate the cause, and take corrective action. So you're suggesting that keeping postmaster up is a useful sign that the shutdown is not going well? I'm not really sure about this. What do others think? It would be valuable for pg_ctl -w -m immediate stop to have the property that a subsequent start attempt will not fail due to the presence of some backend still attached to shared memory. (Maybe that's true anyway or can be achieved a better way; I have not investigated.) -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)
On Thu, Jun 20, 2013 at 03:33:24PM -0500, Merlin Moncure wrote: On Thu, Jun 20, 2013 at 2:58 PM, Bruce Momjian br...@momjian.us wrote: On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote: But, couldn't that be solved by deprecating that function and providing a more sensible alternatively named version? And what would you name that function? array_dims2? I can't think of a name that makes the difference in behaviour apparent. Can you imagine the documentation for that? I don't know the answer to that, but I think it's hard to argue that deprecating and documenting a few functions is a heavier burden on your users than having to sift through older arcane code before upgrading to the latest version of the database. We're not the only ones stuck with lousy old functions (C finally ditched gets() in the 2011 standard). I also happen to think the current array_api function names are not particularly great (especially array_upper/array_lower) so I won't shed too many tears. Sorry to be late on this, but are you saying people have code that is testing: select array_dims('{}'::int[]) for a NULL return, and they would need to change that to test for zero? Kinda -- what I'm saying is you just don't go around changing function behaviors to make them 'better' unless the affected behavior was specifically reserved as undefined. The fact is nobody knows how many users will be affected and the extent of the ultimate damage (pro tip: it's always more and worse than expected); I'm astonished it's even being considered. Well, I think the question is how many people have such arrays that will be affected. If we don't do something, we live with this odd behavior forever. We have been willing to make some bold decisions in the past to improve user experience, and it mostly has worked out well. I disagree that it is always worse than expected. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)
On Thu, Jun 20, 2013 at 6:40 PM, Bruce Momjian br...@momjian.us wrote: Kinda -- what I'm saying is you just don't go around changing function behaviors to make them 'better' unless the affected behavior was specifically reserved as undefined. The fact is nobody knows how many users will be affected and the extent of the ultimate damage (pro tip: it's always more and worse than expected); I'm astonished it's even being considered. Well, I think the question is how many people have such arrays that will be effected. If we don't do something, we live with this odd behavior forever. We have been willing to make some bold decisions in the past to improve user experience, and it mostly has worked out well. I disagree that it is always worse than expected. Well, you can have the last word (although 'bold' was an interesting word choice, heh) -- I feel guilty enough about beating up Brendan already. I feel this way every time compatibility changes come up, so it's nothing specific to this patch really. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)
On Thu, Jun 20, 2013 at 07:13:48PM -0500, Merlin Moncure wrote: On Thu, Jun 20, 2013 at 6:40 PM, Bruce Momjian br...@momjian.us wrote: Kinda -- what I'm saying is you just don't go around changing function behaviors to make them 'better' unless the affected behavior was specifically reserved as undefined. The fact is nobody knows how many users will be affected and the extent of the ultimate damage (pro tip: it's always more and worse than expected); I'm astonished it's even being considered. Well, I think the question is how many people have such arrays that will be effected. If we don't do something, we live with this odd behavior forever. We have been willing to make some bold decisions in the past to improve user experience, and it mostly has worked out well. I disagree that it is always worse than expected. Well, you can have the last word (although 'bold' was an interesting word choice, heh) -- I feel guilty enough about beating up Brendan already. I feel this way every time compatibility changes come up, so it's nothing specific to this patch really. Well, sometimes we underestimate the impact of changes, sometimes we overestimate. The big problem is weighing the short-term problems of change but not the long-term benefit of a change. This array problem goes back to at least 2008: http://www.postgresql.org/message-id/28026.1224611...@sss.pgh.pa.us so we have at least five years of confusion by not changing it then. I am not saying we need to change it, but do think we need to weigh both issues. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)
Bruce, Well, sometimes we underestimate the impact of changes, sometimes we overestimate. The big problem is weighing the short-term problems of change but not the long-term benefit of a change. This array problem goes back to at least 2008: http://www.postgresql.org/message-id/28026.1224611...@sss.pgh.pa.us so we have at least five years of confusion by not changing it then. I am not saying we need to change it, but do think we need to weigh both issues. As much as I hate the current behavior (my first response was yeah, fix those babies!), I think we don't have a choice about creating new function names and then waiting three years to deprecate the old ones. We really can't afford to put obstacles in the way of people upgrading, especially over an issue as minor as this one. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Why can't I use windowing functions over ordered aggregates?
Hackers,

So, I can create a custom aggregate called first and do this:

    SELECT first(val order by ts desc) ...

And I can do this:

    SELECT first_value(val) OVER (order by ts desc) ...

but I can't do this:

    SELECT first_value(val order by ts desc) ...

even though under the hood, it's the exact same operation.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
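For reference, a possible definition of such a first aggregate (one common formulation; the exact definition Josh uses is not shown in the message), together with the three call forms, assuming a hypothetical table tbl(val, ts):

    CREATE OR REPLACE FUNCTION first_agg(anyelement, anyelement)
    RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS
    $$ SELECT $1 $$;

    CREATE AGGREGATE first (anyelement) (
        sfunc = first_agg,
        stype = anyelement
    );

    -- works: aggregate with an ORDER BY inside the call (9.0+)
    SELECT first(val ORDER BY ts DESC) FROM tbl;

    -- works: true window function with an OVER clause
    SELECT first_value(val) OVER (ORDER BY ts DESC) FROM tbl;

    -- rejected: window functions do not accept the aggregate-style ORDER BY
    SELECT first_value(val ORDER BY ts DESC) FROM tbl;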
Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)
On Thu, Jun 20, 2013 at 06:28:07PM -0700, Josh Berkus wrote: Bruce, Well, sometimes we underestimate the impact of changes, sometimes we overestimate. The big problem is weighing the short-term problems of change but not the long-term benefit of a change. This array problem goes back to at least 2008: http://www.postgresql.org/message-id/28026.1224611...@sss.pgh.pa.us so we have at least five years of confusion by not changing it then. I am not saying we need to change it, but do think we need to weigh both issues. As much as I hate the current behavior (my first response was yeah, fix those babies!), I think we don't have a choice about creating new function names and then waiting three years to deprecate the old ones. We really can't afford to put obstacles in the way of people upgrading, especially over an issue as minor as this one. Perhaps we need to mark the TODO item as will not fix. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] trgm regex index peculiarity
9.4devel (but same in 9.3)

In a 112 MB test table (containing random generated text) with a trgm index (gin_trgm_ops), I consistently get these timings:

    select txt from azjunk6 where txt ~ '^abcd';                               130 ms
    select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd';     3 ms

(a similar performance difference occurs when using a regex, i.e. 'abc[de]')

This difference is so large that I wonder if there is not something wrong in the first case. (The returned results are correct though)

Here are the two explains:

explain analyze select txt from azjunk6 where txt ~ '^abcd';
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on azjunk6  (cost=108.78..484.93 rows=100 width=81) (actual time=129.557..129.742 rows=1 loops=1)
   Recheck Cond: (txt ~ '^abcd'::text)
   Rows Removed by Index Recheck: 17
   ->  Bitmap Index Scan on azjunk6_trgm_re_idx  (cost=0.00..108.75 rows=100 width=0) (actual time=129.503..129.503 rows=18 loops=1)
         Index Cond: (txt ~ '^abcd'::text)
 Total runtime: 130.008 ms
(6 rows)

explain analyze select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd';
                                                            QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on azjunk6  (cost=56.75..433.40 rows=1 width=81) (actual time=2.064..3.379 rows=1 loops=1)
   Recheck Cond: (txt ~ 'abcd'::text)
   Rows Removed by Index Recheck: 14
   Filter: (substr(txt, 1, 4) = 'abcd'::text)
   Rows Removed by Filter: 112
   ->  Bitmap Index Scan on azjunk6_trgm_re_idx  (cost=0.00..56.75 rows=100 width=0) (actual time=1.911..1.911 rows=127 loops=1)
         Index Cond: (txt ~ 'abcd'::text)
 Total runtime: 3.409 ms
(8 rows)

The results in both cases are correct, but does this difference not almost amount to a bug?

(Interestingly, the variant WHERE txt ~ 'abcd$' is as fast as the non-anchored variant)

Thanks,

Erik Rijkers

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)
On Sun, 2013-06-16 at 23:53 -0500, Jon Nelson wrote: Please find attached v5 of the patch, with the above correction in place. The only change from the v4 patch is wrapping the if (wal_use_fallocate) block in an #ifdef USE_POSIX_FALLOCATE. Thanks! Thank you. Greg, are you still working on those tests? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Support for RANGE ... PRECEDING windows in OVER
Hi all Since 8.4, PostgreSQL has had extremely useful window function support - but support for RANGE PRECEDING / FOLLOWING windows was dropped late in 8.4's development in order to get the rest of the feature in, per http://archives.postgresql.org/pgsql-hackers/2010-02/msg00540.php. It looks like there was discussion of requiring a new opclass to be declared for types or otherwise extending opclasses to provide the information required for RANGE ... PRECEDING / FOLLOWING (http://www.postgresql.org/message-id/20100211201444.ga28...@svana.org). I can't find any sign that it went anywhere beyond some broad discussion: http://www.postgresql.org/message-id/13993.1265920...@sss.pgh.pa.us at the time. I've missed this feature more than once, and am curious about whether any more recent changes may have made it cleaner to tackle this, or whether consensus can be formed on adding the new entries to btree's opclass to avoid the undesirable explicit lookups of the '+' and '-' operators. Some questions seem to remain open about how ranges over timestamps/intervals should work, but this wasn't elaborated on. There's been interest in this, eg: http://pgsql.hackers.free-usenet.eu/[HACKERS]-range-intervals-in-window-function-frames_T66085695_S1 http://grokbase.com/t/postgresql/pgsql-general/105a89gm2n/postgresql-9-0-support-for-range-value-preceding-window-functions -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
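To make the missing piece concrete, here is roughly what works today versus what this thread is asking for (the table measurements(ts, val) is invented for the example):

    -- Supported since 8.4: a frame defined by a number of rows
    SELECT ts, val,
           sum(val) OVER (ORDER BY ts
                          ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS rolling_sum
    FROM measurements;

    -- The missing case: a frame defined by a value offset on the ORDER BY
    -- column (SQL-standard syntax; rejected by PostgreSQL as of this thread)
    SELECT ts, val,
           sum(val) OVER (ORDER BY ts
                          RANGE BETWEEN interval '1 hour' PRECEDING AND CURRENT ROW) AS hourly_sum
    FROM measurements;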
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
I am currently looking into this feature. However, as I am quite new to Postgres, I think it might take me a while to get up to speed. Anyways, I would also appreciate another round of discussion on the future of the windowing functions. Ian Link
[HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Noah Misch escribió: On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote: MauMau escribió: Here, reliable means that the database server is certainly shut down when pg_ctl returns, not telling a lie that I shut down the server processes for you, so you do not have to be worried that some postgres process might still remain and write to disk. I suppose reliable shutdown is crucial especially in HA cluster. If pg_ctl stop -mi gets stuck forever when there is an unkillable process (in what situations does this happen? OS bug, or NFS hard mount?), I think the DBA has to notice this situation from the unfinished pg_ctl, investigate the cause, and take corrective action. So you're suggesting that keeping postmaster up is a useful sign that the shutdown is not going well? I'm not really sure about this. What do others think? It would be valuable for pg_ctl -w -m immediate stop to have the property that a subsequent start attempt will not fail due to the presence of some backend still attached to shared memory. (Maybe that's true anyway or can be achieved a better way; I have not investigated.) Well, the only case where a process that's been SIGKILLed does not go away, as far as I know, is when it is in some uninterruptible sleep due to in-kernel operations that get stuck. Personally I have never seen this happen in any other case than some network filesystem getting disconnected, or a disk that doesn't respond. And whenever the filesystem starts to respond again, the process gets out of its sleep only to die due to the signal. So a subsequent start attempt will either find that the filesystem is not responding, in which case it'll probably fail to work properly anyway (presumably the filesystem corresponds to part of the data directory), or that it has revived in which case the old backends have already gone away. If we leave postmaster running after SIGKILLing its children, the only thing we can do is have it continue to SIGKILL processes continuously every few seconds until they die (or just sit around doing nothing until they all die). I don't think this will have a different effect than postmaster going away trusting the first SIGKILL to do its job eventually. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
On 06/21/2013 10:31 AM, Ian Link wrote: I am currently looking into this feature. However, as I am quite new to Postgres, I think it might take me a while to get up to speed. Anyways, I would also appreciate another round of discussion on the future of the windowing functions. Good to know, and welcome. I hope the links to the archived discussions on the matter were useful to you. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
MauMau escribió: From: Alvaro Herrera alvhe...@2ndquadrant.com One concern is that umount would fail in such a situation because postgres has some open files on the filesystem, which is on the shared disk in case of traditional HA cluster. See my reply to Noah. If postmaster stays around, would this be any different? I don't think so. IIRC the only other interesting tweak I did was rename the SignalAllChildren() function to TerminateChildren(). I did this because it doesn't really signal all children; syslogger and dead_end backends are kept around. So the original name was a bit misleading. And we couldn't really name it SignalAlmostAllChildren(), could we .. I see. thank you. Actually, in further testing I noticed that the fast-path you introduced in BackendCleanup (or was it HandleChildCrash?) in the immediate shutdown case caused postmaster to fail to clean up properly after sending the SIGKILL signal, so I had to remove that as well. Was there any reason for that? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
Thanks! The discussions have been useful, although I am currently just reviewing the code. I think a good starting point will be to refactor/improve the WinGetFuncArgInPartition and WinGetFuncArgInFrame functions. Tom Lane wrote this about them before committing the patch: I'm not terribly happy with the changes you made in WinGetFuncArgInPartition and WinGetFuncArgInFrame to force the window function mark to not go past frame start in some modes. Not only is that pretty ugly, but I think it can mask bugs in window functions: it's an error for a window function to fetch a row before what it has set its mark to be, but in some cases that wouldn't be detected because of this change. I think it would be better to revert those changes and find another method of protecting fetches needed to determine the frame head. One idea is to create a separate read pointer that tracks the frame head whenever actual fetches of the frame head might be needed by update_frameheadpos. I committed it without changing that, but I think this should be revisited before trying to add the RANGE value PRECEDING/FOLLOWING options, because those will substantially expand the number of cases where that hack affects the behavior. I am honestly not 100% certain why these functions have issues, but this seems a good place to start investigating. Ian Link
Re: [HACKERS] single-user vs standalone in docs and messages
On Thu, 2013-06-13 at 15:10 -0700, Jeff Janes wrote: Some places in the docs and elog hints refer to standalone backends, while the official name as used in app-postgres.html is single-user mode, and in fact standalone does not appear on that page. This tries to standardize the other locations to use single-user. I think I did the right thing with the message translation files, but I can't figure out how to test that. I made no attempt to change code-comments, just the user-facing parts. committed with some adjustments -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] trgm regex index peculiarity
Erik Rijkers e...@xs4all.nl writes: In a 112 MB test table (containing random generated text) with a trgm index (gin_trgm_ops), I consistently get these timings: select txt from azjunk6 where txt ~ '^abcd'; 130 ms select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd'; 3 ms Hm, could you provide a self-contained test case? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
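Something along these lines might serve as a starting point for such a test case (a sketch only; the row count, row width, and data-generation method are guesses at Erik's setup, not taken from the thread):

    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    -- ~200k rows of 80 random lowercase characters each
    CREATE TABLE azjunk6 AS
    SELECT string_agg(chr(97 + (random() * 25)::int), '') AS txt
    FROM generate_series(1, 200000) AS row_id,
         generate_series(1, 80) AS char_pos
    GROUP BY row_id;

    CREATE INDEX azjunk6_trgm_re_idx ON azjunk6 USING gin (txt gin_trgm_ops);
    ANALYZE azjunk6;

    EXPLAIN ANALYZE SELECT txt FROM azjunk6 WHERE txt ~ '^abcd';
    EXPLAIN ANALYZE SELECT txt FROM azjunk6 WHERE txt ~ 'abcd' AND substr(txt, 1, 4) = 'abcd';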
Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]
On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: On 17 June 2013 06:36, David Fetter da...@fetter.org wrote: Please find attached two versions of a patch which provides an optional FILTER clause for aggregates (T612, Advanced OLAP operations). The first is intended to be applied on top of the previous patch, the second without it. The first is, I believe, clearer in what it's doing. Rather than simply mechanically visiting every place a function call might be constructed, it visits a central one to change the default, then goes only to the places where it's relevant. The patches are both early WIP as they contain no docs or regression tests yet. Docs and regression tests added, makeFuncArgs approach dropped for now, will re-visit later. Regression tests added to reflect bug fixes in COLLATE. Rebased vs. master. Hi, I've been looking at this patch, which adds support for the SQL standard feature of applying a filter to rows used in an aggregate. The syntax matches the spec: agg_fn ( args [ ORDER BY sort_clause ] ) [ FILTER ( WHERE qual ) ] and this patch makes FILTER a new reserved keyword, usable as a function or type, but not in other contexts, e.g., as a table name or alias. I'm not sure if that might be a problem for some people, but I can't see any way round it, since otherwise there would be no way for the parser to distinguish a filter clause from an alias expression. The feature appears to work per-spec, and the code and doc changes look reasonable. However, it looks a little light on regression tests (What tests do you think should be there that aren't?), and so I think some necessary changes have been overlooked. In my testing of sub-queries in the FILTER clause (an extension to the spec), I was able to produce the following error: Per the spec, B) A filter clause shall not contain a query expression, a window function, or an outer reference. I'd be happy to see about adding one despite this, though. That they didn't figure out how doesn't mean we shouldn't eventually, ideally in time for 9.4. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
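For illustration, the syntax under review looks like this in use (the payments table and its columns are invented for the example):

    SELECT customer_id,
           count(*)                                    AS all_payments,
           count(*)    FILTER (WHERE amount >= 100)    AS large_payments,
           sum(amount) FILTER (WHERE status = 'late')  AS late_total
    FROM payments
    GROUP BY customer_id;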
[HACKERS] Pglife and back-branch commits
Alvaro asked me to add the display of post minor-release commits to PgLife (http://pglife.momjian.us/). I have added a link to that information as a plus-sign after each minor release number, e.g. Stable: 9.2.4+, 9.1.9+, 9.0.13+, 8.4.17+. Hopefully this will be helpful in us scheduling minor releases. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]
David Fetter escribió: On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: In my testing of sub-queries in the FILTER clause (an extension to the spec), I was able to produce the following error: Per the spec, B) A filter clause shall not contain a query expression, a window function, or an outer reference. If this is not allowed, I think there should be a clearer error message. What Dean showed is an internal (not user-facing) error message. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]
On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote: David Fetter escribió: On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: In my testing of sub-queries in the FILTER clause (an extension to the spec), I was able to produce the following error: Per the spec, B) A filter clause shall not contain a query expression, a window function, or an outer reference. If this is not allowed, I think there should be a clearer error message. What Dean showed is an internal (not user-facing) error message. Folding to lower isn't allowed by the spec either, and then there's the matter of arrays... Before going to lots of trouble to figure out how to throw an error that says, only the spec and no further, I'd like to investigate how to make this work. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
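For what it's worth, the kind of query at stake here is something like the following (tables invented for the example); the spec forbids the sub-query inside FILTER, and the open question is whether PostgreSQL should accept it anyway:

    SELECT count(*) FILTER (WHERE o.customer_id IN (SELECT id FROM vip_customers)) AS vip_orders,
           count(*) AS all_orders
    FROM orders AS o;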
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Thu, 2013-06-20 at 10:03 -0400, Robert Haas wrote: I think the question is whether this feature is really worth adding new reserved keywords for. I have a hard time saying we shouldn't support something that's part of the SQL standard, but personally, it's not something I've seen come up prior to this thread. What's the next step here? The feature sounds useful to me. If the grammar is unacceptable, does someone have an alternative idea, like using new function names instead of grammar? If so, what are reasonable names to use? Also, I think someone mentioned this already, but what about first_value() and last_value()? Shouldn't we do those at the same time? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
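For reference, the grammar in question is the SQL-standard null treatment clause, which would look roughly like this (readings is a hypothetical table; none of this is accepted by PostgreSQL as of this thread):

    SELECT ts,
           lag(val)         IGNORE NULLS OVER (ORDER BY ts) AS prev_non_null,
           lead(val)        IGNORE NULLS OVER (ORDER BY ts) AS next_non_null,
           first_value(val) IGNORE NULLS OVER (ORDER BY ts) AS first_non_null
    FROM readings;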
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
Actually, I think it would be cleaner to have a new state in pmState, namely PM_IMMED_SHUTDOWN which is entered when we send SIGQUIT. When we're in this state, postmaster is only waiting for the timeout to expire; and when it does, it sends SIGKILL and exits. Pretty much the same you have, except that instead of checking AbortStartTime we check the pmState variable. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Config reload/restart preview
On Thu, Jun 20, 2013 at 9:33 AM, Magnus Hagander mag...@hagander.net wrote: On Thu, Jun 20, 2013 at 2:54 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: Should we have a way of previewing changes that would be applied if we reloaded/restarted the server? Yes, we should. +1 This would go well with something I started working on some time ago (but haven't actually gotten far on at all), which is the ability for pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload should also be able to tell you which parameters were changed, without having to go to the log. Obviously that's almost exactly the same feature. It could probably connect to the server and issue the SQL command to reload, and that one could probably get enhanced to return modified variable as NOTICE, or be run with the right client_min_message: SELECT pg_reload_conf(); The pg_ctl client would then have to know to display the messages sent back by the server. The problem with that is that now you must *always* have your system set up to allow the postgres user to log in in pg_hba.conf or it fails. But yes, one option would be to use SQL instead of opening a socket. Maybe that's a better idea - have pg_ctl try to use that if available, and if not send a regular signal and not try to collect the output.

I started working on it yesterday after Thom proposed this idea internally at EDB. The discussion until now doesn't seem to be hostile to a SQL interface, so attached is a hack attempt, which hopefully will serve at least as a POC. A sample session is shown below, while changing a few values in postgresql.conf files.

The central idea is to use the SIGHUP processing function to do the work for us and report potential changes via DEBUG2, instead of having to write a new parsing engine. The (GUC-nesting + PGC_S_TEST) is nice to have since it avoids the current session from adopting the values that are different in conf file. This approach is susceptible to the fact that the connected superuser may have its GUC values picked up from user/database/session level settings (ALTER USER/DATABASE .. SET ; or SET param TO val;).

$ pgsql
Expanded display is used automatically.
psql (9.4devel)
Type "help" for help.

postgres=# show work_mem;
 work_mem
----------
 1MB
(1 row)

postgres=# set client_min_messages = debug2;
SET
postgres=# select pg_test_reload_conf();
DEBUG: parameter work_mem changed to 70MB
 pg_test_reload_conf
---------------------
 t
(1 row)

postgres=# show work_mem;
 work_mem
----------
 1MB
(1 row)

postgres=# select pg_test_reload_conf();
DEBUG: parameter shared_buffers cannot be changed without restarting the server
DEBUG: configuration file /home/gurjeet/dev/pgdbuilds/report_guc_chanege_pre_reload/db/data/postgresql.conf contains errors; unaffected changes were applied
 pg_test_reload_conf
---------------------
 t
(1 row)

postgres=# select pg_test_reload_conf();
DEBUG: parameter log_min_messages removed from configuration file, reset to default
 pg_test_reload_conf
---------------------
 t
(1 row)

Best regards,

-- Gurjeet Singh http://gurjeet.singh.im/ EnterpriseDB Inc.

report_conf_changes_without_applying.patch
Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]
On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote: David Fetter escribió: On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: In my testing of sub-queries in the FILTER clause (an extension to the spec), I was able to produce the following error: Per the spec, B) A filter clause shall not contain a query expression, a window function, or an outer reference. If this is not allowed, I think there should be a clearer error message. What Dean showed is an internal (not user-facing) error message. Please find attached a patch which allows subqueries in the FILTER clause and adds regression testing for same. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index 68309ba..b289a3a 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -594,10 +594,13 @@ GROUP BY replaceable class=parameterexpression/replaceable [, ...] /para para -Aggregate functions, if any are used, are computed across all rows +In the absence of a literalFILTER/literal clause, +aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group (whereas without literalGROUP BY/literal, an aggregate produces a single value computed across all the selected rows). +When a literalFILTER/literal clause is present, only those +rows matching the FILTER clause are included. When literalGROUP BY/literal is present, it is not valid for the commandSELECT/command list expressions to refer to ungrouped columns except within aggregate functions or if the diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index b139212..c4d5f33 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -1562,24 +1562,26 @@ sqrt(2) syntax of an aggregate expression is one of the following: synopsis -replaceableaggregate_name/replaceable (replaceableexpression/replaceable [ , ... ] [ replaceableorder_by_clause/replaceable ] ) -replaceableaggregate_name/replaceable (ALL replaceableexpression/replaceable [ , ... ] [ replaceableorder_by_clause/replaceable ] ) -replaceableaggregate_name/replaceable (DISTINCT replaceableexpression/replaceable [ , ... ] [ replaceableorder_by_clause/replaceable ] ) -replaceableaggregate_name/replaceable ( * ) +replaceableaggregate_name/replaceable (replaceableexpression/replaceable [ , ... ] [ replaceableorder_by_clause/replaceable ] ) [ FILTER ( WHERE replaceablefilter_clause/replaceable ) ] +replaceableaggregate_name/replaceable (ALL replaceableexpression/replaceable [ , ... ] [ replaceableorder_by_clause/replaceable ] ) [ FILTER ( WHERE replaceablefilter_clause/replaceable ) ] +replaceableaggregate_name/replaceable (DISTINCT replaceableexpression/replaceable [ , ... 
] [ replaceableorder_by_clause/replaceable ] ) [ FILTER ( WHERE replaceablefilter_clause/replaceable ) ] +replaceableaggregate_name/replaceable ( * ) [ FILTER ( WHERE replaceablefilter_clause/replaceable ) ] /synopsis where replaceableaggregate_name/replaceable is a previously defined aggregate (possibly qualified with a schema name), -replaceableexpression/replaceable is -any value expression that does not itself contain an aggregate -expression or a window function call, and -replaceableorder_by_clause/replaceable is a optional -literalORDER BY/ clause as described below. +replaceableexpression/replaceable is any value expression that +does not itself contain an aggregate expression or a window +function call, replaceableorder_by_clause/replaceable is a +optional literalORDER BY/ clause as described below. The +replaceableaggregate_name/replaceable can also be suffixed +with literalFILTER/literal as described below. /para para -The first form of aggregate expression invokes the aggregate -once for each input row. +The first form of aggregate expression invokes the aggregate once +for each input row, or when a FILTER clause is present, each row +matching same. The second form is the same as the first, since literalALL/literal is the default. The third form invokes the aggregate once for each distinct value @@ -1607,6 +1609,21 @@ sqrt(2) /para para +Adding a FILTER clause to an aggregate specifies which values of +the expression being aggregated to evaluate. For example: +programlisting +SELECT +count(*) AS unfiltered, +count(*) FILTER (WHERE i 5) AS filtered +FROM generate_series(1,10) AS s(i); + unfiltered | filtered ++-- + 10