Re: [HACKERS] Vacuum rate limit in KBps
On 19/01/12 17:39, Greg Smith wrote:
> On 1/19/12 1:10 PM, Robert Haas wrote:
>> I have to say that I find that intensely counterintuitive. The current settings are not entirely easy to tune correctly, but at least they're easy to explain.
>
> If there's anyone out there who has run a larger PostgreSQL database and not at some point been extremely frustrated with how the current VACUUM settings are controlled, please speak up and say I'm wrong about this. I thought it was well understood the UI was near unusably bad, it just wasn't obvious what to do about it.

We are frustrated, but mostly our frustration is not about the somewhat inscrutable knobs but about the inscrutable meters, or lack thereof. Postgres vacuuming and analyzing (auto or manual, for that matter) is essentially a performance tuning problem without a good way to measure the current performance; the fact that the knobs to turn are confusing as well is secondary.

What I think is missing is a clear way to know whether you are vacuuming (and analyzing) enough, and how much you are paying for that. At the moment we are basically changing the knobs blindly, based on some back-of-the-envelope calculations and hearsay. Then sometimes, months later, we find out that oops, we haven't been analyzing enough, and that's why the planner is now picking a bad plan on that particular table.

What I want is for the page http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html to start with "Here is how you know if you are vacuuming enough...". In an ideal world there would be some meter in a statistics table or similar that returns a percentage: 100% means just enough, 50% means you have to double, 150% means 50% too much (i.e. wasted effort)... But I could do with a boolean as well. A complicated extension, or a recommendation to install 3 different extensions, would be better than what is there right now, but only very barely.

Of course a meter wouldn't tell you whether you would still keep up if traffic doubled; for that you need a complicated calculation (or you just keep looking at the meter and adjusting). But at the moment there is no such meter (at least I don't know of one), and that is the actual problem.

My 2 cents,
Bene

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
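[The "meter" Bene asks for does not exist, but the statistics collector already tracks the raw inputs. A rough sketch of such a check, using only columns that exist in pg_stat_user_tables; the threshold formula below is an illustrative assumption mirroring the default autovacuum_vacuum_threshold (50) and autovacuum_vacuum_scale_factor (0.2):]

```sql
-- Rough "are we vacuuming enough?" check: compare dead tuples against the
-- same trigger point autovacuum uses (50 + 0.2 * live tuples by default).
-- A "pressure" value well above 1.0 suggests vacuum is falling behind.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze,
       round(n_dead_tup::numeric
             / nullif(50 + 0.2 * n_live_tup, 0), 2) AS pressure
FROM pg_stat_user_tables
ORDER BY pressure DESC NULLS LAST;
```

This is at best a point-in-time snapshot, not the percentage-style meter described above, and it says nothing about analyze adequacy; it only makes the "changing the knobs blindly" problem slightly less blind.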
Re: [HACKERS] Inline Extension
On Sun, Jan 22, 2012 at 8:42 PM, Robert Haas robertmh...@gmail.com wrote:
> On Sun, Jan 22, 2012 at 3:20 PM, Daniel Farina dan...@heroku.com wrote:
>> A few anecdotes does not constitute evidence, but it does look like some people pay attention to any additional versioning foothold they can get.
> Sure, but just because some people do it doesn't make it a good idea.

True, I did not mean to suggest that this is clearly the best mechanism, only to express support for the general idea that people are appreciative of, and willing to experiment with, any way to make version management better, and people who use those features are *probably* going to try to use them correctly. It's up to us to make the right thing easy, too; otherwise I fear we will see too many lousy version numbers creeping about in the wild.

> Dimitri's proposal was to neuter the pg_dump support that is the raison d'être of the extension mechanism. That's clearly necessary if you don't want to end up with an unreloadable database, but it begs the question (which no one's really answered AFAICT) of what good the extension mechanism is without that feature.

Oh, no, non-reloadability is a really bad thing, I'd say a pretty bad deal-breaker, but as Tom wrote, it does seem like it should somehow be a tractable problem. Is it such a bad idea to store the literal text of the extension's pieces (control file and corresponding SQL program) in catalogs? I'm not sure I understand why everyone is so interested in a special interaction with the file system in some way. By the same token, extensions can be dumped in the literal syntax, even the ones that were installed from a file.

> There are certainly easier ways to remember a version number than building support for it into core. If people create their own versioning mechanisms, they can create something which is tailor-made for their particular requirements, rather than relying on decisions which we made in core that may or may not be right for them (e.g. the lack of version ordering, or even that we have versions rather than some more general type of control table).

I understand the desire to avoid investing in something that is not what people want. However, in the interest of scoping the discussion to the inline extension support, I can't seem to understand the objection to supporting what is basically a different transport for precisely the same semantic operation as having to ssh into a machine and untar some files, except available without the bizarre side channel of ssh and file system mangling when one is loading trustable operators, itself a raft of usability issues if one wishes to enable more software reuse.

--
fdr
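[Daniel's "dumped in the literal syntax" idea was never committed, so the following is purely a hypothetical sketch of what a pg_dump of a catalog-stored extension could emit; the WITH ... AS $$ ... $$ form and the extension name are invented for illustration:]

```sql
-- Hypothetical inline-extension dump syntax (illustration only, not an
-- actual PostgreSQL command): control properties plus the script, literally.
CREATE EXTENSION pair
  WITH VERSION '1.0' RELOCATABLE
  AS $pair$
    CREATE TYPE pair AS (k text, v text);
    CREATE FUNCTION pair(text, text) RETURNS pair
      LANGUAGE sql AS 'SELECT ROW($1, $2)::pair';
  $pair$;
```

The point of the sketch is that such a statement is self-contained: restoring it needs no file-system access on the server, which is exactly the "different transport" Daniel describes.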
Re: [HACKERS] New replication mode: write
On Mon, Jan 23, 2012 at 4:58 PM, Simon Riggs si...@2ndquadrant.com wrote:
> On Mon, Jan 16, 2012 at 12:45 PM, Fujii Masao masao.fu...@gmail.com wrote:
>>> Please add the Apply mode.
>> OK, will do.
>> Done. Attached is the updated version of the patch.
> I notice that the Apply mode isn't fully implemented. I had in mind that you would add the latch required to respond more quickly when only the Apply pointer has changed. Is there a reason not to use WaitLatchOrSocket() in WALReceiver? Or was there another reason for not implementing that?

I agree that the feature you pointed out is useful for the Apply mode. But I'm afraid that implementing that feature is not easy and would make the patch big and complicated, so I didn't implement it in the Apply mode first.

To make the walreceiver call WaitLatchOrSocket(), we would need to merge it and libpq_select() into one function. But the former is a backend function and the latter is a frontend one. So far I have no good idea for merging them cleanly.

If we send back a reply as soon as the Apply pointer changes, I'm afraid quite a lot of reply messages would be sent, which might cause a performance problem. This is also one of the reasons why I didn't implement the quick-response feature. To address this problem, we might need to change the master so that it sends the Wait pointer to the standby, and change the standby so that it replies whenever the Apply pointer catches up with the Wait one. This would reduce the number of useless replies from the standby about the Apply pointer.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
* Ants Aasma:
> I had a run-in with this. JDBC driver versions 9.0 with the default configuration resulted in silent data corruption. The fix was easy, but not having a useful error was what really bothered me.

Same for the DBD::Pg driver. In this particular case, I knew that the change was coming and could push updated Java and Perl client libraries well before the server-side change hit our internal repository, but I really don't want to have to pay attention to such details.

--
Florian Weimer fwei...@bfk.de
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100   tel: +49-721-96201-1
D-76133 Karlsruhe  fax: +49-721-96201-99
Re: [HACKERS] New replication mode: write
On Mon, Jan 23, 2012 at 9:02 AM, Fujii Masao masao.fu...@gmail.com wrote:
> On Mon, Jan 23, 2012 at 4:58 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>>>> Please add the Apply mode.
>>>> OK, will do.
>>>> Done. Attached is the updated version of the patch.
>>> I notice that the Apply mode isn't fully implemented. I had in mind that you would add the latch required to respond more quickly when only the Apply pointer has changed. Is there a reason not to use WaitLatchOrSocket() in WALReceiver? Or was there another reason for not implementing that?
> I agree that the feature you pointed out is useful for the Apply mode. But I'm afraid that implementing that feature is not easy and would make the patch big and complicated, so I didn't implement it in the Apply mode first. To make the walreceiver call WaitLatchOrSocket(), we would need to merge it and libpq_select() into one function. But the former is a backend function and the latter is a frontend one. So far I have no good idea for merging them cleanly.

We can wait on the socket wherever it comes from. poll/select doesn't care how we got the socket. So we just need a common handler that calls either the walreceiver or libpqwalreceiver function as required to handle the wakeup.

> If we send back a reply as soon as the Apply pointer changes, I'm afraid quite a lot of reply messages would be sent, which might cause a performance problem. This is also one of the reasons why I didn't implement the quick-response feature. To address this problem, we might need to change the master so that it sends the Wait pointer to the standby, and change the standby so that it replies whenever the Apply pointer catches up with the Wait one. This would reduce the number of useless replies from the standby about the Apply pointer.

We send back one reply per incoming message. The incoming messages don't know the request state, and checking that has a cost which I don't think is an appropriate payment, since we only need this info when the link goes quiet. When the link goes quiet we still need to send replies if we are in apply mode, but we only need to send apply messages if the LSN has changed because of a commit. That will considerably reduce the messages sent, so I don't see a problem.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Finer Extension dependencies
On Sat, Jan 21, 2012 at 9:20 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Hi,
> Thank you for reviewing this patch!
>
> Hitoshi Harada umi.tan...@gmail.com writes:
>> The patch applies with one reject, which I could fix easily. The make check passed.
> Bitrot happens fast in this season… will produce another version of the patch.
>
>>       Table "pg_catalog.pg_extension_feature"
>>    Column   | Type | Modifiers
>>  -----------+------+-----------
>>   extoid     | oid  | not null
>>   extfeature | name | not null
>>  Indexes:
>>      "pg_extension_feature_index" UNIQUE, btree (extoid, extfeature)
>>      "pg_extension_feature_oid_index" UNIQUE, btree (oid)
>>
>> * I'm not quite sure why pg_extension_feature_index needs the extoid column.
> That allows managing features per extension: you need to know which extension is providing which feature to be able to solve dependencies.

Do you mean you want a UNIQUE constraint by this index? I found the usage is to search a feature by (only) its name, so I wondered whether extoid is necessary.

>> * I have a big question about adding a two-column catalog. I don't mind the actual number of columns, but if the table has only two columns, it implies the design may be bad. The only other two-column catalog is pg_largeobject_metadata.
> We need each feature to be a full PostgreSQL object so that we can use the dependency tracking. That allows DROP EXTENSION foo to cascade to extensions that depend on feature(s) provided by foo.

I guess if we spend more time, we'll figure out what a feature actually is, and then we will see what kind of columns/attributes are needed to represent it. Although I agree we can add them later, again, this may imply the design is premature. (It's OK if I am the only person who thinks so.)

>> Next, some questions:
>> - Why is the finer dependency needed? Do you have a tangible example that struggles with the current dependency granularity? I feel fine about the existing dependency on extensions, as a developer of several of them.
> The problem is not yet very apparent only because extensions are very new. The main thing we address with this patch is depending on a feature that appeared while developing an extension, or that gets removed down the line. It allows depending on features, avoiding the need to compare version numbers and maintain a list of which version number provides which feature. This feature has been asked for by several extension users, beginning even before 9.1 was released.
>
>> - What happens on DROP EXTENSION ... CASCADE? Does it work?
> It should; what happens when you try? :)

I just tried DROP EXTENSION now, and found it broken :(

db1=# create extension kmeans;
CREATE EXTENSION
db1=# drop extension kmeans;
ERROR:  cannot drop extension kmeans because extension feature kmeans requires it
HINT:  You can drop extension feature kmeans instead.
db1=# drop extension kmeans cascade;
ERROR:  cannot drop extension kmeans because extension feature kmeans requires it
HINT:  You can drop extension feature kmeans instead.

Am I missing something? I'm confused why this happens.

Thanks,
--
Hitoshi Harada
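[For readers following the thread: the patch under review was never committed, so the control-file syntax below is a hypothetical sketch of the feature-based dependency being discussed; the `provides` keyword and extension names are assumptions, not actual PostgreSQL syntax:]

```ini
# kmeans.control -- sketch only; 'provides' is the patch's proposed notion
comment         = 'k-means clustering for PostgreSQL'
default_version = '1.1'
provides        = 'kmeans, kmeans_weighted'

# A dependent extension could then require a feature instead of parsing
# version numbers:
# other.control
#   requires = 'kmeans_weighted'
```

The pg_extension_feature catalog rows would back exactly these `provides` entries, which is why each feature needs to be a full object participating in dependency tracking.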
Re: [HACKERS] Finer Extension dependencies
Hitoshi Harada umi.tan...@gmail.com writes:
>>  "pg_extension_feature_index" UNIQUE, btree (extoid, extfeature)
> Do you mean you want a UNIQUE constraint by this index? I found the usage is to search a feature by (only) its name, so I wondered whether extoid is necessary.

I guess you're right, and that's something I've just left in when I should have cleaned it up. We need to find which extension is providing which feature, and we need feature names to be globally unique. I'll remove extoid from this index in the next revision of the patch. I'm not in a position to provide that next revision just now; it will happen before the end of the week, though.

> I guess if we spend more time, we'll figure out what a feature actually is, and then we will see what kind of columns/attributes are needed to represent it. Although I agree we can add them later, again, this may imply the design is premature. (It's OK if I am the only person who thinks so.)

You might be right that a feature is more than just a unique name, but as things are, that's their only useful property.

>>> - What happens on DROP EXTENSION ... CASCADE? Does it work?
>> It should; what happens when you try? :)
> I just tried DROP EXTENSION now, and found it broken :(
>
> db1=# create extension kmeans;
> CREATE EXTENSION
> db1=# drop extension kmeans;
> ERROR:  cannot drop extension kmeans because extension feature kmeans requires it
> HINT:  You can drop extension feature kmeans instead.

Can you provide me the test case you've been using? That looks like a bug I need to fix, indeed (unless the problem lies in the test case, which would mean I need to tighten things some more).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] New replication mode: write
On Mon, Jan 23, 2012 at 6:28 PM, Simon Riggs si...@2ndquadrant.com wrote:
> On Mon, Jan 23, 2012 at 9:02 AM, Fujii Masao masao.fu...@gmail.com wrote:
>> On Mon, Jan 23, 2012 at 4:58 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>>>>> Please add the Apply mode.
>>>>> OK, will do.
>>>>> Done. Attached is the updated version of the patch.
>>>> I notice that the Apply mode isn't fully implemented. I had in mind that you would add the latch required to respond more quickly when only the Apply pointer has changed. Is there a reason not to use WaitLatchOrSocket() in WALReceiver? Or was there another reason for not implementing that?
>> I agree that the feature you pointed out is useful for the Apply mode. But I'm afraid that implementing that feature is not easy and would make the patch big and complicated, so I didn't implement it in the Apply mode first. To make the walreceiver call WaitLatchOrSocket(), we would need to merge it and libpq_select() into one function. But the former is a backend function and the latter is a frontend one. So far I have no good idea for merging them cleanly.
> We can wait on the socket wherever it comes from. poll/select doesn't care how we got the socket. So we just need a common handler that calls either the walreceiver or libpqwalreceiver function as required to handle the wakeup.

I'm afraid I could not understand your idea. Could you explain it in more detail?

>> If we send back a reply as soon as the Apply pointer changes, I'm afraid quite a lot of reply messages would be sent, which might cause a performance problem. This is also one of the reasons why I didn't implement the quick-response feature. To address this problem, we might need to change the master so that it sends the Wait pointer to the standby, and change the standby so that it replies whenever the Apply pointer catches up with the Wait one. This would reduce the number of useless replies from the standby about the Apply pointer.
> We send back one reply per incoming message. The incoming messages don't know the request state, and checking that has a cost which I don't think is an appropriate payment, since we only need this info when the link goes quiet. When the link goes quiet we still need to send replies if we are in apply mode, but we only need to send apply messages if the LSN has changed because of a commit. That will considerably reduce the messages sent, so I don't see a problem.

You mean to change the meaning of apply_location? Currently it indicates the end + 1 of the last replayed WAL record, regardless of whether it's a commit record or not. So too many replies can be sent per incoming message, because one message might contain many WAL records. But you mean to update apply_location only when a commit record is replayed?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
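[The pointers being argued over here are the ones a standby reports back and that 9.1+ exposes on the master via pg_stat_replication; the proposed "write" mode would make synchronous commit wait only for write_location, while replay_location is the "Apply pointer" above. A quick way to watch how far apply lags behind write:]

```sql
-- Run on the master: one row per connected standby. The gap between
-- write_location and replay_location is what the Apply-mode latch and
-- reply-frequency discussion is about.
SELECT application_name,
       state,
       sent_location,
       write_location,
       flush_location,
       replay_location
FROM pg_stat_replication;
```

(Column names are as of 9.1/9.2; later releases renamed them to *_lsn.)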
Re: [HACKERS] pgstat wait timeout
Hi,

Any ideas on this?
Re: [HACKERS] Online base backup from the hot-standby
On Fri, Jan 20, 2012 at 11:34 PM, Simon Riggs si...@2ndquadrant.com wrote:
> On Fri, Jan 20, 2012 at 12:54 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> Thanks for the review!
>> On Fri, Jan 20, 2012 at 8:15 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>> I'm looking at this patch and wondering why we're doing so many press-ups to ensure the full_page_writes parameter is on. This will still fail if you use a utility that removes the full page writes, but fail silently. I think it would be beneficial to explicitly check that all WAL records have full page writes actually attached to them until we achieve consistency.
>> I agree that it's worth adding such a safeguard. That can be a self-contained feature, so I'll submit a separate patch for that, to keep each patch small.
> Maybe, but you mean do this now as well? Not sure I like silent errors.

If many people think the patch is not acceptable without such a safeguard, I will do that right now. Otherwise, I'd like to take more time over it, i.e., add it to the 9.2dev Open Items.

I've not come up with a good idea yet. An ugly one is to keep track of all replays of full page writes for every buffer page (i.e., prepare a table of one bit per buffer page and set the corresponding bit when a full page write is applied), and then check whether a full page write has already been applied when replaying a normal WAL record... Do you have any better idea?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Inline Extension
> Is it such a bad idea to store the literal text of the extension's pieces (control file and corresponding SQL program) in catalogs? I'm not sure if I understand why everyone is so interested in a special interaction with the file system in some way. By the same token, extensions can be dumped in the literal syntax, even the ones that were installed from a file.
>
>> There are certainly easier ways to remember a version number than building support for it into core. If people create their own versioning mechanisms, they can create something which is tailor-made for their particular requirements, rather than relying on decisions which we made in core that may or may not be right for them (e.g. the lack of version ordering, or even that we have versions rather than some more general type of control table).
>
> I understand the desire to avoid investing in something that is not what people want. However, in the interest of scoping the discussion to the inline extension support, I can't seem to understand the objection to supporting what is basically a different transport for precisely the same semantic operation as having to ssh into a machine and untar some files, except available without the bizarre side channel of ssh and file system mangling when one is loading trustable operators, itself a raft of usability issues if one wishes to enable more software reuse.

Or with adminpack, and without ssh, but still with interaction with the filesystem. Filesystem read/write access is a pain for the DBA. There are hacks possible to get rid of that (but not completely: mount -o ro partitions, for example...).

I am in favor of being able to create an extension directly in plain SQL, without file creation or access or system administrator privileges. Why wouldn't we want that?!

--
Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] Inline Extension
Tom Lane t...@sss.pgh.pa.us writes:
> On reflection it seems like this patch is simply offering the wrong solution for the problem. I agree that it could be useful to install extensions without having direct access to the server's filesystem, but it doesn't seem to follow that we must lobotomize existing extension features in order to have that. I pointed out earlier that you could get such functionality via contrib/adminpack, though people not unreasonably complained that that was pretty ugly and low-level. But couldn't we define some SQL-level operations to allow installing extension control and script files?

Yeah, that's what I was trying to do…

> Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say if you want this functionality you have to make share/extension/ writable.

So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too, and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine.

Back to using the file system on the server when handed the script content over the protocol: we could get there with a new GUC telling the server where to find and store “inline” extensions, right?

    extension_inline_directory = '/path/to/some/writable/place'

Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and .sql files in the extension_inline_directory, then running the current code. I can adapt the patch to this behavior this week.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
[HACKERS] database schema quality survey postgresql pitfalls
Dear pgdevs,

I've just completed the final version of a survey of database schema quality in open source software. The survey covers 512 projects which use MySQL and/or PostgreSQL for storing their data. Automatic analyses are performed by querying the information schema, and statistical validations are computed on the results. Here are some findings of specific interest to this list.

About MySQL vs PostgreSQL usage:

- MySQL is much more often used than PostgreSQL... not a surprise.
- Even for projects which seem to support PostgreSQL, this support is often an afterthought and not necessarily functional. For instance, you may find MySQL-specific syntax in the PostgreSQL-specific scripts, which were clearly never tested.
- Projects which use PostgreSQL statistically have better quality than projects which use MySQL. However, the difference is mostly due to issues with MySQL.
- Projects which use PostgreSQL are more often maintained than projects using MySQL.

Some features or default behaviors of PostgreSQL seem especially error-prone:

- SERIAL attributes seem to be considered automatically as a primary key, so the primary key declaration is often forgotten. This suggests that:
  * the documentation should insist on the potential issue.
  * a WARNING should be displayed when SERIAL is used without an associated PRIMARY KEY, or possibly UNIQUE. It should be very rare to want a SERIAL which is not a PK, so this is worth a warning.

- When loading a schema definition from an SQL script, the default behavior of psql is to ignore errors and go on. This leads to failing declarations being ignored, because the ERROR is lost in the flow of WARNING and NOTICE messages. If a table is missing, the error will be detected because the project is not functional, but if a constraint is missing, it will just be lost. This occurs in about 10% of pg projects! To avoid this behavior, one must put \set ON_ERROR_STOP 1 at the beginning of the script. However, this is never done. Moreover, there is no simple way to trigger the safer behavior from the command line, only the rather long -v ON_ERROR_STOP=1. It seems to me that:
  * the documentation should suggest an explicit stop-on-error setting in every script.
  * a psql -C foo.sql (check?) or equivalent short option would help.
  * the current client default verbosity is counterproductive for quality. It should be reduced to WARNING and above, and should *not* include NOTICE, which adds a lot of noise that the user ignores and which hides more important messages. For instance, a PK implies an INDEX, a SERIAL implies a SEQUENCE, fine, but what is the point of telling it over and over? So I suggest choosing a default of client_min_messages = warning.

Finally, some issues were found, and already reported some time ago, about PostgreSQL's implementation of the information schema. For instance, auto-generated constraint names are not unique as they should be, which makes having a standard information schema a little bit pointless, as querying it returns wrong results :-( The summary of the answer was "do not use the information schema" or "give unique names", which does not make much sense for me, since I am analysing existing projects and most constraint names are generated automatically by PostgreSQL.

For those interested in more details about the survey, a preprint of the paper is available here: http://www.cri.ensmp.fr/classement/doc/A-488.pdf
And the tool used for analysing the projects is available at: http://coelho.net/salix/

Comments are welcome.

--
Fabien.
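[Both psql-side mitigations Fabien describes can be applied today without any server change. A sketch of a defensive schema-loading script; the file and table names are illustrative:]

```sql
-- schema.sql: fail fast instead of ploughing past errors,
-- and silence the NOTICE chatter for this session.
\set ON_ERROR_STOP on
SET client_min_messages = warning;

CREATE TABLE account (
    id   serial PRIMARY KEY,   -- SERIAL alone does NOT create a key
    name text NOT NULL
);
```

From the command line, the same stop-on-error behavior is the long form the survey complains about: `psql -v ON_ERROR_STOP=1 -f schema.sql mydb`.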
Re: [HACKERS] Inline Extension
On 23 January 2012 at 11:53, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Tom Lane t...@sss.pgh.pa.us writes:
>> On reflection it seems like this patch is simply offering the wrong solution for the problem. I agree that it could be useful to install extensions without having direct access to the server's filesystem, but it doesn't seem to follow that we must lobotomize existing extension features in order to have that. I pointed out earlier that you could get such functionality via contrib/adminpack, though people not unreasonably complained that that was pretty ugly and low-level. But couldn't we define some SQL-level operations to allow installing extension control and script files?
> Yeah, that's what I was trying to do…
>> Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say if you want this functionality you have to make share/extension/ writable.
> So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too, and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine.
> Back to using the file system on the server when handed the script content over the protocol: we could get there with a new GUC telling the server where to find and store “inline” extensions, right?
>
>     extension_inline_directory = '/path/to/some/writable/place'
>
> Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and .sql files in the extension_inline_directory, then running the current code. I can adapt the patch to this behavior this week.

If we agree to have that per cluster, then it can be in $pgdata/pg_extension or something similar...

--
Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] Finer Extension dependencies
On Mon, Jan 23, 2012 at 2:00 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Hitoshi Harada umi.tan...@gmail.com writes:
>>>> - What happens on DROP EXTENSION ... CASCADE? Does it work?
>>> It should; what happens when you try? :)
>> I just tried DROP EXTENSION now, and found it broken :(
>>
>> db1=# create extension kmeans;
>> CREATE EXTENSION
>> db1=# drop extension kmeans;
>> ERROR:  cannot drop extension kmeans because extension feature kmeans requires it
>> HINT:  You can drop extension feature kmeans instead.
> Can you provide me the test case you've been using? That looks like a bug I need to fix, indeed (unless the problem lies in the test case, which would mean I need to tighten things some more).

The test case is just above: createdb db1, then create and drop an extension. The kmeans extension is on PGXN. I also tried my small test extension named ext1, which contains only one plpgsql function; I created it, then dropped it, and reproduced the error.

Thanks,
--
Hitoshi Harada
Re: [HACKERS] Speed dblink using alternate libpq tuple storage
On Sat, Jan 21, 2012 at 1:52 PM, Marc Mamin m.ma...@intershop.de wrote: c. Refine error handling of dblink.c. I think it preserves the previous behavior for column number mismatch and type conversion exception. Hello, I don't know if this covers the following issue. I just mention it in case you didn't notice it and would like to handle this rather cosmetic issue as well. http://archives.postgresql.org/pgsql-bugs/2011-08/msg00113.php It is not relevant to this thread, but it does seem like a good idea to implement. It should be a simple matter of creating a handler that uses dblink_res_error() to report the notice. Perhaps you could create and submit the patch yourself? For reference, here is the full flow in PL/Proxy: 1) PQsetNoticeReceiver: https://github.com/markokr/plproxy-dev/blob/master/src/execute.c#L422 2) handle_notice: https://github.com/markokr/plproxy-dev/blob/master/src/execute.c#L370 3) plproxy_remote_error: https://github.com/markokr/plproxy-dev/blob/master/src/main.c#L82 -- marko
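The three-step flow Marko outlines (install a notice receiver, handle each notice, re-report it through the local error machinery) is easiest to see as a callback registration. Here is a toy Python model of that hook; the class and method names are hypothetical stand-ins for libpq's C API (PQsetNoticeReceiver), not dblink's or PL/Proxy's actual code:

```python
# Toy model of libpq's notice-receiver hook: the connection stores a
# callback, and every incoming NOTICE is handed to it instead of being
# printed. Names are hypothetical; the real API is C (libpq).

class ToyConnection:
    def __init__(self):
        self.notices = []                     # collected by the default receiver
        self._receiver = self._default_receiver

    def _default_receiver(self, notice):
        self.notices.append(notice)

    def set_notice_receiver(self, func):
        """Mirror of PQsetNoticeReceiver: swap in a custom handler."""
        self._receiver = func

    def _incoming_notice(self, notice):
        # Called by the "protocol layer" whenever a NOTICE arrives.
        self._receiver(notice)

forwarded = []
conn = ToyConnection()
conn.set_notice_receiver(forwarded.append)    # like re-reporting via dblink_res_error()
conn._incoming_notice("NOTICE: relation already exists")
print(forwarded)   # the custom handler saw the notice, the default one did not
```

The point of the design is that the registered handler intercepts notices before the library's default behavior runs, which is exactly what lets dblink re-emit them as local notices.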
Re: [HACKERS] Patch: Allow SQL-language functions to reference parameters by parameter name
On Thu, Jan 19, 2012 at 1:58 AM, Hitoshi Harada umi.tan...@gmail.com wrote: On Wed, Jan 18, 2012 at 1:11 AM, Hitoshi Harada umi.tan...@gmail.com wrote: On Sat, Jan 14, 2012 at 8:10 AM, Matthew Draper matt...@trebex.net wrote: I just remembered to make time to advance this from WIP to proposed patch this week... and then worked out I'm rudely dropping it into the last commitfest at the last minute. :/ The patch applies clean against master but compiles with warnings. functions.c: In function ‘prepare_sql_fn_parse_info’: functions.c:212: warning: unused variable ‘argnum’ functions.c: In function ‘sql_fn_post_column_ref’: functions.c:341: warning: implicit declaration of function ‘ParseFuncOrColumn’ functions.c:345: warning: assignment makes pointer from integer without a cast (Now it occurred to me that forgetting the #include parse_func.h might hit this breakage..., so I'll fix it here and continue to test, but if you'll fix it yourself, let me know) I fixed it here and it now works with my environment. The regression tests pass, the feature seems working as aimed, but it seems to me that it needs more test cases and documentation. For the tests, I believe at least we need ambiguous case given upthread, so that we can ensure to keep compatibility. For the document, it should describe the name resolution rule, as stated in the patch comment. Aside from them, I wondered at first what if the function is schema-qualified. Say, CREATE FUNCTION s.f(a int) RETURNS int AS $$ SELECT b FROM t WHERE a = s.f.a $$ LANGUAGE sql; It actually errors out, since function-name-qualified parameter only accepts function name without schema name, but it looked weird to me at first. No better idea from me at the moment, though. I mark this Waiting on Author for now. It's been a few days since my last comment, but are you sending a new patch? If there's no reply, I'll make it Returned with Feedback. 
Thanks, -- Hitoshi Harada
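The resolution rule under discussion (a parameter may be referenced bare or qualified by the unqualified function name, but not by schema.function.name) can be sketched as a toy resolver. This is a hypothetical illustration of the behavior Hitoshi observed, not the patch's C implementation:

```python
def resolve_parameter(reference, function_name, parameters):
    """Toy resolution rule: accept 'a' or 'f.a', reject 's.f.a'.

    reference: dotted identifier used in the SQL function body.
    function_name: unqualified name of the function being parsed.
    parameters: declared parameter names.
    """
    parts = reference.split(".")
    if len(parts) == 1 and parts[0] in parameters:
        return parts[0]
    if len(parts) == 2 and parts[0] == function_name and parts[1] in parameters:
        return parts[1]
    # 's.f.a' falls through: the qualifier must be the bare function name.
    return None

print(resolve_parameter("f.a", "f", ["a"]))    # resolves to the parameter
print(resolve_parameter("s.f.a", "f", ["a"]))  # None, i.e. errors out as observed
```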
Re: [HACKERS] Multithread Query Planner
On 2012-01-13 21:14, Frederico wrote: Hi folks. Is there any restriction on creating and starting threads inside Postgres? I'm trying to develop a multithreaded planner, and sometimes a memory-access exception is raised. I'm debugging the code to see if it is a bug in the planner, but so far I haven't found one. I tried using the same memory context as the root process, and also creating a new context for each new thread, but neither worked. Any tips? Not sure if it is of any use to you, but the VLDB paper 'Parallelizing Query Optimization' http://www.vldb.org/pvldb describes an experimental implementation in PostgreSQL. regards, Yeb
Re: [HACKERS] WAL Restore process during recovery
On Fri, Jan 20, 2012 at 7:50 PM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Jan 20, 2012 at 7:38 PM, Simon Riggs si...@2ndquadrant.com wrote: On Fri, Jan 20, 2012 at 3:43 AM, Fujii Masao masao.fu...@gmail.com wrote: Requested update Thanks! Will review. In StartChildProcess(), the code which emits an error when fork of walrestore fails is required. In reaper(), the following comment needs to be updated because an unexpected exit (including FATAL) is treated as a crash in the patch. /* * Was it the wal restore? If exit status is zero (normal) or one * (FATAL exit), we assume everything is all right just like normal * backends. */ if (pid == WalRestorePID) Why does walrestore need to be invoked even when restore_command is not specified? It seems to be useless. We invoke walreceiver only when primary_conninfo is specified now. Similarly, shouldn't we invoke walrestore only when restore_command is specified? When I set up the file-based log-shipping environment using pg_standby, ran pgbench -i -s2, waited for walrestore to restore at least one WAL file, and created the trigger file, I encountered the following error in the standby. sby LOG: startup process requests 00010003 from archive trigger file found: smart failover sby LOG: startup process sees last file was 00010003 sby FATAL: could not rename file pg_xlog/RECOVERYXLOG to pg_xlog/00010003: No such file or directory sby LOG: startup process (PID 11079) exited with exit code 1 sby LOG: terminating any other active server processes When I set up streaming replication with restore_command set, I got the following messages repeatedly. The WAL file name was always . sby1 LOG: walrestore checking for next file to restore sby1 LOG: restore of is already complete, so sleep In PostmasterStateMachine(), the following comment needs to mention WALRestore. * PM_WAIT_READONLY state ends when we have no regular backends that * have been started during recovery.
We kill the startup and * walreceiver processes and transition to PM_WAIT_BACKENDS. Ideally, In walrestore.c, the following comments seem to be incorrect. At least an unexpected exit of WALRestore doesn't start a recovery cycle in the patch. * If the WAL restore exits unexpectedly, the postmaster treats that the same * as a backend crash: shared memory may be corrupted, so remaining backends * should be killed by SIGQUIT and then a recovery cycle started. In walrestore.c + * Main entry point for walrestore process + * + * This is invoked from BootstrapMain, which has already created the basic + * execution environment, but not enabled signals yet. BootstrapMain() doesn't exist, and it should be changed to AuxiliaryProcessMain(). This is not a fault of the patch. There are the same typos in bgwriter.c, walwriter.c and checkpointer.c In walrestore.c +* SIGUSR1 is presently unused; keep it spare in case someday we want this +* process to participate in ProcSignal signalling. The above comment is incorrect because SIGUSR1 is presently used. + /* +* From here on, elog(ERROR) should end with exit(1), not send +* control back to the sigsetjmp block above +*/ + ExitOnAnyError = true; The above is not required because sigsetjmp is not used in walrestore.c + /* Normal exit from the walwriter is here */ + proc_exit(0); /* done */ Typo: s/walwriter/walrestore I've not reviewed the patch enough yet. Will review the patch tomorrow again. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] New replication mode: write
On Mon, Jan 23, 2012 at 10:03 AM, Fujii Masao masao.fu...@gmail.com wrote: To make the walreceiver call WaitLatchOrSocket(), we would need to merge it and libpq_select() into one function. But the former is a backend function and the latter is a frontend one. At the moment I have no good idea for merging them cleanly. We can wait on the socket wherever it comes from. poll/select doesn't care how we got the socket. So we just need a common handler that calls either the walreceiver or libpqwalreceiver function as required to handle the wakeup. I'm afraid I could not understand your idea. Could you explain it in more detail? We either tell libpqwalreceiver about the latch, or we tell walreceiver about the socket used by libpqwalreceiver. In either case we share a pointer from one module to another. If we send back the reply as soon as the Apply pointer is changed, I'm afraid quite a lot of reply messages would be sent, which might cause a performance problem. This is also one of the reasons why I didn't implement the quick-response feature. To address this problem, we might need to change the master so that it sends the Wait pointer to the standby, and change the standby so that it replies whenever the Apply pointer catches up with the Wait one. This can reduce the number of useless replies from the standby about the Apply pointer. We send back one reply per incoming message. The incoming messages don't know request state, and checking that has a cost which I don't think is an appropriate payment, since we only need this info when the link goes quiet. When the link goes quiet we still need to send replies if we have apply mode, but we only need to send apply messages if the lsn has changed because of a commit. That will considerably reduce the messages sent, so I don't see a problem. You mean to change the meaning of apply_location? Currently it indicates the end + 1 of the last replayed WAL record, regardless of whether it's a commit record or not.
So too many replies can be sent per incoming message because it might contain many WAL records. But you mean to change apply_location only when a commit record is replayed? There is no change to the meaning of apply_location. The only change is that we send that message only when it has an updated value of committed lsn. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
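Simon's proposal amounts to a filter on the standby's reply stream: emit an apply reply only when a replayed commit record advances the committed LSN. A toy Python sketch of that idea (the real logic would live in walreceiver's C code; these names are made up):

```python
def filter_apply_replies(records):
    """Yield one reply per replayed commit that advances the committed LSN.

    records: iterable of (lsn, is_commit) pairs in replay order. This is a
    toy model of the proposal, not walreceiver's actual C implementation.
    """
    last_reported = None
    for lsn, is_commit in records:
        if is_commit and lsn != last_reported:
            last_reported = lsn
            yield lsn   # send an apply_location reply for this commit only

replayed = [(10, False), (20, False), (30, True), (40, False), (50, True)]
print(list(filter_apply_replies(replayed)))   # [30, 50]: two replies, not five
```

With this filtering, a message containing many WAL records produces at most one reply per distinct committed LSN, which addresses Fujii's concern about reply volume.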
Re: [HACKERS] Inline Extension
On Mon, Jan 23, 2012 at 5:53 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say if you want this functionality you have to make share/extension/ writable. So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine. I'm not convinced that's useless. It would meet Dan's requirement to be able to manage the whole thing via the FE-BE protocol, and we could make the CREATE EXTENSION mechanism transparently search both the catalog and the filesystem when an extension is installed. I'm imagining that we'd create a catalog that would act as a sort of virtual directory - e.g. CREATE TABLE pg_extension_virtualdir (filename text, content text) which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as now, it would be the DBA's responsibility to make sure that the extensions needed to reload a given dump file are present on the new system - except now they'd have two choices where to put the related files: on the file system, or in the per-database virtual directory. Back to using the file system on the server when handed the script content over the protocol, we could get there with a new GUC telling the server where to find and store “inline” extensions, right?
extension_inline_directory = '/path/to/some/writable/place' Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and the .sql files in the extension_inline_directory, then running the current code. This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
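Both variants discussed above boil down to CREATE EXTENSION searching more than one location for an extension's control file. A minimal Python sketch of that lookup order, assuming a hypothetical writable extension_inline_directory alongside the read-only system directory:

```python
import os
import tempfile

def find_extension_control(name, system_dir, inline_dir):
    """Toy lookup: check the system share/extension directory first,
    then fall back to the writable 'inline' directory. Purely a sketch
    of the proposed search order, not Postgres's actual C code.
    """
    for directory in (system_dir, inline_dir):
        path = os.path.join(directory, name + ".control")
        if os.path.exists(path):
            return path
    return None

system_dir = tempfile.mkdtemp()   # stands in for read-only share/extension/
inline_dir = tempfile.mkdtemp()   # stands in for the writable inline directory
# An extension created over the protocol lands in the inline directory:
open(os.path.join(inline_dir, "myext.control"), "w").close()
print(find_extension_control("myext", system_dir, inline_dir))
```

Because the search covers both places, dump and restore work the same way regardless of where the extension's files physically live, which is the property both proposals are after.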
Re: [HACKERS] WAL Restore process during recovery
On Mon, Jan 23, 2012 at 12:23 PM, Fujii Masao masao.fu...@gmail.com wrote: I've not reviewed the patch enough yet. Will review the patch tomorrow again. Thanks very much. I'm sure that's enough to keep me busy a few days. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: Removing freelist (was Re: [HACKERS] Should I implement DROP INDEX CONCURRENTLY?)
On Mon, Jan 23, 2012 at 12:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Jan 21, 2012 at 5:29 PM, Jim Nasby j...@nasby.net wrote: We should also look at having the freelist do something useful, instead of just dropping it completely. Unfortunately that's probably more work... That's kinda my feeling as well. The free list in its current form is pretty much useless, but I don't think we'll save much by getting rid of it, because that's just a single test. The expensive part of what we do while holding BufFreelistLock is, I think, iterating through buffers taking and releasing a spinlock on each one (!). Yeah ... spinlocks that, by definition, will be uncontested. What makes you think that they are uncontested? Or for that matter, that even an uncontested spinlock operation is cheap enough to do while holding a badly contended LWLock? So I think it would be advisable to prove rather than just assume that that's a problem. It's pretty trivial to prove that there is a very serious problem with BufFreelistLock. I'll admit I can't prove what the right fix is just yet, and certainly measurement is warranted. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
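For context, the buffer allocation being discussed is a clock sweep: while BufFreelistLock is held, each inspected buffer header is locked, its usage count decremented, and the hand advances until a zero-count victim is found. Here is a toy model (not the real StrategyGetBuffer(), which is C and takes a per-buffer spinlock at each step) that just counts how many buffers a single allocation touches:

```python
def clock_sweep(usage_counts, start):
    """Toy clock sweep: decrement usage counts until one hits zero.

    Returns (victim_index, buffers_touched). In the real code each
    inspection takes and releases a per-buffer spinlock while the
    shared BufFreelistLock is held, so buffers_touched approximates
    the number of lock/unlock cycles one allocation pays for.
    """
    n = len(usage_counts)
    hand = start
    touched = 0
    while True:
        touched += 1
        if usage_counts[hand] == 0:
            return hand, touched          # victim buffer found
        usage_counts[hand] -= 1           # would be done under a spinlock
        hand = (hand + 1) % n

counts = [2, 1, 3, 0]
victim, touched = clock_sweep(counts, 0)
print(victim, touched)   # 3 4: four buffer inspections for one allocation
```

This is why the per-buffer work, rather than the single freelist test, dominates the cost under BufFreelistLock: hot workloads can force the hand to sweep many buffers per allocation.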
Re: [HACKERS] PG-Strom - A GPU optimized asynchronous executor module
On Mon, Jan 23, 2012 at 1:38 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: What options are available to see rate of workloads of components within a particular query? I usually use oprofile, though I'm given to understand it's been superseded by a new tool called perf. I haven't had a chance to experiment with perf yet, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Online base backup from the hot-standby
On Mon, Jan 23, 2012 at 10:29 AM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Jan 20, 2012 at 11:34 PM, Simon Riggs si...@2ndquadrant.com wrote: On Fri, Jan 20, 2012 at 12:54 PM, Fujii Masao masao.fu...@gmail.com wrote: Thanks for the review! On Fri, Jan 20, 2012 at 8:15 PM, Simon Riggs si...@2ndquadrant.com wrote: I'm looking at this patch and wondering why we're doing so many press-ups to ensure the full_page_writes parameter is on. This will still fail if you use a utility that removes the full page writes, but fail silently. I think it would be beneficial to explicitly check that all WAL records have full page writes actually attached to them until we achieve consistency. I agree that it's worth adding such a safeguard. That can be a self-contained feature, so I'll submit a separate patch for that, to keep each patch small. Maybe, but you mean do this now as well? Not sure I like silent errors. If many people think the patch is not acceptable without such a safeguard, I will do that right now. Otherwise, I'd like to take more time to do that, i.e., add it to the 9.2dev Open Items. I've not come up with a good idea yet. One ugly idea is to keep track of all replays of full_page_writes for every buffer page (i.e., prepare a 1-bit-per-buffer-page table and set the specified bit to 1 when a full-page write is applied), and then check whether the full-page write has already been applied when replaying a normal WAL record... Do you have any better idea? Not sure. I think the only possible bug here is one introduced by an outside utility. In that case, I don't think it should be the job of the backend to go too far to protect against such an atypical error. So if we can't solve it fairly easily and with no overhead then I'd say let's skip it. We could easily introduce a bug here just by having faulty checking code. So let's add it to 9.2 open items as a non-priority item. I'll proceed to commit for this now.
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
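Fujii's "ugly idea" above is essentially a 1-bit-per-page bitmap maintained during recovery. A toy Python sketch of such a table, purely illustrative and with hypothetical names, not proposed recovery code:

```python
class FullPageBitmap:
    """Toy 1-bit-per-buffer-page table: mark a page once a full-page
    image has been replayed for it, so a later normal WAL record for
    that page can be checked against it.
    """
    def __init__(self, n_pages):
        self.bits = bytearray((n_pages + 7) // 8)

    def mark_fpw_applied(self, page):
        self.bits[page // 8] |= 1 << (page % 8)

    def fpw_applied(self, page):
        return bool(self.bits[page // 8] & (1 << (page % 8)))

bitmap = FullPageBitmap(1024)
bitmap.mark_fpw_applied(42)
# Replaying a normal record for page 7 before any full-page image for it
# would indicate that full_page_writes protection was stripped:
print(bitmap.fpw_applied(42), bitmap.fpw_applied(7))   # True False
```

The appeal is the tiny footprint (one bit per page); the drawback, as the thread notes, is the extra bookkeeping on every replayed record to guard against an atypical failure mode.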
Re: [HACKERS] Online base backup from the hot-standby
On Mon, Jan 23, 2012 at 5:29 AM, Fujii Masao masao.fu...@gmail.com wrote: If many people think the patch is not acceptable without such a safeguard, I will do that right now. That's my view. I think we ought to resolve this issue before commit, especially since it seems unclear that we know how to fix it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Online base backup from the hot-standby
On Mon, Jan 23, 2012 at 8:11 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Jan 23, 2012 at 5:29 AM, Fujii Masao masao.fu...@gmail.com wrote: If many people think the patch is not acceptable without such a safeguard, I will do that right now. That's my view. I think we ought to resolve this issue before commit, especially since it seems unclear that we know how to fix it. Actually, never mind. On reading this more carefully, I'm not too concerned about the possibility of people breaking it with pg_lesslog or similar. But it should be solid if you use only the functionality built into core. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Removing freelist (was Re: [HACKERS] Should I implement DROP INDEX CONCURRENTLY?)
On Mon, Jan 23, 2012 at 1:06 PM, Robert Haas robertmh...@gmail.com wrote: It's pretty trivial to prove that there is a very serious problem with BufFreelistLock. I'll admit I can't prove what the right fix is just yet, and certainly measurement is warranted. I agree there is a problem with BufFreelistLock (so please share your results with Heikki, who seems not to). As a result, I've published patches which reduce contention on that lock in various ways, all of which seem valid to me. There are lots of things we could have done for 9.2 but didn't, yet we have some things that did get done on the table right now, so it would be useful to push those through immediately, or at least defer discussion on other things until we get back around to this. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Inline Extension
Robert Haas robertmh...@gmail.com writes: virtual directory - e.g. CREATE TABLE pg_extension_virtualdir (filename text, content text) which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as That's the thing I don't like in this approach. Maybe it's just me, but the primary intention in working on extensions was to make dump and restore do the right thing all by themselves. Now if we have “inline” (SQL-only) extensions, the right thing happens to be very different from when you're dealing with contrib-like ones: namely, I would want the script to be dumped. extension_inline_directory = '/path/to/some/writable/place' This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC. I think Cédric nailed it down upthread, proposing that we just use a PGDATA subdirectory called 'pg_extension'. In fact, that would need to be a per-database subdirectory. Then there's nothing to set up, nothing to abuse. Also remember that we're limiting this feature to SQL-only extensions (because we don't want to be loading our .so from anywhere in the system, and forcing them into a place owned by root gives some confidence, IIUC). With SQL-only extensions, it's all non-superuser land anyway. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] PG-Strom - A GPU optimized asynchronous executor module
On Sun, Jan 22, 2012 at 3:48 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I tried to implement an FDW module that is designed to utilize GPU devices to execute qualifiers of sequential scans on foreign tables managed by this module. It was named PG-Strom, and the following wiki page gives a brief overview of this module. http://wiki.postgresql.org/wiki/PGStrom In our measurements, it achieves about a 10x speedup on sequential scans with complex qualifiers; of course, it quite depends on the type of workload. Very cool. Someone's been busy. I see you've introduced 3 new features here at the same time: * GPU access * column store * compiled WHERE clauses It would be useful to see if we can determine which of those gives the most benefit and whether other directions emerge. Also, the query you mention is probably the best-performing query you can come up with. It looks like a GIS query, yet isn't. Would it be possible to run tests on the TPC-H suite and do a full comparison of strengths/weaknesses so we can understand the breadth of applicability of the techniques? This is a very interesting line of discussion, but please can we hold off further posts about it until after the CF is over? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] basic pgbench runs with various performance-related patches
There was finally some time available on Nate Boley's server, which he has been kind enough to make highly available for performance testing throughout this cycle, and I got a chance to run some benchmarks against a bunch of the performance-related patches in the current CommitFest. Specifically, I did my usual pgbench tests: 3 runs at scale factor 100, with various client counts. I realize that this is not the only or even most interesting thing to test, but I felt it would be useful to have this information as a baseline before proceeding to more complicated testing. I have another set of tests running now with a significantly different configuration that will hopefully provide some useful feedback on some of the things this test fails to capture, and will post the results of the tests (and the details of the test configuration) as soon as those results are in. For the most part, I only tested each patch individually, but in one case I also tested two patches together (buffreelistlock-reduction-v1 with freelist-ok-v2).
Results are the median of three five-minute test runs, with one exception: buffreelistlock-reduction-v1 crapped out during one of the test runs with the following errors, so I've shown the results for both of the successful runs (though I'm not sure how relevant the numbers are given the errors, as I expect there is a bug here somewhere): log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read block 0 in file base/20024/11780: read only 0 of 8192 bytes log.ws.buffreelistlock-reduction-v1.1.100.300:CONTEXT: automatic analyze of table rhaas.public.pgbench_branches log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read block 0 in file base/20024/11780: read only 0 of 8192 bytes log.ws.buffreelistlock-reduction-v1.1.100.300:CONTEXT: automatic analyze of table rhaas.public.pgbench_tellers log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read block 0 in file base/20024/11780: read only 0 of 8192 bytes log.ws.buffreelistlock-reduction-v1.1.100.300:CONTEXT: automatic analyze of table rhaas.pg_catalog.pg_database log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read block 0 in file base/20024/11780: read only 0 of 8192 bytes log.ws.buffreelistlock-reduction-v1.1.100.300:STATEMENT: vacuum analyze pgbench_branches log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read block 0 in file base/20024/11780: read only 0 of 8192 bytes log.ws.buffreelistlock-reduction-v1.1.100.300:STATEMENT: select count(*) from pgbench_branches Just for grins, I ran the same set of tests against REL9_1_STABLE, and the results of those tests are also included below. It's worth grinning about: on this test, at 32 clients, 9.2devel (as of commit 4f42b546fd87a80be30c53a0f2c897acb826ad52, on which all of these tests are based) is 25% faster on permanent tables, 109% faster on unlogged tables, and 474% faster on a SELECT-only test. 
Here's the test configuration:

shared_buffers = 8GB
maintenance_work_mem = 1GB
synchronous_commit = off
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms

And here are the results. For everything against master, I've also included the percentage speedup or slowdown vs. the same test run against master. Many of these numbers are likely not statistically significant, though some clearly are.

** pgbench, permanent tables, scale factor 100, 300 s **

1 master 686.038059
8 master 4425.79
16 master 7808.389490
24 master 13276.472813
32 master 11920.691220
80 master 12560.803169
1 REL9_1_STABLE 627.879523 -8.5%
8 REL9_1_STABLE 4188.731855 -5.4%
16 REL9_1_STABLE 7433.309556 -4.8%
24 REL9_1_STABLE 10496.411773 -20.9%
32 REL9_1_STABLE 9547.804833 -19.9%
80 REL9_1_STABLE 7197.655050 -42.7%
1 background-clean-slru-v2 629.518668 -8.2%
8 background-clean-slru-v2 4794.662182 +8.3%
16 background-clean-slru-v2 8062.151120 +3.2%
24 background-clean-slru-v2 13275.834722 -0.0%
32 background-clean-slru-v2 12024.410625 +0.9%
80 background-clean-slru-v2 12113.589954 -3.6%
1 buffreelistlock-reduction-v1 512.828482 -25.2%
8 buffreelistlock-reduction-v1 4765.576805 +7.7%
16 buffreelistlock-reduction-v1 8030.477792 +2.8%
24 buffreelistlock-reduction-v1 13118.481248 -1.2%
32 buffreelistlock-reduction-v1 11895.847998 -0.2%
80 buffreelistlock-reduction-v1 12015.291045 -4.3%
1 buffreelistlock-reduction-v1-freelist-ok-v2 621.960997 -9.3%
8 buffreelistlock-reduction-v1-freelist-ok-v2 4650.200642 +5.1%
16 buffreelistlock-reduction-v1-freelist-ok-v2 7999.167629 +2.4%
24 buffreelistlock-reduction-v1-freelist-ok-v2 13070.123153 -1.6%
32 buffreelistlock-reduction-v1-freelist-ok-v2 11808.986473 -0.9%
80 buffreelistlock-reduction-v1-freelist-ok-v2 12136.960028 -3.4%
1 freelist-ok-v2 629.832419 -8.2%
8 freelist-ok-v2 4800.267011 +8.5%
16 freelist-ok-v2 8018.571815 +2.7%
24 freelist-ok-v2 13122.167158 -1.2%
32 freelist-ok-v2 12004.261737 +0.7%
80 freelist-ok-v2 12188.211067 -3.0%
1
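For reference, the percentage column is simply the patched TPS relative to master on the same test: (patched / master - 1) * 100. A quick sketch reproducing one of the figures above:

```python
def percent_delta(patched_tps, master_tps):
    """Speedup/slowdown versus master, as reported in the result lists."""
    return (patched_tps / master_tps - 1.0) * 100.0

# 80 clients, REL9_1_STABLE vs master, permanent tables:
print(round(percent_delta(7197.655050, 12560.803169), 1))   # -42.7
```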
Re: [HACKERS] Inline Extension
On Mon, Jan 23, 2012 at 8:25 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: virtual directory - e.g. CREATE TABLE pg_extension_virtualdir (filename text, content text) which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as That's the thing I don't like in this approach. Maybe it's just me but the primary intention on working on extension was to make dump and restore do the right thing all by itself. Well, fair enough: there's no accounting for taste. You could make my solution work with pg_dump in a fully automated fashion if you dumped out the virtual directory contents before dumping any CREATE EXTENSION statements, but I'm not going to get up on my soapbox and say that's the world's best design, so if you don't like it, fine! I am pretty concerned that we find a design that does not involve pg_dump needing to dump out the extension contents, though: that seems to me to be missing the point of having extensions in the first place. Now if we have “inline” (SQL only) extensions, the right thing happens to be very different from when you're dealing with contrib like ones, namely I would want the script to be dumped. extension_inline_directory = '/path/to/some/writable/place' This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC. I think Cédric nailed it down upthread, proposing that we just use a PGDATA sub directory called 'pg_extension'. In fact, that would need to be a per-database sub directory. Then there's nothing to setup, nothing to abuse. Hmm, that might have something going for it. 
It seems comparatively easy to implement, and it also seems to do a pretty good job hiding the complexity under the hood where users don't have to worry about it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On Mon, Jan 23, 2012 at 4:03 AM, Florian Weimer fwei...@bfk.de wrote: * Ants Aasma: I had a run-in with this. JDBC driver versions 9.0 with the default configuration resulted in silent data corruption. The fix was easy, but not having a useful error was what really bothered me. Same for the DBD::Pg driver. In this particular case, I knew that the change was coming and could push updated Java and Perl client libraries well before the server-side change hit our internal repository, but I really don't want to have to pay attention to such details. But if we *don't* turn this on by default, then chances are very good that it will get much less use. That doesn't seem good either. If it's important enough to do it at all, then IMHO it's important enough for it to be turned on by default. We have never made any guarantee that the binary format won't change from release to release. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] basic pgbench runs with various performance-related patches
On Mon, Jan 23, 2012 at 1:53 PM, Robert Haas robertmh...@gmail.com wrote: Results are the median of three five-minute test runs checkpoint_timeout = 15min Test duration is important for tests that don't relate to pure contention reduction, which is every patch apart from XLogInsert. We've discussed that before, so not sure what value you assign to these results. Very little, is my view, so I'm a little disappointed to see this post and the associated comments. I'm very happy to see that your personal work has resulted in gains and these results are valid tests of that work, IMHO. If you only measure throughput you're only measuring half of what users care about. We've not yet seen any tests that confirm that other important issues have not been made worse. Before commenting on individual patches, it's clear that the tests you've run aren't even designed to highlight the BufFreelistLock contention that is present in different configs, so that alone is sufficient to throw most of this away. On particular patches: * background-clean-slru-v2 related very directly to reducing the response time spikes you showed us in your last set of results. Why not repeat those same tests?? * removebufmgrfreelist-v1 related to the impact of dropping tables/indexes/databases, so given the variability of the results, that at least shows it has no effect in the general case. And here are the results. For everything against master, I've also included the percentage speedup or slowdown vs. the same test run against master. Many of these numbers are likely not statistically significant, though some clearly are. with one exception: buffreelistlock-reduction-v1 crapped out during one of the test runs with the following errors That patch comes with the proviso, stated in comments: "We didn't get the lock, but read the value anyway on the assumption that reading this value is atomic." So we seem to have proved that reading it without the lock isn't safe. 
The remaining patch you tested was withdrawn and not submitted to the CF. Sigh. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
* Robert Haas: In this particular case, I knew that the change was coming and could push updated Java and Perl client libraries well before the server-side change hit our internal repository, but I really don't want to have to pay attention to such details. But if we *don't* turn this on by default, then chances are very good that it will get much less use. That doesn't seem good either. If it's important enough to do it at all, then IMHO it's important enough for it to be turned on by default. We have never made any guarantee that the binary format won't change from release to release. The problem here is libpq-style drivers which expose the binary format to the application. The Java driver doesn't do that, but the Perl driver does. (However, both transparently decode BYTEA values received in text format, which led to the compatibility issue.) If you have version negotiation and you don't expose the binary format, then all clients can use the most efficient format automatically. If I understand things correctly, this is where the JDBC driver is heading (that is, automatic use of binary format). -- Florian Weimer fwei...@bfk.de BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Re: [HACKERS] PG-Strom - A GPU optimized asynchronous executor module
2012/1/23 Simon Riggs si...@2ndquadrant.com: On Sun, Jan 22, 2012 at 3:48 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I tried to implement an FDW module that is designed to utilize GPU devices to execute qualifiers of sequential scans on foreign tables managed by this module. It was named PG-Strom, and the following wiki page gives a brief overview of this module. http://wiki.postgresql.org/wiki/PGStrom In our measurements, it achieves about 10x faster sequential scans with complex qualifiers, though of course that depends greatly on the type of workload. Very cool. Someone's been busy. I see you've introduced 3 new features here at the same time: * GPU access * column store * compiled WHERE clauses It would be useful to see if we can determine which of those gives the most benefit and whether other directions emerge. Also, the query you mention is probably the best performing query you can come up with. It looks like a GIS query, yet isn't. Would it be possible to run tests on the TPC-H suite and do a full comparison of strengths/weaknesses so we can understand the breadth of applicability of the techniques. DBT-2 is a good alternative, even though TPC-H is expensive to run. This is a very interesting line of discussion, but please can we hold off further posts about it until after the CF is over? Yep, I agree. We should handle existing patches first, then new features for v9.3. I'll get back to reviewing pgsql_fdw. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] PG-Strom - A GPU optimized asynchronous executor module
On Mon, Jan 23, 2012 at 2:49 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: Also, the query you mention is probably the best performing query you can come up with. It looks like a GIS query, yet isn't. Would it be possible to run tests on the TPC-H suite and do a full comparison of strengths/weaknesses so we can understand the breadth of applicability of the techniques. DBT-2 is a good alternative, even though TPC-H is expensive to run. DBT-2 is an OLTP test, not a DSS/DW test. I'm not interested in the full TPC-H test, just a query by query comparison of how well this stacks up. If there are other tests that are also balanced/representative, I'd like to see those also. Just so we can see the benefit envelope. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Sun, Jan 22, 2012 at 11:47 PM, Mikko Tiihonen mikko.tiiho...@nitorcreations.com wrote: * introduced a new GUC variable array_output copying the current bytea_output type, with values full (old value) and smallfixed (new default) * added documentation for the new GUC variable If this variable changes protocol-level layout and is user-settable, shouldn't it be GUC_REPORT? Now that I think about it, the same applies to bytea_output? You could say the problem does not appear if the client always accepts the server default. But how can the client know the default? If the client is required to do SHOW before it can talk to the server, then that seems to hint that those vars should be GUC_REPORT. Same story when clients are always expected to set the vars to their preferred values. Then you get clients with different settings on one server. This breaks transaction-pooling setups (pgbouncer). Again, such protocol-changing tunables should be GUC_REPORT. -- marko
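The point about GUC_REPORT can be modeled in miniature: a reported GUC causes the server to push a ParameterStatus message to the client whenever the value changes, so the client's idea of a wire-format-changing setting like array_output can never silently go stale (e.g. behind a transaction pooler). A toy simulation, not libpq:

```python
# Toy model: "reported" GUCs trigger a ParameterStatus-style push to the
# client; unreported ones leave the client's cached value stale.

class Server:
    def __init__(self):
        self.gucs = {"array_output": "full"}   # GUC name from the patch under discussion
        self.reported = set()

    def set_guc(self, name, value, client):
        self.gucs[name] = value
        if name in self.reported:
            client.on_parameter_status(name, value)  # ParameterStatus push

class Client:
    def __init__(self):
        self.known = {"array_output": "full"}

    def on_parameter_status(self, name, value):
        self.known[name] = value

server, client = Server(), Client()

# Without GUC_REPORT: some other session (a pooler peer, say) changes the
# value and our client keeps decoding with the old assumption.
server.set_guc("array_output", "smallfixed", client)
stale = client.known["array_output"] != server.gucs["array_output"]

# With GUC_REPORT the push keeps the client in sync.
server.reported.add("array_output")
server.set_guc("array_output", "smallfixed", client)
in_sync = client.known["array_output"] == server.gucs["array_output"]
```

The stale case is exactly the pgbouncer scenario: the server-side value changed under the client without any notification.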
Re: [HACKERS] patch: ALTER TABLE IF EXISTS
On Tue, Jan 3, 2012 at 7:49 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I changed the patch and now ALTER TABLE, ALTER INDEX, ALTER SEQUENCE and ALTER VIEW have an IF EXISTS clause Patch no longer applies. Please update. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Inline Extension
Robert Haas robertmh...@gmail.com writes: I am pretty concerned that we find a design that does not involve pg_dump needing to dump out the extension contents, though: that seems to me to be missing the point of having extensions in the first place. I was just trying to explain where I'm coming from, I'm not wedded to the idea though, all the more when I think that we're reaching a much better one. I think Cédric nailed it down upthread, proposing that we just use a PGDATA subdirectory called 'pg_extension'. In fact, that would need to be a per-database subdirectory. Then there's nothing to set up, nothing to abuse. Hmm, that might have something going for it. It seems comparatively easy to implement, and it also seems to do a pretty good job hiding the complexity under the hood where users don't have to worry about it. And then basebackup and pg_upgrade would just work, and for dump and restore we still need to find something not violating the POLA. I think that would mean offering a backend function that lists all files from a given extension and their content, including the control files, and a query that stores that output for only “inline” extensions. The content of the query result is formatted as a series of create extension and alter extension update (in the right order) in the dump file so that it just transparently re-creates the files for you on the new database. Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, and so you're given the responsibility to do the exact same thing yourself on the client side? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] basic pgbench runs with various performance-related patches
On Mon, Jan 23, 2012 at 9:31 AM, Simon Riggs si...@2ndquadrant.com wrote: Test duration is important for tests that don't relate to pure contention reduction, which is every patch apart from XLogInsert. Yes, I know. I already said that I was working on more tests to address other use cases. I'm very happy to see that your personal work has resulted in gains and these results are valid tests of that work, IMHO. If you only measure throughput you're only measuring half of what users care about. We've not yet seen any tests that confirm that other important issues have not been made worse. I personally think throughput is awfully important, but clearly latency matters as well, and that is why *even as we speak* I am running more tests. If there are other issues with which you are concerned besides latency and throughput, please say what they are. On particular patches * background-clean-slru-v2 related very directly to reducing the response time spikes you showed us in your last set of results. Why not repeat those same tests?? I'm working on it. Actually, I'm attempting to improve my previous test configuration by making some alterations per some of your previous suggestions. I plan to post the results of those tests once I have run them. * removebufmgrfreelist-v1 related to the impact of dropping tables/index/databases, so given the variability of the results, that at least shows it has no effect in the general case. I think it needs some tests with a larger scale factor before drawing any general conclusions, since this test, as you mentioned above, doesn't involve much buffer eviction. As it turns out, I am working on running such tests. That patch comes with the proviso, stated in comments: We didn't get the lock, but read the value anyway on the assumption that reading this value is atomic. So we seem to have proved that reading it without the lock isn't safe. I am not sure what's going on with that patch, but clearly something isn't working right. 
I don't know whether it's that or something else, but it does look like there's a bug. The remaining patch you tested was withdrawn and not submitted to the CF. Oh. Which one was that? I thought all of these were in play. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] patch: ALTER TABLE IF EXISTS
On Tue, Jan 3, 2012 at 2:49 PM, Pavel Stehule pavel.steh...@gmail.com wrote: jup, we can continue enhancing step by step. I changed the patch and now ALTER TABLE, ALTER INDEX, ALTER SEQUENCE and ALTER VIEW have an IF EXISTS clause ALTER FOREIGN TABLE should be parallel as well, I think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inline Extension
On Mon, Jan 23, 2012 at 10:04 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: And then basebackup and pg_upgrade would just work, and for dump and restore we still need to find something not violating the POLA. I think that would mean offering a backend function that list all files from a given extension and their content, including the control files, and a query that stores that output for only “inline” extensions. The content of the query result is formatted as a series of create extension and alter extension update (in the right order) in the dump file so that it just transparently re-creates the files for you on the new databse. Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy. Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, and so you're given the responsibility to do the exact same thing yourself on the client side? How about adding a new pg_dump option to suppress this part of the dump? It seems to me that there will be people who want to do that; for example, it might be that all (or some, or one) of the extensions that were installed this way on the old server are installed globally on the new server. We need some way to cope with that. Having a new pg_dump option to suppress this output is not terribly granular but maybe it'd be enough for round one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inline Extension
Robert Haas robertmh...@gmail.com writes: Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and I would have limited the dump query to only known installed extensions, right. The update scripts are still needed because with inline extensions you typically never see a 1.2 script but a 1.0, then 1.0--1.1, and then a 1.1--1.2. restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy. That could well be, yes, but what would this function do that the commands are not doing? I'm ok not to invent specific syntax to solve that problem, I just think that we should already have all we need :) Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, and so you're given the responsibility to do the exact same thing yourself on the client side? How about adding a new pg_dump option to suppress this part of the dump? Makes sense, indeed. Well, one could of course manually filter the dump object list too… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On Mon, Jan 23, 2012 at 9:36 AM, Florian Weimer fwei...@bfk.de wrote: * Robert Haas: In this particular case, I knew that the change was coming and could push updated Java and Perl client libraries well before the server-side change hit our internal repository, but I really don't want to have to pay attention to such details. But if we *don't* turn this on by default, then chances are very good that it will get much less use. That doesn't seem good either. If it's important enough to do it at all, then IMHO it's important enough for it to be turned on by default. We have never made any guarantee that the binary format won't change from release to release. The problem here are libpq-style drivers which expose the binary format to the application. The Java driver doesn't do that, but the Perl driver does. (However, both transparently decode BYTEA values received in text format, which led to the compatibility issue.) I can see where that could cause some headaches... but it seems to me that if we take that concern seriously, it brings us right back to square one. If breaking the binary format causes too much pain to actually go do it, then we shouldn't change it until we're breaking everything else anyway (i.e. new protocol version, as Tom suggested). Even if you have version negotiation, it doesn't help that much in this scenario. Drivers that know about the new protocol can theoretically negotiate upward, but if they expose the binary format to their users in turn then it's a can of worms: they then need to negotiate with their client applications to see what version the client is OK with, and then turn around and negotiate with the server to that level and not higher. That strikes me as a lot of work for a lot of people given the amount of improvement we can expect out of this one change. I think it's also worth noting that a GUC is not really a good mechanism for negotiating the protocol version. 
GUCs can be changed by the local administrator on a server-wide (or per-user, or per-database, or per-user-and-database) basis, and that's not what you want for a protocol negotiation. Every client will have to query the server version and then decide whether to try to change it, and handle errors if that fails. All of that is going to add start-up overhead to connections, which will need to make multiple round trips to get everything straightened out. I think if the only way to make this change without excessive pain is by having a good mechanism for negotiating the protocol version, then we need to defer it to a future release. Now is not the time to invent entirely new mechanisms, especially around just one example. I'm OK with breaking it and adding a GUC for backward-compatibility, but so far the other suggestions strike me as being not convincingly well-enough engineered to stand the test of time, and whatever we do here is going to be with us for quite a while. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
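The startup-overhead argument above can be made concrete with a trivial accounting sketch. The mechanism names are hypothetical labels for the two designs being contrasted; the point is only that GUC-based negotiation adds round trips that information carried by the startup exchange would not:

```python
# Rough accounting of connection-startup cost under the two designs
# discussed in this thread. Purely illustrative.

def startup_round_trips(mechanism: str) -> int:
    if mechanism == "guc-negotiation":
        trips = 1   # query the server's current setting (e.g. via SHOW)
        trips += 1  # SET the desired value, handling a possible error
        return trips
    if mechanism == "startup-exchange":
        return 0    # version info rides along with startup, no extra trips
    raise ValueError(mechanism)
```

Every client of every driver pays the extra trips on every new connection, which is why a GUC is a poor fit for protocol negotiation.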
Re: [HACKERS] Inline Extension
On Mon, Jan 23, 2012 at 10:26 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and I would have limited the dump query to only known installed extensions, right. The update scripts are still needed because with inline extensions you typically never see a 1.2 script but a 1.0 then 1.0--1.1 and then a 1.1--1.2. Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them. restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy. That could well be, yes, but what would this function do that the commands are not doing? I'm ok not to invent specific syntax to solve that problem, I just think that we should already have all we need :) I was thinking of something like pg_write_extension_file('foobar.control', 'content goes here'); Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, and so you're given the responsibility to do the exact same thing yourself on the client side? How about adding a new pg_dump option to suppress this part of the dump? Makes sense, indeed. Well one could of course manually filter the dump object list too, of course… True. I guess that's another reason why a global flag to shut it all off is probably sufficient, but I'm still in favor of having at least that much. 
I think the average user is much more likely to find pg_dump --skip-whatever than they are to understand how to do dump object filtering correctly, and even the advanced user may appreciate the shortcut on occasion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
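The hypothetical pg_write_extension_file() floated above could behave roughly like the following sketch, storing files under a per-database pg_extension subdirectory as proposed upthread. The signature, the path-validation rule, and the directory layout are all assumptions for illustration, not an actual backend API:

```python
import os
import tempfile

def pg_write_extension_file(datadir: str, dboid: int, filename: str, content: str) -> str:
    # Hypothetical sketch: reject anything that could escape the extension
    # directory, then write the file where the server would look for it.
    if os.sep in filename or filename.startswith("."):
        raise ValueError("invalid extension file name: %r" % filename)
    extdir = os.path.join(datadir, "pg_extension", str(dboid))
    os.makedirs(extdir, exist_ok=True)
    path = os.path.join(extdir, filename)
    with open(path, "w") as f:
        f.write(content)
    return path

# Demo against a throwaway directory standing in for PGDATA; the OID is
# made up.
datadir = tempfile.mkdtemp()
path = pg_write_extension_file(datadir, 16384, "foobar.control", "comment = 'demo'\n")
```

pg_dump would then only need to emit one such call per file, followed by the ordinary CREATE EXTENSION.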
Re: [HACKERS] basic pgbench runs with various performance-related patches
On Mon, Jan 23, 2012 at 3:09 PM, Robert Haas robertmh...@gmail.com wrote: I'm working on it. Good, thanks for the update. The remaining patch you tested was withdrawn and not submitted to the CF. Oh. Which one was that? I thought all of these were in play. freelist_ok was a prototype for testing/discussion, which contained an arguable heuristic. I guess that means it's also in play, but I wasn't thinking we'd be able to assemble clear evidence for 9.2. The other patches have clearer and more specific roles without heuristics (mostly), so are at least viable for 9.2, though still requiring agreement. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Inline Extension
Robert Haas robertmh...@gmail.com writes: Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them. Ok, but then, what about .so files? Wouldn't it make sense to be able to ship also the executable modules needed, and if not, why not? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] basic pgbench runs with various performance-related patches
On Mon, Jan 23, 2012 at 10:35 AM, Simon Riggs si...@2ndquadrant.com wrote: freelist_ok was a prototype for testing/discussion, which contained an arguable heuristic. I guess that means its also in play, but I wasn't thinking we'd be able to assemble clear evidence for 9.2. OK, that one is still in the test runs I am doing right now, but I will drop it from future batches to save time and energy that can be better spent on things we have a chance of getting done for 9.2. The other patches have clearer and specific roles without heuristics (mostly), so are at least viable for 9.2, though still requiring agreement. I think we must also drop removebufmgrfreelist-v1 from consideration, unless you want to go over it some more and try to figure out a fix for whatever caused it to crap out on these tests. IIUC, that corresponds to this CommitFest entry: https://commitfest.postgresql.org/action/patch_view?id=744 Whatever is wrong must be something that happens pretty darn infrequently, since it only happened on one test run out of 54, which also means that if you do want to pursue that one we'll have to go over it pretty darn carefully to make sure that we've fixed that issue and don't have any others. I have to admit my personal preference is for postponing that one to 9.3 anyway, since there are some related issues I'd like to experiment with. But let me know how you'd like to proceed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inline Extension
On Mon, Jan 23, 2012 at 10:46 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them. Ok, but then, what about .so files? Wouldn't it make sense to be able to ship also the executable modules needed, and if not, why not? Sure, that would be as useful as any other part of this feature. We'd have to think carefully about how to make it secure, though: obviously it's no good to let a non-superuser database owner install compiled C code that gets loaded into the backend! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Mon, Jan 23, 2012 at 9:59 AM, Marko Kreen mark...@gmail.com wrote: On Sun, Jan 22, 2012 at 11:47 PM, Mikko Tiihonen mikko.tiiho...@nitorcreations.com wrote: * introduced a new GUC variable array_output copying the current bytea_output type, with values full (old value) and smallfixed (new default) * added documentation for the new GUC variable If this variable changes protocol-level layout and is user-settable, shouldn't it be GUC_REPORT? Now that I think about it, same applies to bytea_output? You could say the problem does not appear if the clients always accepts server default. But how can the client know the default? If the client is required to do SHOW before it can talk to server then that seems to hint those vars should be GUC_REPORT. Same story when clients are always expected to set the vars to their preferred values. Then you get clients with different settings on one server. This breaks transaction-pooling setups (pgbouncer). Again, such protocol-changing tunables should be GUC_REPORT. Probably so. But I think we need not introduce quite so many new threads on this patch. This is, I think, at least thread #4, and that's making the discussion hard to follow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
On Mon, Jan 23, 2012 at 01:21:20AM +0400, Alexander Korotkov wrote: Updated patch is attached. I've updated the comment of mcelem_array_contained_selec with a more detailed description of the probability distribution assumption. Also, I found that the rest behaviour is better described by a Poisson distribution; relevant changes were made. Thanks. That makes more of the math clear to me. I do not follow all of it, but I feel that the comments now have enough information that I could go about doing so. + /* Take care about events with low probabilities. */ + if (rest > DEFAULT_CONTAIN_SEL) + { Why the change from rest > 0 to this in the latest version? + /* emit some statistics for debug purposes */ + elog(DEBUG3, "array: target # mces = %d, bucket width = %d, " + "# elements = %llu, hashtable size = %d, usable entries = %d", + num_mcelem, bucket_width, element_no, i, track_len); That should be UINT64_FMT. (I introduced that error in v0.10.) I've attached a new version that includes the UINT64_FMT fix, some edits of your newest comments, and a rerun of pgindent on the new files. I see no other issues precluding commit, so I am marking the patch Ready for Committer. If I made any of the comments worse, please post another update. Thanks, nm Attachment: arrayanalyze-0.13.patch.gz
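For readers following along, the Poisson assumption mentioned above can be illustrated in a few lines: if the number of occurrences of an element in an array is modeled as Poisson with mean equal to its expected frequency, then the probability that the element appears at least once is 1 - e^(-freq). This sketches the modeling idea only, not the patch's actual selectivity code:

```python
import math

def prob_present(freq: float) -> float:
    """P(element occurs >= 1 time) under a Poisson(freq) count model."""
    return 1.0 - math.exp(-freq)

# For rare elements this is very close to the raw frequency itself...
low = prob_present(0.01)
# ...while it stays properly bounded below 1 for frequent ones.
high = prob_present(5.0)
```

The bounded-below-1 behavior is what makes the Poisson model better behaved than naively summing frequencies when estimating containment selectivity.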
Re: Removing freelist (was Re: [HACKERS] Should I implement DROP INDEX CONCURRENTLY?)
Robert Haas robertmh...@gmail.com writes: On Mon, Jan 23, 2012 at 12:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: The expensive part of what we do while holding BufFreelistLock is, I think, iterating through buffers taking and releasing a spinlock on each one (!). Yeah ... spinlocks that, by definition, will be uncontested. What makes you think that they are uncontested? Ah, never mind. I was thinking that we'd only be touching buffers that were *on* the freelist, but of course this is incorrect. The real problem there is that BufFreelistLock is also used to protect the clock sweep pointer. I think basically we gotta find a way to allow multiple backends to run clock sweeps concurrently. Or else fix things so that the freelist never (well, hardly ever) runs dry. regards, tom lane
Re: Removing freelist (was Re: [HACKERS] Should I implement DROP INDEX CONCURRENTLY?)
On Mon, Jan 23, 2012 at 11:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jan 23, 2012 at 12:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: The expensive part of what we do while holding BufFreelistLock is, I think, iterating through buffers taking and releasing a spinlock on each one (!). Yeah ... spinlocks that, by definition, will be uncontested. What makes you think that they are uncontested? Ah, never mind. I was thinking that we'd only be touching buffers that were *on* the freelist, but of course this is incorrect. The real problem there is that BufFreelistLock is also used to protect the clock sweep pointer. I think basically we gotta find a way to allow multiple backends to run clock sweeps concurrently. Or else fix things so that the freelist never (well, hardly ever) runs dry. I'd come to the same conclusion myself. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
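The clock sweep under discussion can be modeled in a few lines; the single shared next-victim pointer is exactly the serialization point that BufFreelistLock protects in the real bufmgr code, which is why concurrent sweeps are hard. A toy model, not the actual implementation:

```python
# Toy clock-sweep eviction: walk the buffer ring from a shared pointer,
# decrementing usage counts, until a zero-usage buffer is found.

class BufferRing:
    def __init__(self, usage_counts):
        self.usage = list(usage_counts)
        self.next_victim = 0  # the shared pointer BufFreelistLock guards

    def clock_sweep(self):
        while True:
            i = self.next_victim
            self.next_victim = (self.next_victim + 1) % len(self.usage)
            if self.usage[i] == 0:
                return i          # found an evictable buffer
            self.usage[i] -= 1    # give it another trip around the clock

ring = BufferRing([2, 1, 0, 3])
victim = ring.clock_sweep()
```

Allowing multiple backends to sweep concurrently amounts to making the pointer advance atomic (fetch-and-add style) so each sweeper claims distinct buffers without holding one big lock, which is the direction the thread is pointing at.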
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On Mon, Jan 23, 2012 at 10:34:12AM -0500, Robert Haas wrote: On Mon, Jan 23, 2012 at 9:36 AM, Florian Weimer fwei...@bfk.de wrote: * Robert Haas: In this particular case, I knew that the change was coming and could push updated Java and Perl client libraries well before the server-side change hit our internal repository, but I really don't want to have to pay attention to such details. But if we *don't* turn this on by default, then chances are very good that it will get much less use. That doesn't seem good either. If it's important enough to do it at all, then IMHO it's important enough for it to be turned on by default. We have never made any guarantee that the binary format won't change from release to release. The problem here is libpq-style drivers which expose the binary format to the application. The Java driver doesn't do that, but the Perl driver does. (However, both transparently decode BYTEA values received in text format, which led to the compatibility issue.) I can see where that could cause some headaches... but it seems to me that if we take that concern seriously, it brings us right back to square one. If breaking the binary format causes too much pain to actually go do it, then we shouldn't change it until we're breaking everything else anyway (i.e. new protocol version, as Tom suggested). As I said upthread, and you appeared to agree, the protocol is independent of individual data type send/recv formats. Even if we were already adding protocol v4 to PostgreSQL 9.2, having array_send() change its behavior in response to the active protocol version would be wrong.
Re: [HACKERS] Inline Extension
Robert Haas robertmh...@gmail.com writes: Ok, but then, what about .so files? Wouldn't it make sense to be able to ship also the executable modules needed, and if not, why not? Sure, that would be as useful as any other part of this feature. We'd have to think carefully about how to make it secure, though: obviously it's no good to let a non-superuser database owner install compiled C code that gets loaded into the backend! The big problem is that .so files are installed in directories where the system postgres user usually is not granted permission to write; that's root's business. It has already been asked before about allowing PostgreSQL to load .so files from a non-root location, and I think that to be consistent with your view of the world it would be good to add that feature. Then there's no such beast as an “inline” extension so much as a way to install an extension from the protocol, without file system level access to the production server. This would need to be superuser only, of course. That opens up the possibility to ship the modules to any standby server too: the situation now is unfortunate in that running create extension hstore on the primary and then using it in some indexes means the standby has no means to use those indexes until you fix it and install the same extension files there. The module itself could be accepted as a bytea value and written to the right place, where the server knows to load it from. Now you can dump/restore any extension fully, and we can even ship any extension in the WAL stream (a new message is needed, though). The only remaining issue would be the default policy as far as including or avoiding an extension in the dump is concerned, and I would be ok with a default of including none as of now and a pg_dump switch that you can repeat to include whichever extensions you wish. The effect is to transport the files and install them in the expected place on the target server where you restore.
And now we really process all extensions the same and you can update inline an extension that you installed the 9.1 way, and vice versa. Don't let me speak about extension distribution facilities just now :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
Robert Haas robertmh...@gmail.com writes: On Mon, Jan 23, 2012 at 9:59 AM, Marko Kreen mark...@gmail.com wrote: Now that I think about it, same applies to bytea_output? Probably so. But I think we need not introduce quite so many new threads on this patch. This is, I think, at least thread #4, and that's making the discussion hard to follow. Well, this is independent of the proposed patch, so I think a separate thread is okay. The question is shouldn't bytea_output be marked GUC_REPORT? I think that probably it should be, though I wonder whether we're not too late. Clients relying on it to be transmitted are not going to work with existing 9.0 or 9.1 releases; so maybe changing it to be reported going forward would just make things worse. regards, tom lane
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On Mon, Jan 23, 2012 at 5:34 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Jan 23, 2012 at 9:36 AM, Florian Weimer fwei...@bfk.de wrote: * Robert Haas: In this particular case, I knew that the change was coming and could push updated Java and Perl client libraries well before the server-side change hit our internal repository, but I really don't want to have to pay attention to such details. But if we *don't* turn this on by default, then chances are very good that it will get much less use. That doesn't seem good either. If it's important enough to do it at all, then IMHO it's important enough for it to be turned on by default. We have never made any guarantee that the binary format won't change from release to release. The problem here are libpq-style drivers which expose the binary format to the application. The Java driver doesn't do that, but the Perl driver does. (However, both transparently decode BYTEA values received in text format, which led to the compatibility issue.) I can see where that could cause some headaches... but it seems to me that if we take that concern seriously, it brings us right back to square one. If breaking the binary format causes too much pain to actually go do it, then we shouldn't change it until we're breaking everything else anyway (i.e. new protocol version, as Tom suggested). My suggestion - please avoid per-session-protocol. Either something is Postgres version-dependent or it can be toggled/tracked per request. That means any data can either be passed through, or you need to understand formats of Postgres version X.Y. This kind of hints at per-request gimme-formats-from-version-x.y flag for ExecuteV4 packet. Or some equivalent of it. Anything that cannot be processed without tracking per-session state over whole stack (poolers/client frameworks) is major pain to maintain. 
-- marko
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Mon, Jan 23, 2012 at 11:20:52AM -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jan 23, 2012 at 9:59 AM, Marko Kreen mark...@gmail.com wrote: Now that I think about it, same applies to bytea_output? Probably so. But I think we need not introduce quite so many new threads on this patch. This is, I think, at least thread #4, and that's making the discussion hard to follow. Well, this is independent of the proposed patch, so I think a separate thread is okay. The question is shouldn't bytea_output be marked GUC_REPORT? I think that probably it should be, though I wonder whether we're not too late. Clients relying on it to be transmitted are not going to work with existing 9.0 or 9.1 releases; so maybe changing it to be reported going forward would just make things worse.

Well, in a complex setup it can change under you at will, but as clients can process the data without knowing the server state, maybe it's not a big problem. (Unless there are old clients in the mix...) Perhaps we can leave it as-is? But this leaves the question of future policy for data format changes in the protocol. Note I'm talking about both text and binary formats here together, although we could have different policies for them. Also note that any kind of per-session flag is basically a GUC.

Question 1 - how does the client know which format the data is in?
1) new format is detectable from a lossy GUC
2) new format is detectable from a GUC_REPORT
3) new format is detectable from the Postgres version
4) new format was requested in the query (V4 proto)
5) new format is detectable from the data (\x in bytea)

1. obviously does not work. 2. works, but requires changes across all infrastructure. 3. works and is simple, but painful. 4. is good, but in the future. 5. is good, now.

Question 2 - how does the client request the new format?
1) the new Postgres version forces it
2) GUC_REPORT + non-detectable data
3) lossy GUC + autodetectable data
4) GUC_REPORT + autodetectable data
5) per-request data (V4 proto)

1. is painful. 2. is painful - all infra components need to know about the GUC. 3 and 4 are both ugly and non-maintainable in the long term; the only difference is that with 3) the infrastructure can give slight guarantees that it does not change under the client. 5. seems good...

Btw, it does not seems that per-request metainfo change requires major version. It just client can send extra metainfo packet before bind+execute, if it knows server version is good enough. For older servers it can simply skip the extra info. [Oh yeah, that requires the data format to be autodetectable, always.] My conclusions: 1. Any change in data format should be compatible with old data. IOW - if the client requested the new data format, it should always accept the old format too. 2. Can we postpone minor data format changes on the wire until there is a proper way for clients to request on-the-wire formats? -- marko
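Option 5 (format detectable from the data) works for bytea because the hex output format introduced in 9.0 always begins with \x, while the older escape format doubles every backslash and therefore can never start that way. A sketch of the detection, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Guess which bytea_output setting produced a text-format value.
 * Hex output (PostgreSQL 9.0+) always begins with "\x"; escape
 * output doubles backslashes, so it can never start with "\x". */
static bool
bytea_text_is_hex(const char *val)
{
    return strncmp(val, "\\x", 2) == 0;
}
```

This is why a client can cope with bytea_output changing under it without any GUC_REPORT at all.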
Re: [HACKERS] Collect frequency statistics for arrays
On Mon, Jan 23, 2012 at 7:58 PM, Noah Misch n...@leadboat.com wrote: + /* Take care about events with low probabilities. */ + if (rest > DEFAULT_CONTAIN_SEL) + { Why the change from rest > 0 to this in the latest version? Earlier, the addition of the rest distribution required O(m) time. Now there is a more accurate and proven estimate, but it takes O(m^2) time. It doesn't make the overall asymptotic time worse, but it is significant. That's why I decided to skip it for low values of rest, which don't change the distribution significantly. + /* emit some statistics for debug purposes */ + elog(DEBUG3, "array: target # mces = %d, bucket width = %d, # elements = %llu, hashtable size = %d, usable entries = %d", + num_mcelem, bucket_width, element_no, i, track_len); That should be UINT64_FMT. (I introduced that error in v0.10.) I've attached a new version that includes the UINT64_FMT fix, some edits of your newest comments, and a rerun of pgindent on the new files. I see no other issues precluding commit, so I am marking the patch Ready for Committer. Great! If I made any of the comments worse, please post another update. The changes look reasonable to me. Thanks! -- With best regards, Alexander Korotkov.
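The cutoff Alexander describes can be sketched as a simple guard. DEFAULT_CONTAIN_SEL is assumed here to be 0.005, its usual value in selfuncs.h; the helper name is illustrative:

```c
#include <assert.h>
#include <stdbool.h>

#define DEFAULT_CONTAIN_SEL 0.005   /* assumed value, per selfuncs.h */

/* Only pay the O(m^2) cost of redistributing the leftover ("rest")
 * probability mass when it is large enough to change the resulting
 * distribution noticeably; tiny rest values are simply skipped. */
static bool
worth_redistributing(double rest)
{
    return rest > DEFAULT_CONTAIN_SEL;
}
```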
Re: [HACKERS] Multithread Query Planner
On Fri, Jan 13, 2012 at 2:29 PM, Christopher Browne cbbro...@gmail.com wrote: On Fri, Jan 13, 2012 at 3:14 PM, Frederico zepf...@gmail.com wrote: Hi folks. Is there any restriction in create and start threads inside Postgres? I'm trying to develop a multithread planner, and some times is raised a exception of access memory. I'm debugging the code to see if is a bug in the planner, but until now, I still not found. I tried to use the same memory context of root process and create a new context to each new thread, but doesn't worked. Any tips? Yes, don't try to use threads. http://wiki.postgresql.org/wiki/Developer_FAQ#Why_don.27t_you_use_threads.2C_raw_devices.2C_async-I.2FO.2C_.3Cinsert_your_favorite_wizz-bang_feature_here.3E.3F ... threads are not currently used instead of multiple processes for backends because: Yes, but the OP is proposing to use multiple threads inside the forked execution process. That's a completely different beast. Many other databases support parallel execution of a single query and it might very well be better/easier to do that with threads. merlin
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
Marko Kreen mark...@gmail.com writes: [ bytea_output doesn't need to be GUC_REPORT because format is autodetectable ] Fair enough. Anyway we're really about two years too late to revisit that. Btw, it does not seems that per-request metainfo change requires major version. It just client can send extra metainfo packet before bind+execute, if it knows server version is good enough. That is nonsense. You're changing the protocol, and then saying that clients should consult the server version instead of the protocol version to know what to do. 2. Can we postpone minor data format changes on the wire until there is proper way for clients to request on-the-wire formats? I think that people are coming around to that position, ie, we need a well-engineered solution to the versioning problem *first*, and should not accept incompatible minor improvements until we have that. regards, tom lane
Re: [HACKERS] Measuring relation free space
Excerpts from Noah Misch's message of vie ene 20 22:33:30 -0300 2012: On Fri, Jan 20, 2012 at 07:03:22PM -0500, Jaime Casanova wrote: On Wed, Jan 18, 2012 at 7:01 PM, Noah Misch n...@leadboat.com wrote: On Wed, Jan 18, 2012 at 09:46:20AM -0500, Jaime Casanova wrote: ignoring all non-leaf pages still gives a considerable difference between pgstattuple and relation_free_space() pgstattuple() counts the single B-tree meta page as always-full, while relation_free_space() skips it for all purposes. For tiny indexes, that can shift the percentage dramatically. ok, i will reformulate the question. why is it fine to ignore non-leaf pages but not fine to ignore the meta page? pgstattuple() figures the free_percent by adding up all space available to hold tuples and dividing that by the simple size of the relation. Non-leaf pages and the meta page get identical treatment: both never hold tuples, so they do not contribute to the free space. Hm. Leaf pages hold as many tuples as non-leaf pages, no? I mean for each page element there's a value and a CTID. In non-leaf pages those CTIDs point to other index pages, one level down the tree; in leaf pages they point to the heap. The metapage is special in that it is not used to store any user data, just a pointer to the root page. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
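The accounting Noah describes can be sketched numerically: every page contributes to the relation size, but only tuple-holding pages contribute free space, so on a tiny index the always-full meta page drags the percentage down sharply. Types and names below are illustrative, not pgstattuple's:

```c
#include <assert.h>

#define BLCKSZ 8192

/* Mimic the free_percent accounting described above: non-leaf and
 * meta pages add to the relation size but contribute no free space. */
typedef struct
{
    int    total_pages;   /* all pages, including meta and non-leaf */
    double leaf_free;     /* bytes free on tuple-holding pages only */
} IndexCounts;

static double
free_percent(const IndexCounts *c)
{
    double table_len = (double) c->total_pages * BLCKSZ;

    return table_len > 0 ? 100.0 * c->leaf_free / table_len : 0.0;
}
```

For a four-page index with one page's worth of free space, this reports 25%; skipping the meta page entirely, as relation_free_space() does, would divide by one page fewer and report a visibly higher figure.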
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Jan 23, 2012, at 2:49 PM, Tom Lane wrote: Marko Kreen mark...@gmail.com writes: [ bytea_output doesn't need to be GUC_REPORT because format is autodetectable ] Fair enough. Anyway we're really about two years too late to revisit that. Btw, it does not seems that per-request metainfo change requires major version. It just client can send extra metainfo packet before bind+execute, if it knows server version is good enough. That is nonsense. You're changing the protocol, and then saying that clients should consult the server version instead of the protocol version to know what to do. 2. Can we postpone minor data format changes on the wire until there is proper way for clients to request on-the-wire formats? I think that people are coming around to that position, ie, we need a well-engineered solution to the versioning problem *first*, and should not accept incompatible minor improvements until we have that. One simple way clients could detect the binary encoding at startup would be to pass known test parameters and match against the returned values. If the client cannot match the response, then it should choose the text representation. Alternatively, the 16-bit int in the Bind and RowDescription messages could be incremented to indicate a new format and then clients can specify the highest version of the binary format which they support. Cheers, M
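A.M.'s second idea amounts to a version handshake on the existing int16 format field: the client names the highest binary-format version it understands and effectively gets the minimum of that and what the server offers. A sketch under that assumption (the names are illustrative; no such negotiation exists in protocol v3):

```c
#include <assert.h>

/* Treat the per-column int16 format code as a version number and
 * settle on the highest format both sides understand. */
static int
negotiate_format(int client_max, int server_max)
{
    return client_max < server_max ? client_max : server_max;
}
```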
Re: [HACKERS] JSON for PG 9.2
On Sun, 2012-01-22 at 11:43 -0500, Andrew Dunstan wrote: Actually, given recent discussion I think that test should just be removed from json.c. We don't actually have any test that the code point is valid (e.g. that it doesn't refer to an unallocated code point). We don't do that elsewhere either - the unicode_to_utf8() function the scanner uses to turn \u escapes into utf8 doesn't look for unallocated code points. I'm not sure how much other validation we should do - for example on correct use of surrogate pairs. We do check the correctness of surrogate pairs elsewhere. Search for surrogate in scan.l; should be easy to copy.
Re: [HACKERS] Client Messages
On Wed, Jan 18, 2012 at 9:19 AM, Jim Mlodgenski jimm...@gmail.com wrote: On Wed, Jan 18, 2012 at 3:08 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 18.01.2012 07:49, Fujii Masao wrote: On Fri, Jan 6, 2012 at 1:38 AM, Jim Mlodgenskijimm...@gmail.com wrote: I have a need to send banner messages to a psql client that I can set on the server and will be displayed on any psql client that connects to the database. This would be mostly used as an additional indicator of which database you are connecting to, but could also be used by people to force their users to see a security message when connecting to the database. The attached patch will allow you to execute ALTER DATABASE postgres SET client_message=E'\nBEWARE: You are connecting to a production database. If you do anything to\n bring this server down, you will be destroyed by your supreme overlord.\n\n'; And then when you connect to psql, you will see: [e3@workstation bin]$ ./psql -U user1 postgres psql (9.2devel) BEWARE: You are connecting to a production database. If you do anything to bring this server down, you will be destroyed by your supreme overlord. Type help for help. postgres= Any feedback is welcome. Adding a new GUC parameter only for the purpose of warning psql users seems overkill to me. Basically we try to reduce the number of GUC parameters to make configuration easier for a user, so I don't think that it's a good idea to add a new GUC for such a small benefit. It seems quite useful to me... Instead, how about using the .psqlrc file and writing a warning message in it by using the \echo command? That's not the same thing at all. Each client would need to put the warning in that file, and you'd get it regardless of the database you connect to. Anyway, I found one problem in the patch. The patch defines client_message as a PGC_USERSET parameter, which means that any psql user can falsify a warning message, e.g., by setting the environment variable PGOPTIONS to -c client_message=hoge.
This seems to be something to avoid from a security point of view. I don't think that's a problem, it's just a free-form message to display. But it also doesn't seem very useful to have it PGC_USERSET: if it's only displayed at connect time, there's no point in changing it after connecting. Should we make it PGC_BACKEND? The only security problem that I can think of is a malicious server (man-in-the-middle perhaps), that sends a banner that confuses Docs for PQparameterStatus() need adjustment, now that client_message is also one of the settings automatically reported to the client. I'll add the docs for that. The placement of the banner in psql looks currently like this: $ psql postgres psql (9.2devel) Hello world! Type help for help. or postgres=# \c postgres Hello world! You are now connected to database postgres as user heikki. Are we happy with that? I think it would be better to print the banner just before the prompt: I like that better. I'll make that change as well. Here is the revised patch based on the feedback. psql (9.2devel) Type help for help. Hello world! postgres=# \c postgres You are now connected to database postgres as user heikki. Hello world! postgres=# Should we prefix the banner with something that makes it clear that it's a message coming from the server? Something like: I don't think the default prefix adds much for the user. If the administrator wants to let the user know that it's from the server, he can add it to the message. psql (9.2devel) Type help for help. Notice from server: Hello world! postgres=# \c postgres You are now connected to database postgres as user heikki. Notice from server: Hello world!
postgres=# -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e55b503..04bc671 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5324,6 +5324,19 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
      </listitem>
     </varlistentry>
+    <varlistentry id="guc-client-message" xreflabel="client_message">
+     <term><varname>client_message</varname> (<type>string</type>)</term>
+     <indexterm>
+      <primary><varname>client_message</varname> configuration parameter</primary>
+     </indexterm>
+     <listitem>
+      <para>
+       The <varname>client_message</varname> can be any string that will be
+       displayed to the user in the banner of psql.
+      </para>
+     </listitem>
+    </varlistentry>
+
    </variablelist>
   </sect2>
  </sect1>
diff --git
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On Mon, Jan 23, 2012 at 11:15 AM, Noah Misch n...@leadboat.com wrote: As I said upthread, and you appeared to agree, the protocol is independent of individual data type send/recv formats. Even if we were already adding protocol v4 to PostgreSQL 9.2, having array_send() change its behavior in response to the active protocol version would be wrong. Sure, it's not directly related to the active protocol version, but my point is that we need to decide whether we need an autonegotiation mechanism of some kind, or not. We can reasonably decide that: 1. It's OK to change the binary format incompatibly, and clients must be prepared to deal with that, possibly assisted by a backward-compatibility GUC. -or else- 2. It's not OK to change the binary format incompatibly, and therefore we need some kind of negotiation mechanism to make sure that we give the new and improved format only to clients that can cope with it. Not being responsible for the maintenance of any PostgreSQL drivers whatsoever, I don't have a strong feeling about which of these is the case, and I'd like us to hear from the people who do. What I do think is that we can't look at a GUC (however named) as a poor man's replacement for #2. It's not gonna work, or at least not very well. If the default is off, then clients have to go through a round-trip to turn it on, which means that every client will have to decide whether to pay the price of turning it on (and possibly not recoup the investment) or whether to leave it off (and possibly get hosed if many large arrays that would have met the criteria for the optimization are transferred). Furthermore, if we turn it on by default, drivers and applications that haven't been updated will deliver corrupted results. None of that sounds very good to me.
If there are enough dependencies on the details of the binary format that we can't afford to just change it, then we'd better have a cheap and reliable way for clients to negotiate upward - and a GUC is not going to give us that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Mon, Jan 23, 2012 at 2:00 PM, A.M. age...@themactionfaction.com wrote: One simple way clients could detect the binary encoding at startup would be to pass known test parameters and match against the returned values. If the client cannot match the response, then it should choose the text representation. Alternatively, the 16-bit int in the Bind and RowDescription messages could be incremented to indicate a new format and then clients can specify the highest version of the binary format which they support. Prefer the version. But why send this over and over with each bind? Wouldn't you negotiate that when connecting? Most likely, optionally, doing as much as you can from the server version? Personally I'm not really enthusiastic about a solution that adds a non-avoidable penalty to all queries. Also, a small nit: this problem is not specific to binary formats. Text formats can and do change, albeit rarely, with predictable headaches for the client. I see no reason to deal with text/binary differently. The only difference between text/binary wire formats in my eyes are that the text formats are documented. merlin
Re: [HACKERS] PL/pgSQL return value in after triggers
Hello Peter, I checked the code, and I don't think this is good. A design with an optional NULL leads to inconsistent syntax. RETURN (OLD, NEW, NULL, /* nothing */) is not consistent. But my main argument is the non-intuitive behavior of BEFORE triggers after this change. When somebody writes a BEFORE trigger function like: BEGIN RAISE NOTICE '%', NEW.x; RETURN; END; they don't expect that all rows will be lost. The preferred default return value for a BEFORE INSERT or UPDATE trigger should be NEW, and for a DELETE trigger it should be OLD - not NULL. And because we cannot distinguish between BEFORE and AFTER triggers in the parser, I propose not changing the current behavior. The current behavior is not too friendly - but it is consistent with simple rules. Regards Pavel 2012/1/2 Peter Eisentraut pete...@gmx.net: On mån, 2011-02-28 at 19:07 +0200, Peter Eisentraut wrote: PL/pgSQL trigger functions currently require a value to be returned, even though that value is not used for anything in case of a trigger fired AFTER. I was wondering if we could relax that. It would make things a bit more robust and produce clearer PL/pgSQL code. The specific case I'm concerned about is that a trigger function could accidentally be run in a BEFORE trigger even though it was not meant for that. It is common practice that trigger functions for AFTER triggers return NULL, which would have unpleasant effects if used in a BEFORE trigger. I think it is very uncommon to have the same function usable for BEFORE and AFTER triggers, so it would be valuable to have coding support specifically for AFTER triggers. We could just allow RETURN without argument, or perhaps no RETURN at all. Here is a patch for that. One thing that I'm concerned about with this is that it treats a plain RETURN in a BEFORE trigger as RETURN NULL, whereas arguably it should be an error. I haven't found a good way to handle that yet, but I'll keep looking.
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On Mon, Jan 23, 2012 at 3:06 PM, Robert Haas robertmh...@gmail.com wrote: Not being responsible from the maintenance of any PostgreSQL drivers whatsoever, I don't have a strong feeling about which of these is the case, and I'd like us to hear from the people who do. I'm just gonna come right out and say that GUC-based solutions are not the way -- bytea encoding change is a perfect example of that. IMSNHO, there's only two plausible paths ahead: 1) document the binary formats and continue with 'go at your own risk' 2) full auto-negotiation for all wire formats (text and binary). merlin
Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On Mon, Jan 23, 2012 at 11:06 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Jan 23, 2012 at 11:15 AM, Noah Misch n...@leadboat.com wrote: As I said upthread, and you appeared to agree, the protocol is independent of individual data type send/recv formats. Even if we were already adding protocol v4 to PostgreSQL 9.2, having array_send() change its behavior in response to the active protocol version would be wrong. Sure, it's not directly related to the active protocol version, but my point is that we need to decide whether we need an autonegotiation mechanism or some kind, or not. We can reasonably decide that: 1. It's OK to change the binary format incompatibly, and clients must be prepared to deal with that, possibly assisted by a backward-compatibility GUC. -or else- 2. It's not OK to change the binary format incompatibility, and therefore we need some kind of negotiation mechanism to make sure that we give the new and improved format only to clients that can cope with it. Not being responsible from the maintenance of any PostgreSQL drivers whatsoever, I don't have a strong feeling about which of these is the case, and I'd like us to hear from the people who do. What I do think is that we can't look at a GUC (however named) as a poor man's replacement for #2. It's not gonna work, or at least not very well. If the default is off, then clients have to go through a round-trip to turn it on, which means that every client will have to decide whether to pay the price of turning it on (and possibly not recoup the investment) or whether to leave it off (and possibly get hosed if many large arrays that would have met the criteria for the optimization are transferred). Furthermore, if we turn it on by default, drivers and applications that haven't been updated will deliver corrupted results. None of that sounds very good to me. 
If there are enough dependencies on the details of the binary format that we can't afford to just change it, then we'd better have a cheap and reliable way for clients to negotiate upward - and a GUC is not going to give us that. Trying to solve it with startup-time negotiation, or some GUC, is a dead end, in the sense that it will actively discourage any kind of pass-through protocol processing. If a simple protocol processor (~pgbouncer) needs to know about some GUC and tune it on-the-fly, it's not a payload feature, it's a protocol feature. Instead this should be solved by extending the per-query text/bin flag to include version info and maybe also type groups. Some way of saying numerics:9.0-bin, the-rest:8.4-text. The groups would also solve the problem of there being no way to safely turn on binary formats for result columns. The flags could be text (~http accept), or maybe integers for more efficiency (group code: group format version). -- marko
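Marko's "numerics:9.0-bin, the-rest:8.4-text" flags could be parsed along these lines. The syntax is purely illustrative - nothing like it exists in the v3 protocol - and all names below are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Parsed form of one hypothetical per-query format flag,
 * e.g. "numerics:9.0-bin" or "the-rest:8.4-text". */
typedef struct
{
    char group[32];   /* type group, e.g. "numerics" */
    int  major;       /* server version whose format is requested */
    int  minor;
    bool binary;      /* -bin vs -text */
} FormatFlag;

/* Parse a single flag; returns false on malformed input. */
static bool
parse_format_flag(const char *spec, FormatFlag *out)
{
    char mode[8];

    if (sscanf(spec, "%31[^:]:%d.%d-%7s",
               out->group, &out->major, &out->minor, mode) != 4)
        return false;
    out->binary = strcmp(mode, "bin") == 0;
    return true;
}
```

A pooler could pass such flags through untouched, which is exactly the property Marko wants: the format request travels with the query, not with session state.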
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Jan 23, 2012, at 4:45 PM, Merlin Moncure wrote: On Mon, Jan 23, 2012 at 2:00 PM, A.M. age...@themactionfaction.com wrote: One simple way clients could detect the binary encoding at startup would be to pass known test parameters and match against the returned values. If the client cannot match the response, then it should choose the text representation. Alternatively, the 16-bit int in the Bind and RowDescription messages could be incremented to indicate a new format, and then clients can specify the highest version of the binary format which they support. Prefer the version. But why send this over and over with each bind? Wouldn't you negotiate that when connecting? Most likely, optionally, doing as much as you can from the server version? Personally I'm not really enthusiastic about a solution that adds an unavoidable penalty to all queries. Also, a small nit: this problem is not specific to binary formats. Text formats can and do change, albeit rarely, with predictable headaches for the client. I see no reason to deal with text/binary differently. The only difference between text/binary wire formats in my eyes is that the text formats are documented. merlin In terms of backwards compatibility (to support the widest range of clients), wouldn't it make sense to freeze each format option? That way, an updated text version could also assume a new int16 format identifier. The client would simply pass its preferred format. This could also allow for multiple in-flight formats; for example, if a client anticipates a large in-bound bytea column, it could specify format X, which indicates the server should gzip the result before sending. That same format may not be preferable on a different request. Cheers, M
Re: [HACKERS] JSON for PG 9.2
On Sun, Jan 15, 2012 at 10:08 AM, Andrew Dunstan and...@dunslane.net wrote: Here's an update that adds row_to_json, plus a bit more cleanup. why not call all these functions 'to_json' and overload them? merlin
Re: [HACKERS] JSON for PG 9.2
On 01/23/2012 05:21 PM, Merlin Moncure wrote: On Sun, Jan 15, 2012 at 10:08 AM, Andrew Dunstan and...@dunslane.net wrote: Here's an update that adds row_to_json, plus a bit more cleanup. why not call all these functions 'to_json' and overload them? I don't honestly feel that advances clarity much. And we might want to overload each at some stage with options that are specific to the datum type. We have various foo_to_xml() functions now. cheers andrew
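For readers following along, the functions under discussion can be exercised directly; a minimal sketch, assuming the array_to_json/row_to_json signatures from Andrew's patch (as they later shipped in 9.2):

```sql
-- Assumes Andrew's patch is applied (the 9.2-era signatures).
SELECT array_to_json(ARRAY[1, 2, 3]);
-- [1,2,3]

SELECT row_to_json(ROW(1, 'foo'::text));
-- {"f1":1,"f2":"foo"}

-- A whole row keyed by its column names, which is the common use case:
SELECT row_to_json(t) FROM (SELECT 1 AS a, 'x'::text AS b) AS t;
-- {"a":1,"b":"x"}
```

The per-type names mirror the existing foo_to_xml() family mentioned above, which is the consistency argument for keeping them un-overloaded.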
Re: [HACKERS] JSON for PG 9.2
2012/1/23 Merlin Moncure mmonc...@gmail.com: On Sun, Jan 15, 2012 at 10:08 AM, Andrew Dunstan and...@dunslane.net wrote: Here's an update that adds row_to_json, plus a bit more cleanup. why not call all these functions 'to_json' and overload them? -1; the older proposal is more consistent with the xml functions. Pavel merlin
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Mon, Jan 23, 2012 at 4:12 PM, A.M. age...@themactionfaction.com wrote: On Jan 23, 2012, at 4:45 PM, Merlin Moncure wrote: Prefer the version. But why send this over and over with each bind? Wouldn't you negotiate that when connecting? Most likely, optionally, doing as much as you can from the server version? Personally I'm not really enthusiastic about a solution that adds an unavoidable penalty to all queries. In terms of backwards compatibility (to support the widest range of clients), wouldn't it make sense to freeze each format option? That way, an updated text version could also assume a new int16 format identifier. The client would simply pass its preferred format. This could also allow for multiple in-flight formats; for example, if a client anticipates a large in-bound bytea column, it could specify format X, which indicates the server should gzip the result before sending. That same format may not be preferable on a different request. hm. well, I'd say that you're much better off if you can hold to the principle that newer versions of the format are always better and should be used if both the application and the server agree. Using your example, since you can already do something like: select zlib_compress(byteacol) from foo; I'm not sure that you're getting anything with that user-facing complexity. The only realistic case I can see for explicit control of wire formats is to defend your application from format changes in the server when upgrading the server and/or libpq. This isn't a "let's get better compression" problem, this is an "I upgraded my database and my application broke" problem. Fixing this problem in non-documentation fashion is going to require a full protocol change, period. It's the only way we can safely get all the various players (libpq, jdbc, etc) on the same page without breaking/recompiling millions of lines of old code that is currently in production.
The new protocol should *require*, at minimum, the application, not libpq, to explicitly send the version of the database it was coded against. That's just not getting sent now, and without that information there's no realistic way to prevent application breakage -- depending on libpq versions is useless since it can be upgraded, and there's always jdbc to deal with. merlin
Re: [HACKERS] Measuring relation free space
On Mon, Jan 23, 2012 at 04:56:24PM -0300, Alvaro Herrera wrote: Excerpts from Noah Misch's message of vie ene 20 22:33:30 -0300 2012: pgstattuple() figures the free_percent by adding up all space available to hold tuples and dividing that by the simple size of the relation. Non-leaf pages and the meta page get identical treatment: both never hold tuples, so they do not contribute to the free space. Hm. Leaf pages hold as many tuples as non-leaf pages, no? I mean for each page element there's a value and a CTID. In non-leaf pages those CTIDs point to other index pages, one level down the tree; in leaf pages they point to the heap. That distinction seemed important when I sent my last message, but now I agree that it's largely irrelevant for free space purposes. If someone feels like doing it, +1 for making pgstattuple() count non-leaf free space. Thanks, nm
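For reference, the free_percent figure being discussed is what the pgstattuple contrib module reports today; a quick way to look at it (relation names here are illustrative):

```sql
-- pgstattuple ships in contrib; from 9.1 it installs as an extension.
CREATE EXTENSION pgstattuple;

-- Free space within a heap relation:
SELECT table_len, tuple_percent, free_space, free_percent
FROM pgstattuple('some_table');

-- The same call works on an index; per this thread, non-leaf index
-- pages currently contribute nothing to free_percent:
SELECT free_percent FROM pgstattuple('some_table_pkey');
```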
Re: [HACKERS] basic pgbench runs with various performance-related patches
On Mon, Jan 23, 2012 at 3:49 PM, Robert Haas robertmh...@gmail.com wrote: The other patches have clearer and specific roles without heuristics (mostly), so are at least viable for 9.2, though still requiring agreement. I think we must also drop removebufmgrfreelist-v1 from consideration, ... I think you misidentify the patch. Earlier you said that buffreelistlock-reduction-v1 crapped out, and I already said that the assumption in the code clearly doesn't hold, implying the patch was dropped. The removebufmgrfreelist patch and its alternate are still valid, with applicability to special cases. I've written another patch to assist with testing/assessment of the problems, attached. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index 3e62448..36b0160 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -17,6 +17,7 @@
 #include "storage/buf_internals.h"
 #include "storage/bufmgr.h"
+#include "utils/timestamp.h"
 
 /*
@@ -41,6 +42,21 @@ typedef struct
 	 */
 	uint32		completePasses; /* Complete cycles of the clock sweep */
 	uint32		numBufferAllocs;	/* Buffers allocated since last reset */
+
+	/*
+	 * Wait Statistics
+	 */
+	long		waitBufferAllocSecs;
+	int			waitBufferAllocUSecs;
+	int			waitBufferAlloc;
+
+	long		waitBufferFreeSecs;
+	int			waitBufferFreeUSecs;
+	int			waitBufferFree;
+
+	long		waitSyncStartSecs;
+	int			waitSyncStartUSecs;
+	int			waitSyncStart;
 } BufferStrategyControl;
 
 /* Pointers to shared state */
@@ -125,7 +141,29 @@ StrategyGetBuffer(BufferAccessStrategy strategy, bool *lock_held)
 	/* Nope, so lock the freelist */
 	*lock_held = true;
-	LWLockAcquire(BufFreelistLock, LW_EXCLUSIVE);
+	if (!LWLockConditionalAcquire(BufFreelistLock, LW_EXCLUSIVE))
+	{
+		TimestampTz	waitStart = GetCurrentTimestamp();
+		TimestampTz	waitEnd;
+		long		wait_secs;
+		int			wait_usecs;
+
+		LWLockAcquire(BufFreelistLock, LW_EXCLUSIVE);
+
+		waitEnd = GetCurrentTimestamp();
+
+		TimestampDifference(waitStart, waitEnd,
+							&wait_secs, &wait_usecs);
+
+		StrategyControl->waitBufferAllocSecs += wait_secs;
+		StrategyControl->waitBufferAllocUSecs += wait_usecs;
+		if (StrategyControl->waitBufferAllocUSecs >= 1000000)
+		{
+			StrategyControl->waitBufferAllocUSecs -= 1000000;
+			StrategyControl->waitBufferAllocSecs += 1;
+		}
+		StrategyControl->waitBufferAlloc++;
+	}
 
 	/*
 	 * We count buffer allocation requests so that the bgwriter can estimate
@@ -223,7 +261,29 @@ StrategyGetBuffer(BufferAccessStrategy strategy, bool *lock_held)
 void
 StrategyFreeBuffer(volatile BufferDesc *buf)
 {
-	LWLockAcquire(BufFreelistLock, LW_EXCLUSIVE);
+	if (!LWLockConditionalAcquire(BufFreelistLock, LW_EXCLUSIVE))
+	{
+		TimestampTz	waitStart = GetCurrentTimestamp();
+		TimestampTz	waitEnd;
+		long		wait_secs;
+		int			wait_usecs;
+
+		LWLockAcquire(BufFreelistLock, LW_EXCLUSIVE);
+
+		waitEnd = GetCurrentTimestamp();
+
+		TimestampDifference(waitStart, waitEnd,
+							&wait_secs, &wait_usecs);
+
+		StrategyControl->waitBufferFreeSecs += wait_secs;
+		StrategyControl->waitBufferFreeUSecs += wait_usecs;
+		if (StrategyControl->waitBufferFreeUSecs >= 1000000)
+		{
+			StrategyControl->waitBufferFreeUSecs -= 1000000;
+			StrategyControl->waitBufferFreeSecs += 1;
+		}
+		StrategyControl->waitBufferFree++;
+	}
 
 	/*
 	 * It is possible that we are told to put something in the freelist that
@@ -256,7 +316,30 @@ StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
 {
 	int			result;
 
-	LWLockAcquire(BufFreelistLock, LW_EXCLUSIVE);
+	if (!LWLockConditionalAcquire(BufFreelistLock, LW_EXCLUSIVE))
+	{
+		TimestampTz	waitStart = GetCurrentTimestamp();
+		TimestampTz	waitEnd;
+		long		wait_secs;
+		int			wait_usecs;
+
+		LWLockAcquire(BufFreelistLock, LW_EXCLUSIVE);
+
+		waitEnd = GetCurrentTimestamp();
+
+		TimestampDifference(waitStart, waitEnd,
+							&wait_secs, &wait_usecs);
+
+		StrategyControl->waitSyncStartSecs += wait_secs;
+		StrategyControl->waitSyncStartUSecs += wait_usecs;
+		if (StrategyControl->waitSyncStartUSecs >= 1000000)
+		{
+			StrategyControl->waitSyncStartUSecs -= 1000000;
+			StrategyControl->waitSyncStartSecs += 1;
+		}
+		StrategyControl->waitSyncStart++;
+	}
+
 	result = StrategyControl->nextVictimBuffer;
 	if (complete_passes)
 		*complete_passes = StrategyControl->completePasses;
@@ -265,7 +348,59 @@ StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
 		*num_buf_alloc = StrategyControl->numBufferAllocs;
 		StrategyControl->numBufferAllocs = 0;
 	}
+	else
+	{
+		long		waitBufferAllocSecs;
+		int			waitBufferAllocUSecs;
+		int			waitBufferAlloc;
+
+		long		waitBufferFreeSecs;
+		int			waitBufferFreeUSecs;
+		int			waitBufferFree;
+
+		long		waitSyncStartSecs;
+		int			waitSyncStartUSecs;
+		int			waitSyncStart;
+
+		waitBufferAllocSecs = StrategyControl->waitBufferAllocSecs;
+		waitBufferAllocUSecs =
Re: [HACKERS] Inline Extension
On Mon, Jan 23, 2012 at 8:17 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: Ok, but then, what about .so files? Wouldn't it make sense to be able to ship also the executable modules needed, and if not, why not? Now you can dump/restore any extension fully, and we can even ship any extension in the WAL stream (a new message is needed though). Things are still a bit ugly in the more complex cases: consider PostGIS's linkage against libproj and other libraries. In order to cover all cases, I feel that what I need is an optional hook (for the sake of argument, let's say it's another command-type hook, e.g. archive_command) to be executed when extension (un)installation occurs on a primary or is replayed on a standby, whereby I can acquire the necessary dependencies for an extension or signal some kind of error (as to exactly how that interfaces with the server is delicate, should one want to supply good error messages to the user). I think that hook could be useful for a number of reasons: * Extension distribution (as long as we're downloading dependent libraries, why not get the extension too?) * Extension distribution on standbys, too -- by much the same mechanism * Extension whitelisting for non-superusers (are you allowed/can you even have that extension?) And, more to the point, if one wants to make replication and extensions work nicely together, I don't really see an option outside such a hook other than insisting on a form of packaging whereby all dependencies are declared to Postgres and Postgres becomes the dependency management system for all dependent binary assets. That could lead to a more cohesive system, but is also a pretty hefty burden, both on this project and others. But getting back to in-line extensions: I think it makes sense to dump all extensions in their in-line representation even if in some situations carrying a copy of the extension in the backup is not strictly necessary.
The only bloat is including the literal source text of the extension in the dump. With an extension-installation hook, the literal version of the extension could be supplied but ignored if a side-channel mechanism for getting the extension makes sense. -- fdr
Re: [HACKERS] basic pgbench runs with various performance-related patches
On Mon, Jan 23, 2012 at 7:52 PM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, Jan 23, 2012 at 3:49 PM, Robert Haas robertmh...@gmail.com wrote: The other patches have clearer and specific roles without heuristics (mostly), so are at least viable for 9.2, though still requiring agreement. I think we must also drop removebufmgrfreelist-v1 from consideration, ... I think you misidentify the patch. Earlier you said that buffreelistlock-reduction-v1 crapped out, and I already said that the assumption in the code clearly doesn't hold, implying the patch was dropped. Argh. I am clearly having a senior moment here, a few years early. So is it correct to say that both of the patches associated with the message attached to the following CommitFest entry are now off the table for 9.2? https://commitfest.postgresql.org/action/patch_view?id=743 -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Patch: Allow SQL-language functions to reference parameters by parameter name
On 19/01/12 20:28, Hitoshi Harada wrote: (Now it occurred to me that forgetting the #include parse_func.h might hit this breakage..., so I'll fix it here and continue to test, but if you'll fix it yourself, let me know) I fixed it here and it now works with my environment. Well spotted; that's exactly what I'd done. :/ The regression tests pass, the feature seems working as aimed, but it seems to me that it needs more test cases and documentation. For the tests, I believe at least we need the ambiguous case given upthread, so that we can ensure to keep compatibility. For the document, it should describe the name resolution rule, as stated in the patch comment. Attached are a new pair of patches, fixing the missing include (and the other warning), plus addressing the above. I'm still not sure whether to just revise (almost) all the SQL function examples to use parameter names, and declare them the right choice; as it's currently written, named parameters still seem rather second-class. Aside from them, I wondered at first what if the function is schema-qualified. Say, CREATE FUNCTION s.f(a int) RETURNS int AS $$ SELECT b FROM t WHERE a = s.f.a $$ LANGUAGE sql; It actually errors out, since a function-name-qualified parameter only accepts the function name without the schema name, but it looked weird to me at first. No better idea from me at the moment, though. By my reading of (a draft of) the spec, Subclause 6.6, "identifier chain", Syntax Rules 8.b.i-iii, the current behaviour is correct. But I join you in wondering whether we should permit the function name to be schema-qualified anyway. Matthew -- matt...@trebex.net

diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
new file mode 100644
index 7064312..cc5b5ef
*** a/doc/src/sgml/xfunc.sgml
--- b/doc/src/sgml/xfunc.sgml
*** SELECT getname(new_emp());
*** 538,556 ****
  <programlisting>
  CREATE FUNCTION tf1 (acct_no integer, debit numeric) RETURNS numeric AS $$
      UPDATE bank
!         SET balance = balance - $2
!         WHERE accountno = $1
      RETURNING balance;
  $$ LANGUAGE SQL;
  </programlisting>
 
      Here the first parameter has been given the name <literal>acct_no</>,
      and the second parameter the name <literal>debit</>.
!     So far as the SQL function itself is concerned, these names are just
!     decoration; you must still refer to the parameters as <literal>$1</>,
!     <literal>$2</>, etc within the function body.  (Some procedural
!     languages let you use the parameter names instead.)  However,
!     attaching names to the parameters is useful for documentation purposes.
      When a function has many parameters, it is also useful to use the names
      while calling the function, as described in
      <xref linkend="sql-syntax-calling-funcs">.
--- 538,580 ----
  <programlisting>
  CREATE FUNCTION tf1 (acct_no integer, debit numeric) RETURNS numeric AS $$
      UPDATE bank
!         SET balance = balance - debit
!         WHERE accountno = acct_no
      RETURNING balance;
  $$ LANGUAGE SQL;
  </programlisting>
 
      Here the first parameter has been given the name <literal>acct_no</>,
      and the second parameter the name <literal>debit</>.
!     Named parameters can still be referenced as
!     <literal>$<replaceable>n</></>; in this example, the second
!     parameter can be referenced as <literal>$2</>, <literal>debit</>,
!     or <literal>tf1.debit</>.
!  </para>
!
!  <para>
!     If a parameter is given the same name as a column that is available
!     in the query, the name will refer to the column.  To explicitly
!     refer to the parameter, you can qualify its name with the name of
!     the containing function.  For example,
!
! <programlisting>
! CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
!     UPDATE bank
!         SET balance = balance - debit
!         WHERE accountno = tf1.accountno
!     RETURNING balance;
! $$ LANGUAGE SQL;
! </programlisting>
!
!     This time, the first parameter has been given the ambiguous name
!     <literal>accountno</>.
!     Notice that inside the function body, <literal>accountno</> still
!     refers to <literal>bank.accountno</>, so <literal>tf1.accountno</>
!     must be used to refer to the parameter.
!  </para>
!
!
   <para>
      When a function has many parameters, it is also useful to use the names
      while calling the function, as described in
      <xref linkend="sql-syntax-calling-funcs">.

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
new file mode 100644
index 5642687..fe87990
*** a/src/backend/executor/functions.c
--- b/src/backend/executor/functions.c
***************
*** 23,28 ****
--- 23,29 ----
  #include "nodes/makefuncs.h"
  #include "nodes/nodeFuncs.h"
  #include "parser/parse_coerce.h"
+ #include "parser/parse_func.h"
  #include "tcop/utility.h"
  #include "utils/builtins.h"
  #include "utils/datum.h"
*** typedef SQLFunctionCache *SQLFunctionCac
*** 115,121 ****
--- 116,124 ----
  */
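The name-resolution rule the patch documents can be tried end to end; a minimal sketch, assuming the patch is applied (table and function names follow the doc example):

```sql
-- Assumes the named-parameter patch is applied (9.2-era behaviour).
CREATE TABLE bank (accountno integer, balance numeric);

CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
       SET balance = balance - debit
     -- A bare "accountno" here would resolve to the column, so the
     -- parameter must be qualified with the function name:
     WHERE accountno = tf1.accountno
    RETURNING balance;
$$ LANGUAGE sql;

SELECT tf1(1, 100.00);
```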
Re: [HACKERS] Next steps on pg_stat_statements normalisation
On 22 January 2012 05:30, Peter Geoghegan pe...@2ndquadrant.com wrote: The syntax for constants is sufficiently simple that I think that a good set of regression tests should make this entirely practicable, covering all permutations of relevant factors affecting how the implementation should act, including for example standard_conforming_strings. There's no reason to think that the SQL syntax rules for constants should need to change very frequently, or even at all, so we should be fine with just knowing the starting position. It's quicker and easier to do it this way than to argue the case for my original implementation, so that's what I'll do. Whatever overhead this independent, pg_stat_statements-internal const parsing may impose, it will at least only be paid once per query, when we first require a canonicalised representation of the query for the pg_stat_statements view. I've decided that a better approach to this problem is to use the low-level scanner API (declared in scanner.h), which is currently used exclusively by plpgsql. This seems to work well, as I'm using the authoritative scanner to scan constants. Obviously this does not imply that everyone must pay any overhead, so this seems like the best of both worlds. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Publish checkpoint timing and sync files summary data to pg_stat_bgwriter
Robert Haas wrote: On Sat, Jan 21, 2012 at 6:32 PM, Jeff Janes jeff.ja...@gmail.com wrote: I'm finding the backend_writes column pretty unfortunate. The only use I know of for it is to determine if the bgwriter is lagging behind. Yet it doesn't serve even this purpose because it lumps together the backend writes due to lagging background writes, and the backend writes by design due to the use of a buffer access strategy during bulk inserts. +1 for separating those. I'm tied up with some on-site things until Friday; I can rev the patch with this in mind (and clear some of my overall review work pile) then. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Re: [HACKERS] Vacuum rate limit in KBps
Benedikt Grundmann wrote: What I think is missing is a clear way to know if you are vacuuming (and analyzing) enough, and how much you are paying for that. Any good way to measure whether you're vacuuming a particular table enough needs to note how much free space is in that table and its indexes over time. That's why one of the other building blocks in the submission queue for 9.2 is a function to make that easier to do. It's one of the important dependencies to get settled before we can move very far toward answering "am I vacuuming enough?". A second piece to that is recording a history of that information over time. You can't predict the future need for something without some record of its past to extrapolate from. That's probably a job better suited for an external tool. The way you'd want to audit it most easily is to graph it over time, which isn't the sort of thing PostgreSQL is likely to build in. Also, the proof of whether a suggested implementation for a vacuum meter was useful or not would be easiest to validate that way. No sense in doing the difficult work of building one until there's a working prototype, which is possible to do more quickly in languages other than C. A simple meter might not be possible to create even with some better building blocks to base it on. There is a lot of workload-dependent variation in this area, and there are many types of database workloads out there. The two hardest systems to tune vacuum for that I work on have settled on completely different approaches to the problem. The only thing I've found so far that is true about both of them is that they'd really appreciate easier controls on the maximum rate. At the moment we are basically changing the knobs blindly based on some back-of-the-envelope calculations and hearsay. Then sometimes, months later, we find out that oops, we haven't been analyzing enough, and that's why on that particular table the planner is now picking a bad plan.
Unlike VACUUM, ANALYZE is so cheap to run that it's possible to improve this situation more easily--just do it a lot more. Lowering autovacuum_analyze_scale_factor is the easiest way. By default that is 0.10, requiring approximately a 10% change in the table size before a new ANALYZE is done. I think the lowest production setting I have for that somewhere is 0.03 on a roughly terabyte-scale database. There, a 10% change in one of the larger tables is well past the point of impacting query plans badly. If your data changes its character quite frequently based on new information, I wouldn't be afraid in that case to drop as low as 0.01 here. That would give you ANALYZE that happened 10X as often as it does now. You'll waste a moderate amount of CPU and disk resources, but a tuning error that leans toward analyzing too frequently isn't that expensive. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
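The scale factor Greg describes can be lowered globally in postgresql.conf, or per table via storage parameters when only a few hot tables drift quickly; a sketch (the table name is illustrative):

```sql
-- Globally, in postgresql.conf:  autovacuum_analyze_scale_factor = 0.03
-- Per table, for one whose statistics go stale fast:
ALTER TABLE busy_table SET (autovacuum_analyze_scale_factor = 0.01);

-- Revert to the global setting later if the extra ANALYZE churn
-- turns out not to be worth it:
ALTER TABLE busy_table RESET (autovacuum_analyze_scale_factor);
```

The per-table form keeps the aggressive setting scoped, so the rest of the cluster isn't re-analyzed any more often than the default dictates.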
Re: [HACKERS] Page Checksums
On Sat, Jan 21, 2012 at 6:12 PM, Jim Nasby j...@nasby.net wrote: On Jan 10, 2012, at 3:07 AM, Simon Riggs wrote: I think we could add an option to check the checksum immediately after we pin a block for the first time but it would be very expensive and sounds like we're re-inventing hardware or OS features again. Work on 50% performance drain, as an estimate. That is a level of protection no other DBMS offers, so that is either an advantage or a warning. Jim, if you want this, please do the research and work out what the probability of losing shared buffer data in your ECC RAM really is so we are doing it for quantifiable reasons (via old Google memory academic paper) and to verify that the cost/benefit means you would actually use it if we built it. Research into requirements is at least as important and time consuming as research on possible designs. Maybe I'm just dense, but it wasn't clear to me how you could use the information in the google paper to extrapolate data corruption probability. I can say this: we have seen corruption from bad memory, and our Postgres buffer pool (8G) is FAR smaller than available memory on all of our servers (192G or 512G). So at least in our case, CRCs that protect the filesystem cache would protect the vast majority of our memory (96% or 98.5%). Would it be unfair to assert that people who want checksums but aren't willing to pay the cost of running a filesystem that provides checksums aren't going to be willing to make the cost/benefit trade off that will be asked for? Yes, it is unfair of course, but it's interesting how small the camp of those using checksummed filesystems is. Robert Treat conjecture: xzilla.net consulting: omniti.com
Re: [HACKERS] Vacuum rate limit in KBps
On Mon, Jan 23, 2012 at 3:21 AM, Benedikt Grundmann bgrundm...@janestreet.com wrote: On 19/01/12 17:39, Greg Smith wrote: On 1/19/12 1:10 PM, Robert Haas wrote: I have to say that I find that intensely counterintuitive. The current settings are not entirely easy to tune correctly, but at least they're easy to explain. If there's anyone out there who has run a larger PostgreSQL database and not at some point been extremely frustrated with how the current VACUUM settings are controlled, please speak up and say I'm wrong about this. I thought it was well understood the UI was near unusably bad, it just wasn't obvious what to do about it. We are frustrated, but mostly our frustration is not about the somewhat inscrutable knobs but the inscrutable meters, or lack thereof. I keep thinking Greg has mistaken happiness with the MB-based info in the vacuum patch as being happy with the output format, when really it is all about increased visibility. (For the record, we've backpatched that initial change to a large number of our customers, just cause we're a bit zealous about monitoring). Postgres (auto or manual for that matter) vacuuming and analyzing is essentially a performance tuning problem without a good way to measure the current performance; the fact that the knobs to turn are confusing as well is secondary. What I think is missing is a clear way to know if you are vacuuming (and analyzing) enough, and how much you are paying for that. At the moment we are basically changing the knobs blindly based on some back-of-the-envelope calculations and hearsay. Then sometimes, months later, we find out that oops, we haven't been analyzing enough, and that's why on that particular table the planner is now picking a bad plan. Hmm, I've always thought the answer here is just a systematic approach to operations. We monitor free space across the system (along with a bunch of other stuff) so that we know when we're not vacuuming / analyzing enough.
What I want is that page http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html to start with Here is how you know if you are vacuuming enough... In an ideal world one would like some meter in a statistics table or similar that returns a percentage: 100% means just enough, 50% means you have to double, 150% means 50% too much (e.g. wasted)... But I could do with a boolean as well. A complicated extension and the recommendation to install 3 different extensions would be better than what is there right now, but only very barely. Of course a meter wouldn't tell you that if traffic doubled you would still keep up, and for that you need a complicated calculation, or you just keep looking at the meter and adjust. But at the moment there is no such meter (at least I don't know of it) and that is the actual problem. These pieces are out there. I guess I'd say they are crude, but you can get a handle on it. Of course, if your problem is with analyze, that's cheap enough that you should probably just do it more. We're probably a lot more aggressive on our vacuum / analyze scale settings than some people (we cut the defaults in half as a matter of course), and I come from the don't limit stuff camp too, but by and large what we do works, even if it's more black magic than people would like. :-) Robert Treat conjecture: xzilla.net consulting: omniti.com
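The crude pieces alluded to here mostly live in the statistics collector already; a sketch of the kind of meter query such monitoring can start from (the 20% dead-tuple threshold is an arbitrary illustration, not a recommendation):

```sql
-- Tables whose dead-tuple fraction suggests vacuum may be falling
-- behind, with their last (auto)vacuum/analyze times for context:
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze
  FROM pg_stat_user_tables
 WHERE n_dead_tup > 0.2 * (n_live_tup + 1)
 ORDER BY n_dead_tup DESC;
```

Graphing these numbers over time, as suggested upthread, is what turns them from a snapshot into something resembling the requested meter.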
Re: [HACKERS] basic pgbench runs with various performance-related patches
** pgbench, permanent tables, scale factor 100, 300 s **
 1 group-commit-2012-01-21    614.425851  -10.4%
 8 group-commit-2012-01-21   4705.129896   +6.3%
16 group-commit-2012-01-21   7962.131701   +2.0%
24 group-commit-2012-01-21  13074.939290   -1.5%
32 group-commit-2012-01-21  12458.962510   +4.5%
80 group-commit-2012-01-21  12907.062908   +2.8%

Interesting. Comparing with this: http://archives.postgresql.org/pgsql-hackers/2012-01/msg00804.php you achieved only a very small enhancement. Can you think of any reason for the difference? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] basic pgbench runs with various performance-related patches
On 24 January 2012 06:26, Tatsuo Ishii is...@postgresql.org wrote:

** pgbench, permanent tables, scale factor 100, 300 s **
 1 group-commit-2012-01-21    614.425851  -10.4%
 8 group-commit-2012-01-21   4705.129896   +6.3%
16 group-commit-2012-01-21   7962.131701   +2.0%
24 group-commit-2012-01-21  13074.939290   -1.5%
32 group-commit-2012-01-21  12458.962510   +4.5%
80 group-commit-2012-01-21  12907.062908   +2.8%

Interesting. Comparing with this: http://archives.postgresql.org/pgsql-hackers/2012-01/msg00804.php you achieved only a very small enhancement. Can you think of any reason for the difference?

Presumably this system has a battery-backed cache, whereas my numbers were obtained on my laptop. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Page Checksums
* Robert Treat: Would it be unfair to assert that people who want checksums but aren't willing to pay the cost of running a filesystem that provides checksums aren't going to be willing to make the cost/benefit trade off that will be asked for? Yes, it is unfair of course, but it's interesting how small the camp of those using checksummed filesystems is. Don't checksumming file systems currently come bundled with other features you might not want (such as certain vendors)? -- Florian Weimer fwei...@bfk.de BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99