Re: [HACKERS] [PATCH] add --progress option to pgbench (submission 3)
Hi Febien, I send you my review result and refactoring patch. I think that your patch has good function and many people surely want to use! I hope that my review comment will be good for your patch. * 1. Complete words and variable in source code and sgml document. It is readable for user and developper that new patch completes words and variables in existing source code. For example, SECONDS is NUM etc. * 2. Output format in result for more readable. I think taht output format should be simple and intuitive. Your patch's output format is not easy to read very much. So I fix simple format and add Average latency. My proposed format is good for readable and programing processing. [mitsu-ko@localhost postgresql]$ bin/pgbench -T10 -P5 -c2 -j2 starting vacuum...end. 5.0 s[thread 1]: tps = 1015.576032, AverageLatency(ms) = 0.000984663 5.0 s[thread 0]: tps = 1032.580794, AverageLatency(ms) = 0.000968447 10.0 s [thread 0]: tps = 1129.591189, AverageLatency(ms) = 0.000885276 10.0 s [thread 1]: tps = 1126.267776, AverageLatency(ms) = 0.000887888 However, interesting of output format(design) is different depending on the person:-). If you like other format, fix it. * 3. Thread name in output format is not nesessary. I cannot understand that thread name is displayed in each progress. I think that it does not need. I hope that output result sould be more simple also in a lot of thread. My images is here, [mitsu-ko@localhost postgresql]$ bin/pgbench -T10 -P5 -c2 -j2 starting vacuum...end. 5.0 s: tps = 2030.576032, AverageLatency(ms) = 0.000984663 10.0 s : tps = 2250.591189, AverageLatency(ms) = 0.000885276 This output format is more simple and intuitive. If you need result in each threads, please tell us the reason. * 4. Slipping the progress time. Whan I executed this patch in long time, I found slipping the progress time. This problem image is here. [mitsu-ko@localhost postgresql]$ bin/pgbench -T10 -P5 -c2 starting vacuum...end. 5.1 s : tps = 2030.576032, AverageLatency(ms) = 0.000984663 10.2 s : tps = 2250.591189, AverageLatency(ms) = 0.000885276 It has problem in method of calculate progress time. It needs to fix collect, or displaying time format will be like '13:00:00'. If you select later format, it will fit in postgresql log and other contrib modules that are like pg_stat_statements. Best regards, -- Mitsumasa KONDO diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 1303217..32805ea 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -164,6 +164,7 @@ bool use_log; /* log transaction latencies to a file */ bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ +int progress = 0; /* thread progress report every this seconds */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ @@ -354,6 +355,8 @@ usage(void) protocol for submitting queries to server (default: simple)\n -n do not run VACUUM before tests\n -N do not update tables \pgbench_tellers\ and \pgbench_branches\\n + -P NUM, --progress NUM\n + show thread progress report about every NUM seconds\n -r report average latency per command\n -s NUM report this scale factor in output\n -S perform SELECT-only transactions\n @@ -2115,6 +2118,7 @@ main(int argc, char **argv) {unlogged-tables, no_argument, unlogged_tables, 1}, {sampling-rate, required_argument, NULL, 4}, {aggregate-interval, required_argument, NULL, 5}, + {progress, required_argument, NULL, 'P'}, {NULL, 0, NULL, 0} }; @@ -2181,7 +2185,7 @@ main(int argc, char **argv) state = (CState *) pg_malloc(sizeof(CState)); memset(state, 0, sizeof(CState)); - while ((c = getopt_long(argc, argv, ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:, long_options, optindex)) != -1) + while ((c = getopt_long(argc, argv, ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:P:, long_options, optindex)) != -1) { switch (c) { @@ -2336,6 +2340,16 @@ main(int argc, char **argv) exit(1); } break; + case 'P': +progress = atoi(optarg); +if (progress = 0) +{ + fprintf(stderr, + thread progress delay (-P) must not be negative (%s)\n, + optarg); + exit(1); +} +break; case 0: /* This covers long options which take no argument. */ break; @@ -2712,6 +2726,10 @@ threadRun(void *arg) int nstate = thread-nstate; int remains = nstate; /* number of remaining clients */ int i; + /* for reporting progress in benchmark result */ + int64 start_report = INSTR_TIME_GET_MICROSEC(thread-start_time); + int64 last_report = start_report; + int64 last_count = 0; AggVals aggs; @@ -2875,6
Re: [HACKERS] Review: UNNEST (and other functions) WITH ORDINALITY
On 21 June 2013 06:54, David Fetter da...@fetter.org wrote: For example SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS file The spec is pretty specific about the all or none nature of naming in the AS clause...unless we can figure out a way of getting around it somehow. We already support and document the ability to provide a subset of the columns in the alias. I didn't realise that was beyond the spec, but since we already have it... I'm weighing in on the side of a name that's like ?columnN? elsewhere in the code, i.e. one that couldn't sanely be an actual column name, whether in table, view, or SRF. I don't think we need to be overly concerned with coming up with a unique name for the column. Duplicate column names are fine, except if the user wants to refer to them elsewhere in the query, in which case they need to provide aliases to distinguish them, otherwise the query won't be accepted. I'd be happy if you just replaced ?column? with ordinality in a couple of places in your original patch. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
On Thu, Jun 20, 2013 at 3:40 PM, MauMau maumau...@gmail.com wrote: Here, reliable means that the database server is certainly shut down when pg_ctl returns, not telling a lie that I shut down the server processes for you, so you do not have to be worried that some postgres process might still remain and write to disk. I suppose reliable shutdown is crucial especially in HA cluster. If pg_ctl stop -mi gets stuck forever when there is an unkillable process (in what situations does this happen? OS bug, or NFS hard mount?), I think the DBA has to notice this situation from the unfinished pg_ctl, investigate the cause, and take corrective action. So you're suggesting that keeping postmaster up is a useful sign that the shutdown is not going well? I'm not really sure about this. What do others think? I think you are right, and there is no harm in leaving postgres processes in unkillable state. I'd like to leave the decision to you and/or others. +1 for leaving processes, not waiting for long (or possibly forever, remember not all people are running postgres on such cluster ware). I'm sure some users rely on the current behavior of immediate shutdown. If someone wants to ensure processes are finished when pg_ctl returns, is it fast shutdown, not immediate shutdown? To me the current immediate shutdown is reliable, in that it without doubt returns control back to terminal, after killing postmaster at least. Thanks, -- Hitoshi Harada
Re: [HACKERS] Review: UNNEST (and other functions) WITH ORDINALITY
On 21 June 2013 08:02, Dean Rasheed dean.a.rash...@gmail.com wrote: On 21 June 2013 06:54, David Fetter da...@fetter.org wrote: For example SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS file The spec is pretty specific about the all or none nature of naming in the AS clause...unless we can figure out a way of getting around it somehow. We already support and document the ability to provide a subset of the columns in the alias. I didn't realise that was beyond the spec, but since we already have it... I'm weighing in on the side of a name that's like ?columnN? elsewhere in the code, i.e. one that couldn't sanely be an actual column name, whether in table, view, or SRF. I don't think we need to be overly concerned with coming up with a unique name for the column. Duplicate column names are fine, except if the user wants to refer to them elsewhere in the query, in which case they need to provide aliases to distinguish them, otherwise the query won't be accepted. I'd be happy if you just replaced ?column? with ordinality in a couple of places in your original patch. Expanding on that, I think it's perfectly acceptable to allow potentially duplicate column names in this context. For the majority of simple queries the user wouldn't have to (and wouldn't feel compelled to) supply aliases. Where there was ambiguity they would be forced to do so, but people are already used to that. If I write a simple query that selects from a single unnest() with or without ordinality, I probably won't want to supply aliases. If I select from 2 unnest()'s *without* ordinality, I already have to provide aliases. If I select from 2 SRF's functions with ordinality, I won't be too surprised if I am also forced to provide aliases. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).
Dmitriy Igrishin wrote: Sent: Thursday, June 20, 2013 5:09 PM To: PostgreSQL Hackers Subject: [HACKERS] Frontend/backend protocol improvements proposal (request). Hackers, While developing a C++ client library for Postgres I felt lack of extra information in command tags in the CommandComplete (B) message for the following commands: PREPARE; DEALLOCATE; DECLARE; CLOSE; LISTEN; UNLISTEN; SET; RESET. Namely, for example, users of my library can prepare statements by using protocol directly or via PREPARE command. Since the protocol does not supports prepared statement deallocation, I wrote a wrapper over DEALLOCATE command. The library knows about all prepared statements and invalidates them automatically when user performs deallocate() wrapper. But users can go with DEALLOCATE command directly and in these cases I need to query the database to get the list of currently prepared statements whenever CommandComplete message with DEALLOCATE command tag is consumed. Moreover, I need to do it *synchronously* and this breaks asynchronous API. I propose to include name of the object in the CommandComplete (B) message for the above commands. That would be a change in the protocol, so it's not likely to happen soon. There is a page where proposed changes to the wire protocol are collected: http://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes It seems like bad design to me to keep a list of prepared statements on the client side when it is already kept on the server side (accessible with the pg_prepared_statements view). What's wrong with the following: If the user wants to deallocate an individual prepared statement, just send DEALLOCATE statement name to the server. If the statement does not exist, the server will return an error. If the user wants to deallocate all statements, just send DEALLOCATE ALL. Why do you need to track prepared statements on the client side? Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Possible bug in CASE evaluation
I want to draw attention to this thread on -general: camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com Would you concur that this is a bug? The fine manual says about CASE: If the condition's result is true, the value of the CASE expression is the result that follows the condition, and the remainder of the CASE expression is not processed. But: test= CREATE FUNCTION zero() RETURNS integer IMMUTABLE LANGUAGE SQL AS 'SELECT 0'; CREATE FUNCTION test= SELECT CASE WHEN (SELECT zero()) = 0 THEN -1 ELSE 1/zero() END; ERROR: division by zero Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses
Hi, I took results of my separate patches and original PG. * Result of DBT-2 | TPS 90%tileAverage Maximum -- original_0.7 | 3474.62 18.348328 5.73936.977713 original_1.0 | 3469.03 18.637865 5.84241.754421 fsync | 3525.03 13.872711 5.38228.062947 write | 3465.96 19.653667 5.80440.664066 fsync + write | 3564.94 16.31922 5.1 34.530766 - 'original_*' indicates checkpoint_completion_target in PG 9.2.4. - In other patched postgres, checkpoint_completion_target sets 0.7. - 'write' is applied write patch, and 'fsync' is applied fsync patch. - 'fsync + write' is applied both patches. * Investigation of result - Large value of checkpoint_completion_target in original and the patch in write become slow latency in benchmark transactions. Because slow write pages are caused long time fsync IO in final checkpoint. - The patch in fsync has an effect latency in each file fsync. Continued fsyncsin each files are caused slow latency. Therefore, it is good for latency that fsync stage in checkpoint has sleeping time after slow fsync IO. - The patches of fsync + write were seemed to improve TPS. I think that write patch does not disturb transactions which are in full-page-write WAL write than original(plain) PG. I will send you more detail investigation and result next week. And I will also take result in pgbench. If you mind other part of benchmark result or parameter of postgres, please tell me. Best Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).
2013/6/21 Albe Laurenz laurenz.a...@wien.gv.at Dmitriy Igrishin wrote: Sent: Thursday, June 20, 2013 5:09 PM To: PostgreSQL Hackers Subject: [HACKERS] Frontend/backend protocol improvements proposal (request). Hackers, While developing a C++ client library for Postgres I felt lack of extra information in command tags in the CommandComplete (B) message for the following commands: PREPARE; DEALLOCATE; DECLARE; CLOSE; LISTEN; UNLISTEN; SET; RESET. Namely, for example, users of my library can prepare statements by using protocol directly or via PREPARE command. Since the protocol does not supports prepared statement deallocation, I wrote a wrapper over DEALLOCATE command. The library knows about all prepared statements and invalidates them automatically when user performs deallocate() wrapper. But users can go with DEALLOCATE command directly and in these cases I need to query the database to get the list of currently prepared statements whenever CommandComplete message with DEALLOCATE command tag is consumed. Moreover, I need to do it *synchronously* and this breaks asynchronous API. I propose to include name of the object in the CommandComplete (B) message for the above commands. That would be a change in the protocol, so it's not likely to happen soon. There is a page where proposed changes to the wire protocol are collected: http://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes Well, even if this proposal moves to the TODO, it would be nice. It seems like bad design to me to keep a list of prepared statements on the client side when it is already kept on the server side (accessible with the pg_prepared_statements view). What's wrong with the following: If the user wants to deallocate an individual prepared statement, just send DEALLOCATE statement name to the server. If the statement does not exist, the server will return an error. If the user wants to deallocate all statements, just send DEALLOCATE ALL. Why do you need to track prepared statements on the client side? Nothing wrong if the user wants to deal with scary and cumbersome code. As library author, I want to help people make things simpler. To understand me, please look at the pseudo C++ code below. // A class designed to work with prepared statements class Prepared_statement { public: // Methods to generate a Bind message, like Prepared_statement* bind(Position, Value); // ... and more // Methods to send Execute message, like void execute(); void execute_async(); }; class Connection { public: // many stuff ... void close(); Prepared_statement* prepare(Name, Query); void prepare_async(Statement); // Make yet another instance of prepared statement. Prepared_statement* prepared_statement(Name); // etc. }; The Connection class is a factory for Prepared_statement instances. As you can see, the Connection::prepare() returns new instance of *synchronously* prepared statement. Next, the user can bind values and execute the statement, like this: void f(Connection* cn) { // Prepare unnamed statement and execute it. cn-prepare(SELECT $1::text)-bind(0, Albe)-execute(); // Ps: don't worry about absence of delete; We are using smart pointers :-) } But there is a another possible case: void f(Connection* cn) { Prepared_statement* ps = cn-prepare(SELECT $1::text); cn-close(); // THIS SHOULD invalidate all Prepared_statement instances ... ps-bind(0, Albe); // ... to throw the exception here } Moreover, consider: void f(Connection* cn) { Prepared_statement* ps1 = cn-prepare(ps1, SELECT $1::text); cn-deallocate(ps1); // THIS invalidates ps1 object... ps1-bind(0, Albe); // ... to throw the exception here Prepared_statement* ps2 = cn-prepare(ps2, SELECT $1::text); cn-perform(DEALLOCATE ps2); // THIS SHOULD ALSO invalidate ps2 object... ps2-bind(0, Albe); // ... to throw the exception here } In the latter case when the user deallocates named prepared statement directly, the implementation of Connection can invalidates the prepared statement (ps2) by analyzing and parsing CommandComplete command tag to get it's name. And please note, that the user can send DEALLOCATE asynchronously. And there is only two ways to get the prepared statement (or another session object's) name: 1) Parse the SQL command which the user is attempts to send; 2) Just get it from CommandComplete command tag. I beleive that the 1) is a 100% bad idea. PS: this C++11 library is not publicaly available yet, but I hope it will this year. -- // Dmitriy.
Re: [HACKERS] Possible bug in CASE evaluation
Hi, On 2013-06-21 08:16:22 +, Albe Laurenz wrote: I want to draw attention to this thread on -general: camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com There's also a bug reported for it: #8237: e1uovmc-0007ft...@wrigleys.postgresql.org Would you concur that this is a bug? Yes, I think it's pretty clearly a bug - Tom doesn't seem think so though. If we can agree it is, the fix outlined over on -bugs seems to be easily enough implemented... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]
On 21 June 2013 05:01, David Fetter da...@fetter.org wrote: What tests do you think should be there that aren't? I think I expected to see more tests related to some of the specific code changes, such as CREATE TABLE t AS SELECT * FROM generate_series(1,10) t(x); -- Should fail (filter can't be used for non-aggregates) SELECT abs(x) FILTER (WHERE x 5) FROM t; -- Should fail (filter clause can't contain aggregates) SELECT array_agg(x) FILTER (WHERE x AVG(x)) t; -- OK (aggregate in filter sub-query) SELECT array_agg(x) FILTER (WHERE x (SELECT AVG(x) FROM t)) FROM t; -- OK (trivial sub-query) SELECT array_agg(x) FILTER (WHERE (SELECT x 5)) FROM t; -- OK SELECT array_agg(x) FILTER (WHERE (SELECT x (SELECT AVG(x) FROM t))) FROM t; -- OK SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT t.x AVG(t2.x) FROM t as t2))) FROM t; -- Should fail SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT 5 AVG(t.x) FROM t as t2))) FROM t; -- OK (filtered aggregate in window context) SELECT x, AVG(x) OVER(ORDER BY x), AVG(x) FILTER (WHERE x 5) OVER(ORDER BY x) FROM t; -- Should fail (non-aggregate window function with filter) SELECT x, rank() OVER(ORDER BY x), rank() FILTER (WHERE x 5) OVER(ORDER BY x) FROM t; -- View definition test CREATE VIEW v AS SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT t.x AVG(t2.x) FROM t as t2))) FROM t; \d+ v etc... You could probably dream up better examples --- I just felt that the current tests don't give much coverage of the code changes. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]
On 21 June 2013 06:16, David Fetter da...@fetter.org wrote: On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote: David Fetter escribió: On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: In my testing of sub-queries in the FILTER clause (an extension to the spec), I was able to produce the following error: Per the spec, B) A filter clause shall not contain a query expression, a window function, or an outer reference. If this is not allowed, I think there should be a clearer error message. What Dean showed is an internal (not user-facing) error message. Please find attached a patch which allows subqueries in the FILTER clause and adds regression testing for same. Nice! I should have pointed out that it was already working for some sub-queries, just not all. It's good to see that it was basically just a one-line fix, because I think it would have been a shame to not support sub-queries. I hope to take another look at it over the weekend. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On 2013-06-19 09:55:24 +0900, Michael Paquier wrote: Please find an updated patch. The regression test rules has been updated, and all the comments are addressed. On Tue, Jun 18, 2013 at 6:35 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2013-06-18 10:53:25 +0900, Michael Paquier wrote: diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c index c381f11..3a6342c 100644 --- a/contrib/pg_upgrade/info.c +++ b/contrib/pg_upgrade/info.c @@ -321,12 +321,17 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo) INSERT INTO info_rels SELECT reltoastrelid FROM info_rels i JOIN pg_catalog.pg_class c -ON i.reloid = c.oid)); +ON i.reloid = c.oid +AND c.reltoastrelid != %u, InvalidOid)); PQclear(executeQueryOrDie(conn, INSERT INTO info_rels - SELECT reltoastidxid - FROM info_rels i JOIN pg_catalog.pg_class c -ON i.reloid = c.oid)); + SELECT indexrelid + FROM pg_index + WHERE indrelid IN (SELECT reltoastrelid + FROM pg_class + WHERE oid = %u +AND reltoastrelid != %u), + FirstNormalObjectId, InvalidOid)); What's the idea behind the = here? It is here to avoid fetching the toast relations of system tables. But I see your point, the inner query fetching the toast OIDs should do a join on the exising info_rels and not try to do a join on a plain pg_index, so changed this way. I'd also rather not introduce knowledge about FirstNormalObjectId into client applications... But you fixed it already. /* Clean up. */ heap_freetuple(reltup1); @@ -1529,12 +1570,13 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap, if (OidIsValid(newrel-rd_rel-reltoastrelid)) { Relationtoastrel; - Oid toastidx; charNewToastName[NAMEDATALEN]; + ListCell *lc; + int count = 0; toastrel = relation_open(newrel-rd_rel-reltoastrelid, AccessShareLock); - toastidx = toastrel-rd_rel-reltoastidxid; + RelationGetIndexList(toastrel); relation_close(toastrel, AccessShareLock); /* rename the toast table ... */ @@ -1543,11 +1585,23 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap, RenameRelationInternal(newrel-rd_rel-reltoastrelid, NewToastName, true); - /* ... and its index too */ - snprintf(NewToastName, NAMEDATALEN, pg_toast_%u_index, - OIDOldHeap); - RenameRelationInternal(toastidx, - NewToastName, true); + /* ... and its indexes too */ + foreach(lc, toastrel-rd_indexlist) + { + /* + * The first index keeps the former toast name and the + * following entries have a suffix appended. + */ + if (count == 0) + snprintf(NewToastName, NAMEDATALEN, pg_toast_%u_index, + OIDOldHeap); + else + snprintf(NewToastName, NAMEDATALEN, pg_toast_%u_index_%d, + OIDOldHeap, count); + RenameRelationInternal(lfirst_oid(lc), +
Re: [HACKERS] Possible bug in CASE evaluation
Andres Freund wrote: On 2013-06-21 08:16:22 +, Albe Laurenz wrote: I want to draw attention to this thread on -general: camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com There's also a bug reported for it: #8237: e1uovmc-0007ft...@wrigleys.postgresql.org Would you concur that this is a bug? Yes, I think it's pretty clearly a bug - Tom doesn't seem think so though. If we can agree it is, the fix outlined over on -bugs seems to be easily enough implemented... I think that it is surprising behaviour. If fixing the behaviour is undesirable, at least the documentation should be fixed. Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] updated emacs configuration
Hi, Peter Eisentraut pete...@gmx.net writes: I think the suggested emacs configuration snippets in src/tools/editors/emacs.samples no longer represent current best practices. I have come up with some newer things that I'd like to propose for review. Thanks for doing that! First, I propose adding a .dir-locals.el file to the top-level directory with basic emacs settings. These get applied automatically. This especially covers the particular tab and indentation settings that PostgreSQL uses. With this, casual developers will not need to modify any of their emacs settings. I've tested that on a new git clone and with the `emacs -q` command so as not to load any of my local setup. While the indentation seemed ok, the placement of the comments seems way off: Compare what you see using those commands: emacs -q src/backend/commands/extension.c emacs -q -l ../emacs.samples src/backend/commands/extension.c (When using macosx, you might have to replace the 'emacs' binary location with /Applications/Emacs.app/Contents/MacOS/Emacs). I did also test on doc/src/sgml/extend.sgml and some Makefile, only with using the emacs.samples file content though. With that, emacs.samples can be shrunk significantly. The only real reason to keep is that that c-offsets-alist and (more dubiously) sgml-basic-offset cannot be set from .dir-locals.el because they are not safe. I have also removed many of the redundant examples and settled on a hook-based solution. A couple of notes about your emacs.sample file: - Name the lambda used in the hook for easier removing / reference - A fresh git clone will create a directory named postgres, so I did change your /postgresql/ regex to /postgres/ in my attached version I think together this setup would be significantly simpler and more practical. Agreed. -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support ;; -*- mode: emacs-lisp -*- ;; This file contains code to set up Emacs to edit PostgreSQL source ;; code. Copy these snippets into your .emacs file or equivalent, or ;; use load-file to load this file directly. ;; ;; Note also that there is a .dir-locals.el file at the top of the ;; PostgreSQL source tree, which contains many of the settings shown ;; here. So for light editing, you might not need any additional ;; Emacs configuration. ;;; C files ;; Style that matches the formatting used by ;; src/tools/pgindent/pgindent. Many extension projects also use this ;; style. (c-add-style postgresql '(bsd (c-basic-offset . 4) (c-offsets-alist . ((case-label . +))) (fill-column . 79) (indent-tabs-mode . t) (tab-width . 4))) (add-hook 'c-mode-hook (defun postgresql-c-mode-hook () (when (string-match /postgres/ buffer-file-name) (c-set-style postgresql ;;; documentation files (add-hook 'sgml-mode-hook (defun postgresql-sgml-mode-hook () (when (string-match /postgres/ buffer-file-name) (setq fill-column 79) (setq indent-tabs-mode nil) (setq sgml-basic-offset 1 ;;; Makefiles ;; use GNU make mode instead of plain make mode (add-to-list 'auto-mode-alist '(/postgres/.*Makefile.* . makefile-gmake-mode)) (add-to-list 'auto-mode-alist '(/postgres/.*\\.mk\\' . makefile-gmake-mode)) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] refresh materialized view concurrently
On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner kgri...@ymail.com wrote: Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for 9.4 CF1. The goal of this patch is to allow a refresh without interfering with concurrent reads, using transactional semantics. I spent a few hours to review the patch. As far as I can tell, the overall approach is as follows. - create a new temp heap as non-concurrent does, but with ExclusiveLock on the matview, so that reader wouldn't be blocked - with this temp table and the matview, query FULL JOIN and extract difference between original matview and temp heap (via SPI) - this operation requires unique index for performance reason (or correctness reason too) - run ANALYZE on this diff table - run UPDATE, INSERT and DELETE via SPI, to do the merge - close these temp heap If I don't miss something, the requirement for the CONCURRENTLY option is to allow simple SELECT reader to read the matview concurrently while the view is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR UPDATE/SHARE are still blocked. So, I wonder why it is not possible just to acquire ExclusiveLock on the matview while populating the data and swap the relfile by taking small AccessExclusiveLock. This lock escalation is no dead lock hazard, I suppose, because concurrent operation would block the other at the point ExclusiveLock is acquired, and ExclusiveLock conflicts AccessExclusiveLock. Then you don't need the complicated SPI logic or unique key index dependency. Assuming I'm asking something wrong and going for the current approach, here's what I've found in the patch. - I'm not sure if unique key index requirement is reasonable or not, because users only add CONCURRENTLY option and not merging or incremental update. - This could be an overflow in diffname buffer. + quoteRelationName(tempname, tempRel); + strcpy(diffname, tempname); + strcpy(diffname + strlen(diffname) - 1, _2\); - As others pointed out, quoteOneName can be replaced with quote_identifier - This looks harmless, but I wonder if you need to change relkind. *** 665,672 make_new_heap(Oid OIDOldHeap, Oid NewTableSpace) OldHeap-rd_rel-relowner, OldHeapDesc, NIL, ! OldHeap-rd_rel-relkind, ! OldHeap-rd_rel-relpersistence, false, RelationIsMapped(OldHeap), true, --- 680,687 OldHeap-rd_rel-relowner, OldHeapDesc, NIL, ! RELKIND_RELATION, ! relpersistence, false, RelationIsMapped(OldHeap), true, Since OldHeap-rd_rel-relkind has been working with 'm', too, not only 'r'? - I found two additional parameters on make_new_heap ugly, but couldn't come up with better solution. Maybe we can pass Relation of old heap to the function instead of Oid.. That's about it from me. Thanks, -- Hitoshi Harada
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
On Thu, Jun 20, 2013 at 7:24 PM, Craig Ringer cr...@2ndquadrant.comwrote: I've missed this feature more than once, and am curious about whether any more recent changes may have made it cleaner to tackle this, or whether consensus can be formed on adding the new entries to btree's opclass to avoid the undesirable explicit lookups of the '+' and '-' oprators. As far as I know the later development didn't add anything to help this conversation. I initially thought range type or knn gist would add something, but they were something else far from this. On the other hand, if this makes it, it'll also open doors to range PARTITION BY for CREATE TABLE command, so the impact will be bigger than you may think. I also later found that we are missing not only notion of '+' or '-', but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN needs to detect ERROR if the offset value is negative, but it is not always easy if you think about interval, numeric types as opposed to int64 used in ROWS BETWEEN. Thanks, -- Hitoshi Harada
Re: [HACKERS] Config reload/restart preview
On 21 June 2013 05:47, Gurjeet Singh gurj...@singh.im wrote: On Thu, Jun 20, 2013 at 9:33 AM, Magnus Hagander mag...@hagander.netwrote: On Thu, Jun 20, 2013 at 2:54 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: Should we have a way of previewing changes that would be applied if we reloaded/restarted the server? Yes, we should. +1 This would go well with something I started working on some time ago (but haven't actually gotten far on at all), which is the ability for pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload should also be able to tell you which parameters were changed, without having to go to the log. Obviously that's almost exactly the same feature. It could probably connect to the server and issue the SQL command to reload, and that one could probably get enhanced to return modified variable as NOTICE, or be run with the right client_min_message: SELECT pg_reload_conf(); The pg_ctl client would then have to know to display the messages sent back by the server. The problem with that is that now you must *always* have your system set up to allow the postgres user to log in in pg_hba.conf or it fails. But yes, one option would be to use SQL instead of opening a socket. Maybe that's a better idea - have pg_ctl try to use that if available, and if not send a regular signal and not try to collect the output. I started working on it yesterday after Thom proposed this idea internally at EDB. The discussion until now doesn't seem to be hostile to a SQL interface, so attached is a hack attempt, which hopefully will serve at least as a POC. A sample session is shown below, while changing a few values in postgresql.conf files. The central idea is to use the SIGHUP processing function to do the work for us and report potential changes via DEBUG2, instead of having to write a new parsing engine. The (GUC-nesting + PGC_S_TEST) is nice to have since it avoids the current session from adopting the values that are different in conf file. This approach is susceptible to the fact that the connected superuser may have its GUC values picked up from user/database/session level settings (ALTER USER/DATABASE .. SET ; or SET param TO val;). $ pgsql Expanded display is used automatically. psql (9.4devel) Type help for help. postgres=# show work_mem; work_mem -- 1MB (1 row) postgres=# set client_min_messages = debug2; SET postgres=# select pg_test_reload_conf(); DEBUG: parameter work_mem changed to 70MB pg_test_reload_conf - t (1 row) postgres=# show work_mem; work_mem -- 1MB (1 row) postgres=# select pg_test_reload_conf(); DEBUG: parameter shared_buffers cannot be changed without restarting the server DEBUG: configuration file /home/gurjeet/dev/pgdbuilds/report_guc_chanege_pre_reload/db/data/postgresql.conf contains errors; unaffected changes were applied pg_test_reload_conf - t (1 row) postgres=# select pg_test_reload_conf(); DEBUG: parameter log_min_messages removed from configuration file, reset to default pg_test_reload_conf - t (1 row) Thanks for taking a look at this Gurjeet. This seems to be a step towards it, but there are a few issues: 1) As you say, if the superuser has role-level settings, it will provide incorrect feedback. 2) Settings that require a restart don't tell you what there are currently set to and what they would be set to. I'm not sure the currently-logged text provided by a SIGHUP is necessarily appropriate for such a feature. 3) I'd expect that a DBA that goes editing postgresql.conf wouldn't then go logging in to the database to run this command, if they even knew it existed. Whereas they would typically be familiar with /etc/init.d/postgresql action or pg_ctl action. Now if the SQL function was a medium to achieve this, that would be fine, but I'm not clear on what would be required technically to achieve the kind of intuitive admin-friendly interface. Or maybe there's a better method of checking the config that I haven't considered. -- Thom
[HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
On 2013-06-20 22:36:45 -0400, Alvaro Herrera wrote: Noah Misch escribió: On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote: MauMau escribi?: Here, reliable means that the database server is certainly shut down when pg_ctl returns, not telling a lie that I shut down the server processes for you, so you do not have to be worried that some postgres process might still remain and write to disk. I suppose reliable shutdown is crucial especially in HA cluster. If pg_ctl stop -mi gets stuck forever when there is an unkillable process (in what situations does this happen? OS bug, or NFS hard mount?), I think the DBA has to notice this situation from the unfinished pg_ctl, investigate the cause, and take corrective action. So you're suggesting that keeping postmaster up is a useful sign that the shutdown is not going well? I'm not really sure about this. What do others think? It would be valuable for pg_ctl -w -m immediate stop to have the property that an subsequent start attempt will not fail due to the presence of some backend still attached to shared memory. (Maybe that's true anyway or can be achieved a better way; I have not investigated.) Well, the only case where a process that's been SIGKILLed does not go away, as far as I know, is when it is in some uninterruptible sleep due to in-kernel operations that get stuck. Personally I have never seen this happen in any other case than some network filesystem getting disconnected, or a disk that doesn't respond. And whenever the filesystem starts to respond again, the process gets out of its sleep only to die due to the signal. Those are the situation in which it takes a really long time, yes. But there can be timing issues involved. Consider a backend that's currently stuck in a write() because it hit the dirtying limit. Say you have a postgres cluster that's currently slowing down to a crawl because it's overloaded and hitting the dirty limit. Somebody very well might just want to restart it with -m immediate. In that case a delay of a second or two till enough dirty memory has been written that write() can continue is enough for the postmaster to start up again and try to attach to shared memory where it will find the shared memory to be still in use. I don't really see any argument for *not* waiting. Sure it might take a bit longer till it's shut down, but if it didn't wait that will cause problems down the road. If we leave postmaster running after SIGKILLing its children, the only thing we can do is have it continue to SIGKILL processes continuously every few seconds until they die (or just sit around doing nothing until they all die). I don't think this will have a different effect than postmaster going away trusting the first SIGKILL to do its job eventually. I think it should just wait till all its child processes are dead. No need to repeat sending the signals - as you say, that won't help. What we could do to improve the robustness a bit - at least on linux - is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be killed if the postmaster goes away... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] refresh materialized view concurrently
On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada umi.tan...@gmail.comwrote: On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner kgri...@ymail.com wrote: Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for 9.4 CF1. The goal of this patch is to allow a refresh without interfering with concurrent reads, using transactional semantics. I spent a few hours to review the patch. Oh, BTW, though it is not part of this patch, but I came across this. ! /* ! * We're not using materialized views in the system catalogs. ! */ Assert(!IsSystemRelation(matviewRel)); Of course we don't have builtin matview on system catalog, but it is possible to create such one by allow_system_table_mods=true, so Assert doesn't look like good to me. Thanks, -- Hitoshi Harada
Re: [HACKERS] refresh materialized view concurrently
On 2013-06-21 02:43:23 -0700, Hitoshi Harada wrote: On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada umi.tan...@gmail.comwrote: On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner kgri...@ymail.com wrote: Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for 9.4 CF1. The goal of this patch is to allow a refresh without interfering with concurrent reads, using transactional semantics. I spent a few hours to review the patch. Oh, BTW, though it is not part of this patch, but I came across this. ! /* ! * We're not using materialized views in the system catalogs. ! */ Assert(!IsSystemRelation(matviewRel)); Of course we don't have builtin matview on system catalog, but it is possible to create such one by allow_system_table_mods=true, so Assert doesn't look like good to me. Anybody using allow_system_table_mods gets to keep the pieces. There are so many ways to break just about everything things using it that I don't think worrying much makes sense. If you want you could replace that by an elog(), but... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add session_preload_libraries configuration parameter
Hi, Peter Eisentraut pete...@gmx.net writes: This is like shared_preload_libraries except that it takes effect at backend start and can be changed without a full postmaster restart. It is like local_preload_libraries except that it is still only settable by a superuser. This can be a better way to load modules such as auto_explain. I had a pretty hard time to get my head around that new one, and I'm not sure I totally did. The important things to me are: - No need to manually copy the lib into the plugin directory, - ALTER ROLE support. So basically it's a very good solution for auto_explain and any other module you want to load eagerly but not for everyone and when not using shared memory. I have a feeling that something simpler could be made, but I will have to continue thinking about it. I found it strange that those two paras read differently for saying the same thing? +preloaded at connection start. This parameter cannot be changed after +the start of a particular session. If a specified library is not +The parameter value only takes effect at the start of the connection. +Subsequent changes have no effect. If a specified library is not Will review the code in more details, wanted to get back on the context where this patch is useful first, and try to understand better the trade offs involved. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible bug in CASE evaluation
2013/6/21 Andres Freund and...@2ndquadrant.com: Hi, On 2013-06-21 08:16:22 +, Albe Laurenz wrote: I want to draw attention to this thread on -general: camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com There's also a bug reported for it: #8237: e1uovmc-0007ft...@wrigleys.postgresql.org Would you concur that this is a bug? Yes, I think it's pretty clearly a bug - Tom doesn't seem think so though. If we can agree it is, the fix outlined over on -bugs seems to be easily enough implemented... fix is not easy :-( you should to catch any possible exception, so it means, so you should to do some optimalization under subtransactions - or you should not do this kind of optimizations. Pavel Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for removng unused targets
On Thu, Jun 20, 2013 at 12:19 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: From: Hitoshi Harada [mailto:umi.tan...@gmail.com] I guess the patch works fine, but what I'm saying is it might be limited to small use cases. Another instance of this that I can think of is ORDER BY clause of window specifications, which you may want to remove from the target list as well, in addition to ORDER BY of query. It will just not be removed by this approach, simply because it is looking at only parse-sortClause. Certainly you can add more rules to the new function to look at the window specification, but then I'm not sure what we are missing. Yeah, I thought the extension to the window ORDER BY case, too. But I'm not sure it's worth complicating the code, considering that the objective of this optimization is to improve full-text search related things if I understand correctly, though general solutions would be desirable as you mentioned. Ah, I see the use case now. Makes sense. So, as it stands it doesn't have critical issue, but more generalized approach would be desirable. That said, I don't have strong objection to the current patch, and just posting one thought to see if others may have the same opinion. OK. I'll also wait for others' comments. For review, an updated version of the patch is attached, which fixed the bug using the approach that directly uses the clause information in the parse tree. I tried several ways but I couldn't find big problems. Small typo: s/rejunk/resjunk/ I defer to commiter. Thanks, -- Hitoshi Harada
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
On 06/21/2013 05:32 PM, Hitoshi Harada wrote: I also later found that we are missing not only notion of '+' or '-', but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN needs to detect ERROR if the offset value is negative, but it is not always easy if you think about interval, numeric types as opposed to int64 used in ROWS BETWEEN. Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That should make sense for any type in which the concept of zero makes sense. Thanks for the warning on that issue. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] trgm regex index peculiarity
On Fri, June 21, 2013 05:25, Tom Lane wrote: Erik Rijkers e...@xs4all.nl writes: In a 112 MB test table (containing random generated text) with a trgm index (gin_trgm_ops), I consistently get these timings: select txt from azjunk6 where txt ~ '^abcd'; 130 ms select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd'; 3 ms Hm, could you provide a self-contained test case? yes, sorry. I tested on a 1M row table: #!/bin/sh # create table: for power in 6; do table=azjunk${power} index=${table}_trgm_re_idx perl -E' sub ss{ join,@_[ map{rand @_} 1 .. shift ] }; say(ss(80,a..g, ,h..m, ,n..s, ,t..z)) for 1 .. 1e'${power}; \ | psql -aqXc drop table if exists $table; create table $table(txt text); copy $table from stdin;; echo set session maintenance_work_mem='1GB'; create index $index on $table using gin (txt gin_trgm_ops); analyze $table; | psql -qtAX; done # test: echo \\timing on explain analyze select txt from azjunk6 where txt ~ '^abcd'; -- slow (140 ms) explain analyze select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd'; -- fast (5 ms) | psql -Xqa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for removng unused targets
From: Hitoshi Harada [mailto:umi.tan...@gmail.com] I tried several ways but I couldn't find big problems. Small typo: s/rejunk/resjunk/ Thank you for the review. Attached is an updated version of the patch. Thanks, Best regards, Etsuro Fujita unused-targets-20130621.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
On Fri, Jun 21, 2013 at 3:20 AM, Craig Ringer cr...@2ndquadrant.com wrote: On 06/21/2013 05:32 PM, Hitoshi Harada wrote: I also later found that we are missing not only notion of '+' or '-', but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN needs to detect ERROR if the offset value is negative, but it is not always easy if you think about interval, numeric types as opposed to int64 used in ROWS BETWEEN. Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That should make sense for any type in which the concept of zero makes sense. Yeah, I mean, it needs to know if offset is negative or not by testing with zero. So we need zero value or is_negative function for each type. Thanks, -- Hitoshi Harada
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Fri, Jun 21, 2013 at 6:19 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-06-19 09:55:24 +0900, Michael Paquier wrote: /* Clean up. */ heap_freetuple(reltup1); @@ -1529,12 +1570,13 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap, if (OidIsValid(newrel-rd_rel-reltoastrelid)) { Relationtoastrel; - Oid toastidx; charNewToastName[NAMEDATALEN]; + ListCell *lc; + int count = 0; toastrel = relation_open(newrel-rd_rel-reltoastrelid, AccessShareLock); - toastidx = toastrel-rd_rel-reltoastidxid; + RelationGetIndexList(toastrel); relation_close(toastrel, AccessShareLock); /* rename the toast table ... */ @@ -1543,11 +1585,23 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap, RenameRelationInternal(newrel-rd_rel-reltoastrelid, NewToastName, true); - /* ... and its index too */ - snprintf(NewToastName, NAMEDATALEN, pg_toast_%u_index, - OIDOldHeap); - RenameRelationInternal(toastidx, - NewToastName, true); + /* ... and its indexes too */ + foreach(lc, toastrel-rd_indexlist) + { + /* + * The first index keeps the former toast name and the + * following entries have a suffix appended. + */ + if (count == 0) + snprintf(NewToastName, NAMEDATALEN, pg_toast_%u_index, + OIDOldHeap); + else + snprintf(NewToastName, NAMEDATALEN, pg_toast_%u_index_%d, + OIDOldHeap, count); + RenameRelationInternal(lfirst_oid(lc), + NewToastName, true); + count++; + } } relation_close(newrel, NoLock); } Is it actually possible to get here with multiple toast indexes? Actually it is possible. finish_heap_swap is also called for example in ALTER TABLE where rewriting the table (phase 3), so I think it is better to protect this code path this way. But why would we copy invalid toast indexes over to the new relation? Shouldn't the new relation have been freshly built in the previous steps? What do you think about that? Using only the first valid index would be enough? diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h index 8ac2549..31309ed 100644 --- a/src/include/utils/relcache.h +++ b/src/include/utils/relcache.h @@ -29,6 +29,16 @@ typedef struct RelationData *Relation; typedef Relation *RelationPtr; /* + * RelationGetIndexListIfValid + * Get index list of relation without recomputing it. + */ +#define RelationGetIndexListIfValid(rel) \ +do { \ + if (rel-rd_indexvalid == 0) \ + RelationGetIndexList(rel); \ +} while(0) Isn't this function misnamed and should be RelationGetIndexListIfInValid? When naming that; I had more in mind: get the list of indexes if it is already there. It looks more intuitive to my mind. I can't follow. RelationGetIndexListIfValid() doesn't return anything. And it doesn't do anything if the list is already valid. It only does something iff the list currently is invalid. In this case RelationGetIndexListIfInvalid? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] trgm regex index peculiarity
On Fri, Jun 21, 2013 at 2:40 PM, Erik Rijkers e...@xs4all.nl wrote: On Fri, June 21, 2013 05:25, Tom Lane wrote: Erik Rijkers e...@xs4all.nl writes: In a 112 MB test table (containing random generated text) with a trgm index (gin_trgm_ops), I consistently get these timings: select txt from azjunk6 where txt ~ '^abcd'; 130 ms select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd'; 3 ms Hm, could you provide a self-contained test case? yes, sorry. I tested on a 1M row table: #!/bin/sh # create table: for power in 6; do table=azjunk${power} index=${table}_trgm_re_idx perl -E' sub ss{ join,@_[ map{rand @_} 1 .. shift ] }; say(ss(80,a..g, ,h..m, ,n..s, ,t..z)) for 1 .. 1e'${power}; \ | psql -aqXc drop table if exists $table; create table $table(txt text); copy $table from stdin;; echo set session maintenance_work_mem='1GB'; create index $index on $table using gin (txt gin_trgm_ops); analyze $table; | psql -qtAX; done # test: echo \\timing on explain analyze select txt from azjunk6 where txt ~ '^abcd'; -- slow (140 ms) explain analyze select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd'; -- fast (5 ms) | psql -Xqa Regex '^abcd' will be expanded into trigrams '__a', '_ab', 'abc' and 'bcd'. However trigrams '__a' is much more frequent than '_ab' which in turn is much more frequent than 'abc' and 'bcd'. Ommiting of ^ leads to ommiting of '__a' and '_ab' and that gives so significant speedup. The problem is that trigram regex engine doesn't know that '__a' and '_ab' is more frequent than another trigrams because we don't know their distribution. However, simple assumption that blank character (in pg_trgm meaning) is much more frequent than others seems to be true for most cases. Attached patch implementing this assumption. It introduces BLANK_COLOR_SIZE macro which is used in selectColorTrigrams like count of characters in COLOR_BLANK. Another change in this patch is split of MAX_TRGM_COUNT into WISH_TRGM_SIZE and MAX_TRGM_SIZE. The idea is to keep trying to remove color trigrams from graph even when it fits into MAX_TRGM_SIZE, because we are intending to scan less posting lists/trees. Comments is not fixed yet, coming soon. -- With best regards, Alexander Korotkov. trgm_regex_optimize.1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Fri, Jun 21, 2013 at 12:18 AM, Jeff Davis pg...@j-davis.com wrote: On Thu, 2013-06-20 at 10:03 -0400, Robert Haas wrote: I think the question is whether this feature is really worth adding new reserved keywords for. I have a hard time saying we shouldn't support something that's part of the SQL standard, but personally, it's not something I've seen come up prior to this thread. What's the next step here? Well, ideally, some other people weigh in on the value of the feature vs. the pain of reserving the keywords. The feature sounds useful to me. ...and there's one person with an opinion now! :-) The other question here is - do we actually have the grammar right? As in, is this actually the syntax we're supposed to be implementing? It looks different from what's shown here, where the IGNORE NULLS is inside the function's parentheses, rather than afterwards: http://rwijk.blogspot.com/2010/06/simulating-laglead-ignore-nulls.html IBM seems to think it's legal either inside or outside the parentheses: http://pic.dhe.ibm.com/infocenter/informix/v121/index.jsp?topic=%2Fcom.ibm.sqls.doc%2Fids_sqs_2594.htm Regardless of what syntax we settle on, we should also make sure that the conflict is intrinsic to the grammar and can't be factored out, as Tom suggested upthread. It's not obvious to me what the actual ambiguity is here. If you've seen select lag(num,0) and the lookahead token is respect, what's the problem? It sort of looks like it could be a column label, but not even unreserved keywords can be column labels, so that's not it. Probably deserves a bit more investigation... If the grammar is unacceptable, does someone have an alternative idea, like using new function names instead of grammar? If so, what are reasonable names to use? We could just add additional, optional Boolean argument to the existing functions. It's non-standard, but we avoid adding keywords. Also, I think someone mentioned this already, but what about first_value() and last_value()? Shouldn't we do those at the same time? Not sure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
From: Alvaro Herrera alvhe...@2ndquadrant.com MauMau escribió: One concern is that umount would fail in such a situation because postgres has some open files on the filesystem, which is on the shared disk in case of traditional HA cluster. See my reply to Noah. If postmaster stays around, would this be any different? I don't think so. I'll consider this a bit and respond as a reply to Andres-san's mail. Actually, in further testing I noticed that the fast-path you introduced in BackendCleanup (or was it HandleChildCrash?) in the immediate shutdown case caused postmaster to fail to clean up properly after sending the SIGKILL signal, so I had to remove that as well. Was there any reason for that? You are talking about the code below at the beginning of HandleChildCrash(), aren't you? /* Do nothing if the child terminated due to immediate shutdown */ if (Shutdown == ImmediateShutdown) return; If my memory is correct, this was necessary; without this, HandleChildCrash() or LogChildExit() left extra messages in the server log. LOG: %s (PID %d) exited with exit code %d LOG: terminating any other active server processes These messages are output because postmaster sends SIGQUIT to its children and wait for them to terminate. The children exit with non-zero status when they receive SIGQUIT. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)
On Wed, Jun 19, 2013 at 2:42 PM, Fabien COELHO coe...@cri.ensmp.fr wrote: number of transactions actually processed: 301921 Just a thought before spending too much time on this subtle issue. The patch worked reasonnably for 301900 transactions in your above run, and the few last ones, less than the number of clients, show strange latency figures which suggest that something is amiss in some corner case when pgbench is stopping. However, the point of pgbench is to test a steady state, not to achieve the cleanest stop at the end of a run. So my question is: should this issue be a blocker wrt to the feature? I think so. If it doesn't get fixed now, it's not likely to get fixed later. And the fact that nobody understands why it's happening is kinda worrisome... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] trgm regex index peculiarity
On Fri, June 21, 2013 15:11, Alexander Korotkov wrote: On Fri, Jun 21, 2013 at 2:40 PM, Erik Rijkers e...@xs4all.nl wrote: On Fri, June 21, 2013 05:25, Tom Lane wrote: Erik Rijkers e...@xs4all.nl writes: In a 112 MB test table (containing random generated text) with a trgm index (gin_trgm_ops), I consistently get these timings: select txt from azjunk6 where txt ~ '^abcd'; 130 ms select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd'; 3 ms Regex '^abcd' will be expanded into trigrams '__a', '_ab', 'abc' and 'bcd'. However trigrams '__a' is much more frequent than '_ab' which in turn is much more frequent than 'abc' and 'bcd'. Ommiting of ^ leads to ommiting of '__a' and '_ab' and that gives so significant speedup. [trgm_regex_optimize.1.patch ] Yes, that fixes the problem, thanks. Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On 2013-06-21 20:54:34 +0900, Michael Paquier wrote: On Fri, Jun 21, 2013 at 6:19 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-06-19 09:55:24 +0900, Michael Paquier wrote: @@ -1529,12 +1570,13 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap, Is it actually possible to get here with multiple toast indexes? Actually it is possible. finish_heap_swap is also called for example in ALTER TABLE where rewriting the table (phase 3), so I think it is better to protect this code path this way. But why would we copy invalid toast indexes over to the new relation? Shouldn't the new relation have been freshly built in the previous steps? What do you think about that? Using only the first valid index would be enough? What I am thinking about is the following: When we rewrite a relation, we build a completely new toast relation. Which will only have one index, right? So I don't see how this could could be correct if we deal with multiple indexes. In fact, the current patch's swap_relation_files throws an error if there are multiple ones around. diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h index 8ac2549..31309ed 100644 --- a/src/include/utils/relcache.h +++ b/src/include/utils/relcache.h @@ -29,6 +29,16 @@ typedef struct RelationData *Relation; typedef Relation *RelationPtr; /* + * RelationGetIndexListIfValid + * Get index list of relation without recomputing it. + */ +#define RelationGetIndexListIfValid(rel) \ +do { \ + if (rel-rd_indexvalid == 0) \ + RelationGetIndexList(rel); \ +} while(0) Isn't this function misnamed and should be RelationGetIndexListIfInValid? When naming that; I had more in mind: get the list of indexes if it is already there. It looks more intuitive to my mind. I can't follow. RelationGetIndexListIfValid() doesn't return anything. And it doesn't do anything if the list is already valid. It only does something iff the list currently is invalid. In this case RelationGetIndexListIfInvalid? Yep. Suggested that above ;). Maybe RelationFetchIndexListIfInvalid()? Hm. Looking at how this is currently used - I am afraid it's not correct... the reason RelationGetIndexList() returns a copy is that cache invalidations will throw away that list. And you do index_open() while iterating over it which will accept invalidation messages. Mybe it's better to try using RelationGetIndexList directly and measure whether that has a measurable impact= Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible bug in CASE evaluation
On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote: Andres Freund wrote: Yes, I think it's pretty clearly a bug - Tom doesn't seem think so though. If we can agree it is, the fix outlined over on -bugs seems to be easily enough implemented... If you refer to this: On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote: So it seems we need to stop processing after finding a single WHEN that's not const? Does anybody have a better idea? eval_const_expressions() is not just an optimization: it performs mandatory transformations such as the conversion of named-argument function calls to positional function calls. Even if you could skip it, queries with expensive constant expressions would notice the performance loss. The situations helped by a change like this are too marginal to accept that cost. Would it work to run eval_const_expressions() lazily on THEN clauses? The plan-time eval_const_expressions() would not descend to them. The first ExecEvalCase() to use a given THEN clause would run eval_const_expressions() before proceeding. I think that it is surprising behaviour. That's about how I characterize it, too. I question whether real applications care. It's important to have CASE usable for avoiding data-dependent errors, but what's the use of including in your query an expression that can do nothing but throw an error? Does anyone have a real-world example? Perhaps some generated-query scenario. That being said, if we discover a simple-enough fix that performs well, we may as well incorporate it. If fixing the behaviour is undesirable, at least the documentation should be fixed. A brief documentation mention sounds fine. Perhaps add a paragraph on constant folding in general and reference that from the CASE page. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
From: Alvaro Herrera alvhe...@2ndquadrant.com Actually, I think it would be cleaner to have a new state in pmState, namely PM_IMMED_SHUTDOWN which is entered when we send SIGQUIT. When we're in this state, postmaster is only waiting for the timeout to expire; and when it does, it sends SIGKILL and exits. Pretty much the same you have, except that instead of checking AbortStartTime we check the pmState variable. Are you suggesting simplifying the following part in ServerLoop()? I welcome the idea if this condition becomes simpler. However, I cannot imagine how. if (AbortStartTime 0 /* SIGKILL only once */ (Shutdown == ImmediateShutdown || (FatalError !SendStop)) now - AbortStartTime = 10) { SignalAllChildren(SIGKILL); AbortStartTime = 0; } I thought of adding some new state of pmState for some reason (that might be the same as your idea). But I refrained from doing that, because pmState has already many states. I was afraid adding a new pmState value for this bug fix would complicate the state management (e.g. state transition in PostmasterStateMachine()). In addition, I felt PM_WAIT_BACKENDS was appropriate because postmaster is actually waiting for backends to terminate after sending SIGQUIT. The state name is natural. I don't have strong objection to your idea if it makes the code cleaner and more understandable. Thank you very much. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible bug in CASE evaluation
On 2013-06-21 09:51:05 -0400, Noah Misch wrote: On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote: Andres Freund wrote: Yes, I think it's pretty clearly a bug - Tom doesn't seem think so though. If we can agree it is, the fix outlined over on -bugs seems to be easily enough implemented... If you refer to this: On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote: So it seems we need to stop processing after finding a single WHEN that's not const? Does anybody have a better idea? eval_const_expressions() is not just an optimization: it performs mandatory transformations such as the conversion of named-argument function calls to positional function calls. Ah yes. Forgot about that... Scrap that. Although it surely isn't nice that all that is done in a function calleed eval_const_expressions()... Even if you could skip it, queries with expensive constant expressions would notice the performance loss. The situations helped by a change like this are too marginal to accept that cost. I have to say, that argument doesn't excite me mu8ch. It's not like we don't want to do the constant expression evaluation at all anymore. Just not inside CASE WHEN blocks which already are barring some optimizations anyway... Would it work to run eval_const_expressions() lazily on THEN clauses? The plan-time eval_const_expressions() would not descend to them. The first ExecEvalCase() to use a given THEN clause would run eval_const_expressions() before proceeding. Ugh. Doesn't sound nice. I don't have any better ideas than to actually split eval_const_expressions into one function that does the necessary things like canonicalization of AND/OR and one that actually evaluates expressions inside though. So maybe that's the way to go :/ I think that it is surprising behaviour. That's about how I characterize it, too. I question whether real applications care. It's important to have CASE usable for avoiding data-dependent errors, but what's the use of including in your query an expression that can do nothing but throw an error? Does anyone have a real-world example? Perhaps some generated-query scenario. It doesn't need to be an actual constant. Something that evaluates to the value at plan time is enough: PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END; EXECUTE foo(0); That example will most likely only crashes in 9.2+ because it will replan it with the acutal parameter values in place. But you could have the same in earlier versions e.g. using PQExecParams(), but that's harder to demonstrate. Now, that example only crashes because one place uses (SELECT $1) and the other doesn't, but... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible bug in CASE evaluation
On 2013-06-21 09:51:05 -0400, Noah Misch wrote: That being said, if we discover a simple-enough fix that performs well, we may as well incorporate it. What about passing another parameter down eval_const_expressions_mutator (which is static, so changing the API isn't a problem) that basically tells us whether we actually should evaluate expressions or just perform the transformation? There's seems to be basically a couple of places where we call dangerous things: * simplify_function (via -evaluate_function-evaluate_expr) which is called in a bunch of places * evaluate_expr which is directly called in T_ArrayExpr T_ArrayCoerceExpr All places I've inspected so far need to deal with simplify_function returning NULL anyway, so that seems like a viable fix. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible bug in CASE evaluation
2013/6/21 Andres Freund and...@2ndquadrant.com: On 2013-06-21 09:51:05 -0400, Noah Misch wrote: On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote: Andres Freund wrote: Yes, I think it's pretty clearly a bug - Tom doesn't seem think so though. If we can agree it is, the fix outlined over on -bugs seems to be easily enough implemented... If you refer to this: On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote: So it seems we need to stop processing after finding a single WHEN that's not const? Does anybody have a better idea? eval_const_expressions() is not just an optimization: it performs mandatory transformations such as the conversion of named-argument function calls to positional function calls. Ah yes. Forgot about that... Scrap that. Although it surely isn't nice that all that is done in a function calleed eval_const_expressions()... Even if you could skip it, queries with expensive constant expressions would notice the performance loss. The situations helped by a change like this are too marginal to accept that cost. yes, I dislike it too - then we can have inconsistent behave of constant between CASE and other statements. We should to do without any performance lost, if we do some changes in this area. Regards Pavel I have to say, that argument doesn't excite me mu8ch. It's not like we don't want to do the constant expression evaluation at all anymore. Just not inside CASE WHEN blocks which already are barring some optimizations anyway... Would it work to run eval_const_expressions() lazily on THEN clauses? The plan-time eval_const_expressions() would not descend to them. The first ExecEvalCase() to use a given THEN clause would run eval_const_expressions() before proceeding. Ugh. Doesn't sound nice. I don't have any better ideas than to actually split eval_const_expressions into one function that does the necessary things like canonicalization of AND/OR and one that actually evaluates expressions inside though. So maybe that's the way to go :/ I think that it is surprising behaviour. That's about how I characterize it, too. I question whether real applications care. It's important to have CASE usable for avoiding data-dependent errors, but what's the use of including in your query an expression that can do nothing but throw an error? Does anyone have a real-world example? Perhaps some generated-query scenario. It doesn't need to be an actual constant. Something that evaluates to the value at plan time is enough: PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END; EXECUTE foo(0); That example will most likely only crashes in 9.2+ because it will replan it with the acutal parameter values in place. But you could have the same in earlier versions e.g. using PQExecParams(), but that's harder to demonstrate. Now, that example only crashes because one place uses (SELECT $1) and the other doesn't, but... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible bug in CASE evaluation
On Fri, Jun 21, 2013 at 04:12:32PM +0200, Andres Freund wrote: On 2013-06-21 09:51:05 -0400, Noah Misch wrote: On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote: Even if you could skip it, queries with expensive constant expressions would notice the performance loss. The situations helped by a change like this are too marginal to accept that cost. I have to say, that argument doesn't excite me mu8ch. It's not like we don't want to do the constant expression evaluation at all anymore. Just not inside CASE WHEN blocks which already are barring some optimizations anyway... Sure, it's a narrow loss. Before introducing a new narrow loss to fix an existing one, we should consider which loss hurts more. Offhand, I sympathize with the folks who would lose performance more than with the folks who want to write the sorts of expressions under consideration. Would it work to run eval_const_expressions() lazily on THEN clauses? The plan-time eval_const_expressions() would not descend to them. The first ExecEvalCase() to use a given THEN clause would run eval_const_expressions() before proceeding. Ugh. Doesn't sound nice. Would you elaborate? I question whether real applications care. It's important to have CASE usable for avoiding data-dependent errors, but what's the use of including in your query an expression that can do nothing but throw an error? Does anyone have a real-world example? Perhaps some generated-query scenario. It doesn't need to be an actual constant. Something that evaluates to the value at plan time is enough: PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END; EXECUTE foo(0); Now, that example only crashes because one place uses (SELECT $1) and the other doesn't, but... Not the real-world I was hoping for, but fair enough. Crash in this context means raise an error, right? -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]
On Wed, Jun 19, 2013 at 1:59 PM, Peter Eisentraut pete...@gmx.net wrote: On 6/7/13 12:14 AM, Amit Kapila wrote: I will change the patch as per below syntax if there are no objections: ALTER SYSTEM SET configuration_parameter {TO | =} {value, | 'value'}; I do like using ALTER SYSTEM in general, but now that I think about it, the 9.3 feature to create global settings in pg_db_role_setting would also have been a candidate for exactly that same syntax if it had been available. In fact, if we do add ALTER SYSTEM, it might make sense to recast that feature into that syntax. It might be clearer to do something like ALTER SYSTEM SET EXTERNAL FILE or something like that. It's only a small syntax change, so don't worry about it too much, but let's keep thinking about it. I think that anything we want to add should either go before SET or after the value, such as: ALTER SYSTEM SET configuration_parameter = 'value' IN FILE 'myconf.conf'; ALTER SYSTEM CONFIGURATION FILE SET configuration_parameter = 'file'; ALTER SYSTEM CONFIGURATION FILE 'myconf.conf' SET configuration_parameter = 'file'; I agree we shouldn't back ourselves into a syntactic corner. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add visibility map information to pg_freespace.
On Wed, Jun 19, 2013 at 11:26 PM, Satoshi Nagayasu sn...@uptime.jp wrote: - pageinspect provies several functions for debugging purpose. - pg_freespace provies a view for monitoring purpose. - pgstattuple provies several functions for collecting specific table/index statistics. I think we should be careful to think about upgrade considerations when adding this functionality. Bumping the module version number to add new functions is less likely to break things for users than changing the return value of an existing SRF. Maybe that's too far down in the weeds to worry about, but it's a thought - especially for pg_freespace, where there's no real efficiency benefit to have the same function look at the FSM and the VM anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
On Thu, Jun 20, 2013 at 12:33 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I will go with 5 seconds, then. I'm uncomfortable with this whole concept, and particularly with such a short timeout. On a very busy system, things can take a LOT longer than they think we should; it can take 30 seconds or more just to get a prompt back from a shell command. 5 seconds is the blink of an eye. More generally, what do we think the point is of sending SIGQUIT rather than SIGKILL in the first place, and why does that point cease to be valid after 5 seconds? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Fri, 2013-06-21 at 09:21 -0400, Robert Haas wrote: The other question here is - do we actually have the grammar right? As in, is this actually the syntax we're supposed to be implementing? It looks different from what's shown here, where the IGNORE NULLS is inside the function's parentheses, rather than afterwards: http://rwijk.blogspot.com/2010/06/simulating-laglead-ignore-nulls.html IBM seems to think it's legal either inside or outside the parentheses: http://pic.dhe.ibm.com/infocenter/informix/v121/index.jsp?topic=%2Fcom.ibm.sqls.doc%2Fids_sqs_2594.htm The spec seems pretty clear that it falls outside of the parentheses, unless I am missing something. Regardless of what syntax we settle on, we should also make sure that the conflict is intrinsic to the grammar and can't be factored out, as Tom suggested upthread. It's not obvious to me what the actual ambiguity is here. If you've seen select lag(num,0) and the lookahead token is respect, what's the problem? It sort of looks like it could be a column label, but not even unreserved keywords can be column labels, so that's not it. Probably deserves a bit more investigation... I think the problem is when the function is used as a table function; e.g.: SELECT * FROM generate_series(1,10) respect; We could just add additional, optional Boolean argument to the existing functions. It's non-standard, but we avoid adding keywords. I thought about that, but it is awkward because the argument would have to be constant (or, if not, it would be quite strange). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why can't I use windowing functions over ordered aggregates?
Le vendredi 21 juin 2013 03:32:33, Josh Berkus a écrit : Hackers, So, I can create a custom aggregate first and do this: SELECT first(val order by ts desc) ... And I can do this: SELECT first_value(val) OVER (order by ts desc) ... but I can't do this: SELECT first_value(val order by ts desc) ... even though under the hood, it's the exact same operation. First I'm not sure it is the same, in a window frame you have the notion of peer-rows (when you use ORDER BY). And also, first_value is a *window* function, not a simple aggregate function... See this example: # create table foo (i int, t timestamptz); # insert into foo select n, now() from generate_series(1,10) g(n); # select i, first_value(i) over (order by t desc) from foo; # select i, first_value(i) over (order by t desc ROWS between 0 PRECEDING and UNBOUNDED FOLLOWING) from foo; What do you expect SELECT first(val order by ts desc) to output ? -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation signature.asc Description: This is a digitally signed message part.
[HACKERS] Hardware donation
We've got some recently decommissioned servers and Enova is willing to donate 2 of them to the community. There's nothing terribly spectacular about the servers except for memory. We have one 512G server available and the other would be either 192G or 96G. I know that folks already have access to machines with a lot of cores, but I haven't seen reports of large memory machines. CPU details vary but we're only looking at 20ish cores (though AFAIK they're all 4 socket servers if that matters). Local drives are nothing fancy (though some might possibly be SSD). -- Jim Nasby, Lead Data Architect (512) 569-9461 (primary) (512) 579-9024 (backup) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hardware donation
On Fri, Jun 21, 2013 at 10:18 PM, Jim Nasby jna...@enova.com wrote: We've got some recently decommissioned servers and Enova is willing to donate 2 of them to the community. There's nothing terribly spectacular about the servers except for memory. We have one 512G server available and the other would be either 192G or 96G. I know that folks already have access to machines with a lot of cores, but I haven't seen reports of large memory machines. CPU details vary but we're only looking at 20ish cores (though AFAIK they're all 4 socket servers if that matters). Local drives are nothing fancy (though some might possibly be SSD). Woot! -- Regards, Atri l'apprenant -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hardware donation
I stand corrected... we don't have a 512G server available. We do have plenty of 192G and 96G servers though if 2 of those would be of use. Sorry for the noise. On 6/21/13 11:48 AM, Jim Nasby wrote: We've got some recently decommissioned servers and Enova is willing to donate 2 of them to the community. There's nothing terribly spectacular about the servers except for memory. We have one 512G server available and the other would be either 192G or 96G. I know that folks already have access to machines with a lot of cores, but I haven't seen reports of large memory machines. CPU details vary but we're only looking at 20ish cores (though AFAIK they're all 4 socket servers if that matters). Local drives are nothing fancy (though some might possibly be SSD). -- Jim Nasby, Lead Data Architect (512) 569-9461 (primary) (512) 579-9024 (backup) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]
On Sat, Jun 22, 2013 at 12:11 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 19, 2013 at 1:59 PM, Peter Eisentraut pete...@gmx.net wrote: On 6/7/13 12:14 AM, Amit Kapila wrote: I will change the patch as per below syntax if there are no objections: ALTER SYSTEM SET configuration_parameter {TO | =} {value, | 'value'}; I do like using ALTER SYSTEM in general, but now that I think about it, the 9.3 feature to create global settings in pg_db_role_setting would also have been a candidate for exactly that same syntax if it had been available. In fact, if we do add ALTER SYSTEM, it might make sense to recast that feature into that syntax. It might be clearer to do something like ALTER SYSTEM SET EXTERNAL FILE or something like that. It's only a small syntax change, so don't worry about it too much, but let's keep thinking about it. I think that anything we want to add should either go before SET or after the value, such as: ALTER SYSTEM SET configuration_parameter = 'value' IN FILE 'myconf.conf'; ALTER SYSTEM CONFIGURATION FILE SET configuration_parameter = 'file'; ALTER SYSTEM CONFIGURATION FILE 'myconf.conf' SET configuration_parameter = 'file'; I agree we shouldn't back ourselves into a syntactic corner. Maybe this idea has been already discussed, but why don't we just add the function like update_config_file(parameter, value)? We can commit the core of the patch with that function API first, and then we can discuss the syntax and add another API like ALTER SYSTEM later. We now have set_config() function to set the parameter, so adding the function for this feature should not be a surprise. If the name of the function is, for example, update_conf_file, most users would think that it's intuitive even if SIGHUP is not automatically sent immediately. We don't need to emit the message like 'This setting change will not be loaded until SIGHUP'. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why can't I use windowing functions over ordered aggregates?
Cédric Villemain-2 wrote And also, first_value is a *window* function, not a simple aggregate function... Per the documentation any aggregate function can be used with a WINDOW declaration. The logical question is why are window aggregates special so that the reverse cannot be true? In other words why is not every function simply defined as a normal aggregate that can be used in both contexts? See this example: # create table foo (i int, t timestamptz); # insert into foo select n, now() from generate_series(1,10) g(n); # select i, first_value(i) over (order by t desc) from foo; # select i, first_value(i) over (order by t desc ROWS between 0 PRECEDING and UNBOUNDED FOLLOWING) from foo; What do you expect SELECT first(val order by ts desc) to output ? Undefined due to incorrect specificity of the ORDER BY definition. The window version has the same issue. The window aggregates should simply treat the entire input set as the relevant frame - basically the same output as would result from (simplistically): SELECT window_agg(...) FROM ( SELECT id, window_agg(...) OVER (ORDER BY id ASC) ORDER BY id ASC ) agg ORDER BY id DESC LIMIT 1 Admittedly this really only makes sense for first_value, last_value, and nth_value; the other window aggregates can return valid values but to have meaning they really need to be output in a windowing context. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Why-can-t-I-use-windowing-functions-over-ordered-aggregates-tp5760233p5760358.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why can't I use windowing functions over ordered aggregates?
Cedric, See this example: # create table foo (i int, t timestamptz); # insert into foo select n, now() from generate_series(1,10) g(n); # select i, first_value(i) over (order by t desc) from foo; # select i, first_value(i) over (order by t desc ROWS between 0 PRECEDING and UNBOUNDED FOLLOWING) from foo; What do you expect SELECT first(val order by ts desc) to output ? Ah, right, I see what you mean. Yeah, I was doing queries without peer rows, so it looked the same to me, and it uses some of the same machinery. But of course it's not completely the same. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Review [was Re: [HACKERS] MD5 aggregate]
On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: On 15 June 2013 10:22, Dean Rasheed dean.a.rash...@gmail.com wrote: There seem to be 2 separate directions that this could go, which really meet different requirements: 1). Produce an unordered sum for SQL to compare 2 tables regardless of the order in which they are scanned. A possible approach to this might be something like an aggregate md5_total(text/bytea) returns text that returns the sum of the md5 values of each input value, treating each md5 value as an unsigned 128-bit integer, and then producing the hexadecimal representation of the final sum. This should out-perform a solution based on numeric addition, and in typical cases, the result wouldn't be much longer than a regular md5 sum, and so would be easy to eyeball for differences. I've been playing around with the idea of an aggregate that computes the sum of the md5 hashes of each of its inputs, which I've called md5_total() for now, although I'm not particularly wedded to that name. Comparing it with md5_agg() on a 100M row table (see attached test script) produces interesting results: SELECT md5_agg(foo.*::text) FROM (SELECT * FROM foo ORDER BY id) foo; 50bc42127fb9b028c9708248f835ed8f Time: 92960.021 ms SELECT md5_total(foo.*::text) FROM foo; 02faea7fafee4d253fc94cfae031afc43c03479c Time: 96190.343 ms Unlike md5_agg(), it is no longer a true MD5 sum (for one thing, its result is longer) but it seems like it would be very useful for quickly comparing data in SQL, since its value is not dependent on the row-order making it easier to use and better performing if there is no usable index for ordering. Note, however, that if there is an index that can be used for ordering, the performance is not necessarily better than md5_agg(), as this example shows. There is a small additional overhead per row for initialising the MD5 sums, and adding the results to the total, but I think the biggest factor is that md5_total() is processing more data. The reason is that MD5 works on 64-byte blocks, so the total amount of data going through the core MD5 algorithm is each row's size is rounded up to a multiple of 64. In this simple case it ends up processing around 1.5 times as much data: SELECT sum(length(foo.*::text)) AS md5_agg, sum(((length(foo.*::text)+63)/64)*64) AS md5_total FROM foo; md5_agg | md5_total +- 8103815438 | 12799909248 although of course that overhead won't be as large on wider tables, and even in this case the overall performance is still on a par with md5_agg(). ISTM that both aggregates are potentially useful in different situations. I would probably typically use md5_total() because of its simplicity/order-independence and consistent performance, but md5_agg() might also be useful when comparing with external data. Regards, Dean Submission review: Is the patch in a patch format which has context? (eg: context diff format) Yes. Does it apply cleanly to the current git master? Yes. Does it include reasonable tests, necessary doc patches, etc? Yes. Usability review: Does the patch actually implement that? Yes. Do we want that? Enough do. Do we already have it? No. Does it follow SQL spec, or the community-agreed behavior? No, and yes, respectively. Does it include pg_dump support (if applicable)? N/A Are there dangers? Patch changes the MD5 implementation, which could theoretically result in backward incompatibility if the changes are not 100% backward-compatible. Have all the bases been covered? Yes. Feature test: Does the feature work as advertised? Yes. Are there corner cases the author has failed to consider? Not that I've found so far. Are there any assertion failures or crashes? No. Performance review (skills needed: ability to time performance) Does the patch slow down simple tests? Yes. For an MD5 checksum of some random data, here are results from master: shackle@postgres:5493=# WITH t1 AS (SELECT string_agg(chr(floor(255*random()+1)::int),'') AS a FROM generate_series(1,1)), postgres-# t2 AS (SELECT a FROM t1 CROSS JOIN generate_series(1,1)) postgres-# select md5(a::text) FROM t2; shackle@postgres:5493=# \timing Timing is on. shackle@postgres:5493=# \g Time: 955.393 ms shackle@postgres:5493=# \g Time: 950.909 ms shackle@postgres:5493=# \g Time: 947.955 ms shackle@postgres:5493=# \g Time: 946.193 ms shackle@postgres:5493=# \g Time: 950.591 ms shackle@postgres:5493=# \g Time: 952.346 ms
[HACKERS] Unaccent performance
Hi, The unaccent extension is great, especially with its customisability, but it's not always easy to recommend. I witnessed a customer using no less than 56 nested replace functions in an SQL function. I looked to see how much this can be mitigated by unaccent. It turns out that not all the characters they were replacing can be replaced by unaccent, either because they replace more than 1 character at a time, or the character they're replacing, for some reason, isn't processed by unaccent, even with a custom rules file. So there were 20 characters I could identify that they were replacing. I made a custom rules file and compared its performance to the difficult-to-manage set of nested replace calls. CREATE OR REPLACE FUNCTION public.myunaccent(sometext text) RETURNS text LANGUAGE sql IMMUTABLE AS $function$ SELECT replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(sometext,'ą','a'),'Ą','A'),'ă','a'),'Ă','A'),'ā','a'),'Ā','A'),'æ','a'),'å','a'),'ä','a'),'ã','a'),'â','a'),'á','a'),'à','a'),'Æ','A'),'Å','A'),'Ä','A'),'Ã','A'),'Â','A'),'Á','A'),'À','A') ; $function$ postgres=# SELECT myunaccent(sometext::text) FROM (SELECT 'ÀÁÂÃÄÅÆàáâãäåæĀāĂ㥹' sometext FROM generate_series(1,100)) x OFFSET 99 LIMIT 1; myunaccent -- AAAaaaAaAaAa (1 row) Time: 726.282 ms postgres=# SELECT unaccent(sometext::text) FROM (SELECT 'ÀÁÂÃÄÅÆàáâãäåæĀāĂ㥹' sometext FROM generate_series(1,100)) x OFFSET 99 LIMIT 1; unaccent -- AAAaaaAaAaAa (1 row) Time: 3305.252 ms The timings are actually pretty much the same even if I introduce 187 nested replace calls for every line in the unaccent.rules file for 187 characters. But the same character set with unaccent increases to 7418.526 ms with the same type of query as above. That's 10 times more expensive. Is there a way to boost the performance to make its adoption more palatable? -- Thom
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Fri, Jun 21, 2013 at 11:33 AM, Jeff Davis pg...@j-davis.com wrote: Regardless of what syntax we settle on, we should also make sure that the conflict is intrinsic to the grammar and can't be factored out, as Tom suggested upthread. It's not obvious to me what the actual ambiguity is here. If you've seen select lag(num,0) and the lookahead token is respect, what's the problem? It sort of looks like it could be a column label, but not even unreserved keywords can be column labels, so that's not it. Probably deserves a bit more investigation... I think the problem is when the function is used as a table function; e.g.: SELECT * FROM generate_series(1,10) respect; Ah ha. Well, there's probably not much help for that. Disallowing keywords as table aliases would be a cure worse than the disease, I think. I suppose the good news is that there probably aren't many people using RESPECT as a column name, but I have a feeling that we're almost certain to get complaints about reserving IGNORE. I think that will have to be quoted as a PL/pgsql variable name as well. :-( We could just add additional, optional Boolean argument to the existing functions. It's non-standard, but we avoid adding keywords. I thought about that, but it is awkward because the argument would have to be constant (or, if not, it would be quite strange). True... but e.g. string_agg() has the same issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]
On Fri, Jun 21, 2013 at 12:56 PM, Fujii Masao masao.fu...@gmail.com wrote: On Sat, Jun 22, 2013 at 12:11 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 19, 2013 at 1:59 PM, Peter Eisentraut pete...@gmx.net wrote: On 6/7/13 12:14 AM, Amit Kapila wrote: I will change the patch as per below syntax if there are no objections: ALTER SYSTEM SET configuration_parameter {TO | =} {value, | 'value'}; I do like using ALTER SYSTEM in general, but now that I think about it, the 9.3 feature to create global settings in pg_db_role_setting would also have been a candidate for exactly that same syntax if it had been available. In fact, if we do add ALTER SYSTEM, it might make sense to recast that feature into that syntax. It might be clearer to do something like ALTER SYSTEM SET EXTERNAL FILE or something like that. It's only a small syntax change, so don't worry about it too much, but let's keep thinking about it. I think that anything we want to add should either go before SET or after the value, such as: ALTER SYSTEM SET configuration_parameter = 'value' IN FILE 'myconf.conf'; ALTER SYSTEM CONFIGURATION FILE SET configuration_parameter = 'file'; ALTER SYSTEM CONFIGURATION FILE 'myconf.conf' SET configuration_parameter = 'file'; I agree we shouldn't back ourselves into a syntactic corner. Maybe this idea has been already discussed, but why don't we just add the function like update_config_file(parameter, value)? We can commit the core of the patch with that function API first, and then we can discuss the syntax and add another API like ALTER SYSTEM later. We now have set_config() function to set the parameter, so adding the function for this feature should not be a surprise. If the name of the function is, for example, update_conf_file, most users would think that it's intuitive even if SIGHUP is not automatically sent immediately. We don't need to emit the message like 'This setting change will not be loaded until SIGHUP'. I could certainly support that plan. I'm satisfied with the ALTER SYSTEM syntax and feel that might find other plausible uses, but functional notation would be OK with me, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Thu, Jun 20, 2013 at 12:18 AM, Amit Kapila amit.kap...@huawei.com wrote: Auto.conf- 1 Vote (Josh) System.auto.conf - 1 Vote (Josh) Postgresql.auto.conf - 2 Votes (Zoltan, Amit) Persistent.auto.conf - 0 Vote generated_by_server.conf - 1 Vote (Peter E) System.conf - 1 Vote (Magnus) alter_system.conf- 1 Vote (Alvaro) I had consolidated the list, so that it would be helpful for committer to decide among proposed names for this file. I'll also vote for postgresql.auto.conf. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hardware donation
On 06/21/2013 09:48 AM, Jim Nasby wrote: We've got some recently decommissioned servers and Enova is willing to donate 2 of them to the community. There's nothing terribly spectacular about the servers except for memory. We have one 512G server available and the other would be either 192G or 96G. I know that folks already have access to machines with a lot of cores, but I haven't seen reports of large memory machines. CPU details vary but we're only looking at 20ish cores (though AFAIK they're all 4 socket servers if that matters). Local drives are nothing fancy (though some might possibly be SSD). I'm sure we could use these for the performance test farm. If we need to replace some of the drives, the community has money for that. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hardware donation
On 06/21/2013 09:48 AM, Jim Nasby wrote: We've got some recently decommissioned servers and Enova is willing to donate 2 of them to the community. There's nothing terribly spectacular about the servers except for memory. We have one 512G server available and the other would be either 192G or 96G. I know that folks already have access to machines with a lot of cores, but I haven't seen reports of large memory machines. CPU details vary but we're only looking at 20ish cores (though AFAIK they're all 4 socket servers if that matters). Local drives are nothing fancy (though some might possibly be SSD). A couple 192G machines to put in the performance lab would be nice, especially if they have SSD. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XLogInsert scaling, revisited
On Tue, Jun 18, 2013 at 11:28 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 18.06.2013 21:17, Jeff Janes wrote: Hi Heikki, I am getting conflicts applying version 22 of this patch to 9.4dev. Could you rebase? Here you go. I think I'm getting an undetected deadlock between the checkpointer and a user process running a TRUNCATE command. This is the checkpointer: #0 0x003a73eeaf37 in semop () from /lib64/libc.so.6 #1 0x005ff847 in PGSemaphoreLock (sema=0x7f8c0a4eb730, interruptOK=0 '\000') at pg_sema.c:415 #2 0x004b0abf in WaitOnSlot (upto=416178159648) at xlog.c:1775 #3 WaitXLogInsertionsToFinish (upto=416178159648) at xlog.c:2086 #4 0x004b657a in CopyXLogRecordToWAL (write_len=32, isLogSwitch=1 '\001', rdata=0x0, StartPos=value optimized out, EndPos=416192397312) at xlog.c:1389 #5 0x004b6fb2 in XLogInsert (rmid=0 '\000', info=value optimized out, rdata=0x7fff0020) at xlog.c:1209 #6 0x004b7644 in RequestXLogSwitch () at xlog.c:8748 #7 0x00611be3 in CheckArchiveTimeout () at checkpointer.c:622 #8 0x00611d97 in CheckpointWriteDelay (flags=value optimized out, progress=value optimized out) at checkpointer.c:705 #9 0x0062ec5a in BufferSync (flags=64) at bufmgr.c:1322 #10 CheckPointBuffers (flags=64) at bufmgr.c:1828 #11 0x004b1393 in CheckPointGuts (checkPointRedo=416178159592, flags=64) at xlog.c:8365 #12 0x004b8ff3 in CreateCheckPoint (flags=64) at xlog.c:8148 #13 0x006121c3 in CheckpointerMain () at checkpointer.c:502 #14 0x004c4c4a in AuxiliaryProcessMain (argc=2, argv=0x7fff21c4a5d0) at bootstrap.c:439 #15 0x0060a68c in StartChildProcess (type=CheckpointerProcess) at postmaster.c:4954 #16 0x0060d1ea in reaper (postgres_signal_arg=value optimized out) at postmaster.c:2571 #17 signal handler called #18 0x003a73ee14d3 in __select_nocancel () from /lib64/libc.so.6 #19 0x0060efee in ServerLoop (argc=value optimized out, argv=value optimized out) at postmaster.c:1537 #20 PostmasterMain (argc=value optimized out, argv=value optimized out) at postmaster.c:1246 #21 0x005ad4e0 in main (argc=3, argv=0x179fd00) at main.c:196 And this is the TRUNCATE command. #0 0x003a73eeaf37 in semop () from /lib64/libc.so.6 #1 0x005ff847 in PGSemaphoreLock (sema=0x7f8c0a4ea8d0, interruptOK=0 '\000') at pg_sema.c:415 #2 0x004b002d in WALInsertSlotAcquireOne (slotno=-1) at xlog.c:1667 #3 0x004b6d5d in XLogInsert (rmid=0 '\000', info=value optimized out, rdata=0x7fff21c4a5e0) at xlog.c:1115 #4 0x004b8abc in XLogPutNextOid (nextOid=67198981) at xlog.c:8704 #5 0x004a3bc2 in GetNewObjectId () at varsup.c:495 #6 0x004c5195 in GetNewRelFileNode (reltablespace=value optimized out, pg_class=0x0, relpersistence=value optimized out) at catalog.c:437 #7 0x0070f52d in RelationSetNewRelfilenode (relation=0x7f8c019cb2a0, freezeXid=2144367079, minmulti=1) at relcache.c:2758 #8 0x0055de61 in ExecuteTruncate (stmt=value optimized out) at tablecmds.c:1163 #9 0x00656080 in standard_ProcessUtility (parsetree=0x2058900, queryString=value optimized out, context=value optimized out, params=0x0, dest=value optimized out, completionTag=value optimized out) at utility.c:552 #10 0x00652a87 in PortalRunUtility (portal=0x17bf510, utilityStmt=0x2058900, isTopLevel=1 '\001', dest=0x2058c40, completionTag=0x7fff21c4a9a0 ) at pquery.c:1187 #11 0x006539fd in PortalRunMulti (portal=0x17bf510, isTopLevel=1 '\001', dest=0x2058c40, altdest=0x2058c40, completionTag=0x7fff21c4a9a0 ) at pquery.c:1318 #12 0x006540b3 in PortalRun (portal=0x17bf510, count=9223372036854775807, isTopLevel=1 '\001', dest=0x2058c40, altdest=0x2058c40, completionTag=0x7fff21c4a9a0 ) at pquery.c:816 #13 0x00650944 in exec_simple_query (query_string=0x2057e90 truncate foo) at postgres.c:1048 #14 0x00651fe9 in PostgresMain (argc=value optimized out, argv=value optimized out, dbname=0x1fc9e98 jjanes, username=value optimized out) at postgres.c:3985 #15 0x0060f80b in BackendRun (argc=value optimized out, argv=value optimized out) at postmaster.c:3987 #16 BackendStartup (argc=value optimized out, argv=value optimized out) at postmaster.c:3676 #17 ServerLoop (argc=value optimized out, argv=value optimized out) at postmaster.c:1577 #18 PostmasterMain (argc=value optimized out, argv=value optimized out) at postmaster.c:1246 #19 0x005ad4e0 in main (argc=3, argv=0x179fd00) at main.c:196 This is using the same testing harness as in the last round of this patch. Is there a way for me to dump the list of held/waiting lwlocks from gdb? Cheers, Jeff
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
Robert Haas robertmh...@gmail.com writes: More generally, what do we think the point is of sending SIGQUIT rather than SIGKILL in the first place, and why does that point cease to be valid after 5 seconds? Well, mostly it's about telling the client we're committing hara-kiri. Without that, there's no very good reason to run quickdie() at all. A practical issue with starting to send SIGKILL ourselves is that we will no longer be able to reflexively diagnose server process died on signal 9 as the linux OOM killer got you. I'm not at all convinced that the cases where SIGQUIT doesn't work are sufficiently common to justify losing that property. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hardware donation
On 6/21/13 1:45 PM, Josh Berkus wrote: On 06/21/2013 09:48 AM, Jim Nasby wrote: We've got some recently decommissioned servers and Enova is willing to donate 2 of them to the community. There's nothing terribly spectacular about the servers except for memory. We have one 512G server available and the other would be either 192G or 96G. I know that folks already have access to machines with a lot of cores, but I haven't seen reports of large memory machines. CPU details vary but we're only looking at 20ish cores (though AFAIK they're all 4 socket servers if that matters). Local drives are nothing fancy (though some might possibly be SSD). I'm sure we could use these for the performance test farm. If we need to replace some of the drives, the community has money for that. We might actually have some spare SSDs floating around; I'm checking. We're also thinking we might be able to get at least one of these up to 256G by swapping memory around. Am I correct that the most valuable thing to the community the large memory size? Who can be point of contact from the community to arrange shipping, etc? -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hardware donation
On Fri, Jun 21, 2013 at 12:03 PM, Jim Nasby j...@nasby.net wrote: On 6/21/13 1:45 PM, Josh Berkus wrote: On 06/21/2013 09:48 AM, Jim Nasby wrote: We've got some recently decommissioned servers and Enova is willing to donate 2 of them to the community. There's nothing terribly spectacular about the servers except for memory. We have one 512G server available and the other would be either 192G or 96G. I know that folks already have access to machines with a lot of cores, but I haven't seen reports of large memory machines. CPU details vary but we're only looking at 20ish cores (though AFAIK they're all 4 socket servers if that matters). Local drives are nothing fancy (though some might possibly be SSD). I'm sure we could use these for the performance test farm. If we need to replace some of the drives, the community has money for that. We might actually have some spare SSDs floating around; I'm checking. We're also thinking we might be able to get at least one of these up to 256G by swapping memory around. Am I correct that the most valuable thing to the community the large memory size? Yeah, I believe it's memory and storage. Who can be point of contact from the community to arrange shipping, etc? I can be. Regards, Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Thanks! On Fri, Jun 21, 2013 at 4:35 AM, Hitoshi Harada umi.tan...@gmail.comwrote: On Fri, Jun 21, 2013 at 3:20 AM, Craig Ringer cr...@2ndquadrant.comwrote: On 06/21/2013 05:32 PM, Hitoshi Harada wrote: I also later found that we are missing not only notion of '+' or '-', but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN needs to detect ERROR if the offset value is negative, but it is not always easy if you think about interval, numeric types as opposed to int64 used in ROWS BETWEEN. Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That should make sense for any type in which the concept of zero makes sense. Yeah, I mean, it needs to know if offset is negative or not by testing with zero. So we need zero value or is_negative function for each type. Thanks, -- Hitoshi Harada
Re: [HACKERS] GIN improvements part2: fast scan
On 19.06.2013 11:56, Alexander Korotkov wrote: On Wed, Jun 19, 2013 at 12:49 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 19.06.2013 11:30, Alexander Korotkov wrote: On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 18.06.2013 23:59, Alexander Korotkov wrote: I would like to illustrate that on example. Imagine you have fulltext query rare_termfrequent_term. Frequent term has large posting tree while rare term has only small posting list containing iptr1, iptr2 and iptr3. At first we get iptr1 from posting list of rare term, then we would like to check whether we have to scan part of frequent term posting tree where iptr iptr1. So we call pre_consistent([false, true]), because we know that rare term is not present for iptriptr2. pre_consistent returns false. So we can start scanning frequent term posting tree from iptr1. Similarly we can skip lags between iptr1 and iptr2, iptr2 and iptr3, from iptr3 to maximum possible pointer. Thanks, now I understand the rare-term frequent-term problem. Couldn't you do that with the existing consistent function? I don't see why you need the new pre-consistent function for this. In the case of two entries I can. But in the case of n entries things becomes more complicated. Imagine you have term_1 term_2 ... term_n query. When you get some item pointer from term_1 you can skip all the lesser item pointers from term_2, term_3 ... term_n. But if all you have for it is consistent function you have to call it with following check arguments: 1) [false, false, false, ... , false] 2) [false, true, false, ... , false] 3) [false, false, true, ... , false] 4) [false, true, true, ..., false] .. i.e. you have to call it 2^(n-1) times. But if you know the query specific (i.e. in opclass) it's typically easy to calculate exactly what we need in single pass. That's why I introduced pre_consistent. Hmm. So how does that work with the pre-consistent function? Don't you need to call that 2^(n-1)-1 times as well? I call pre-consistent once with [false, true, true, ..., true]. Pre-consistent knows that each true passed to it could be false positive. So, if it returns false it guarantees that consistent will be false for all possible combinations. Ok, I see. I spent some time pondering this. I'd like to find a way to do something about this without requiring another user-defined function. A couple of observations: 1. The profile of that rare-term frequent-term quest, without any patch, looks like this: 28,55% postgres ginCompareItemPointers 19,36% postgres keyGetItem 15,20% postgres scanGetItem 7,75% postgres checkcondition_gin 6,25% postgres gin_tsquery_consistent 4,34% postgres TS_execute 3,85% postgres callConsistentFn 3,64% postgres FunctionCall8Coll 3,19% postgres check_stack_depth 2,60% postgres entryGetNextItem 1,35% postgres entryGetItem 1,25% postgres MemoryContextReset 1,12% postgres MemoryContextSwitchTo 0,31% libc-2.17.so __memcpy_ssse3_back 0,24% postgres collectMatchesForHeapRow I was quite surprised by seeing ginCompareItemPointers at the top. It turns out that it's relatively expensive to do comparisons in the format we keep item pointers, packed in 6 bytes, in 3 int16s. I hacked together a patch to convert ItemPointers into uint64s, when dealing with them in memory. That helped quite a bit. Another important thing in the above profile is that calling consistent-function is taking up a lot of resources. And in the example test case you gave, it's called with the same arguments every time. Caching the result of consistent-function would be a big win. I wrote a quick patch to do that caching, and together with the itempointer hack, I was able to halve the runtime of that test case. That's impressive, we probably should pursue that low-hanging fruit, but it's still slower than your fast scan patch by a factor of 100x. So clearly we do need an algorithmic improvement here, along the lines of your patch, or something similar. 2. There's one trick we could do even without the pre-consistent function, that would help the particular test case you gave. Suppose that you have two search terms A and B. If you have just called consistent on a row that matched term A, but not term B, you can skip any subsequent rows in the scan that match A but not B. That means that you can skip over to the next row that matches B. This is essentially the same thing you do with the pre-consistent function, it's just a degenerate case of it. That helps as long as the search contains only one frequent term, but if it contains multiple, then you have to still stop at every row that matches more than one of the frequent terms. 3. I'm still not totally convinced that we shouldn't just build the
Re: Review [was Re: [HACKERS] MD5 aggregate]
On Fri, Jun 21, 2013 at 10:48:35AM -0700, David Fetter wrote: On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: On 15 June 2013 10:22, Dean Rasheed dean.a.rash...@gmail.com wrote: There seem to be 2 separate directions that this could go, which really meet different requirements: 1). Produce an unordered sum for SQL to compare 2 tables regardless of the order in which they are scanned. A possible approach to this might be something like an aggregate md5_total(text/bytea) returns text that returns the sum of the md5 values of each input value, treating each md5 value as an unsigned 128-bit integer, and then producing the hexadecimal representation of the final sum. This should out-perform a solution based on numeric addition, and in typical cases, the result wouldn't be much longer than a regular md5 sum, and so would be easy to eyeball for differences. I've been playing around with the idea of an aggregate that computes the sum of the md5 hashes of each of its inputs, which I've called md5_total() for now, although I'm not particularly wedded to that name. Comparing it with md5_agg() on a 100M row table (see attached test script) produces interesting results: SELECT md5_agg(foo.*::text) FROM (SELECT * FROM foo ORDER BY id) foo; 50bc42127fb9b028c9708248f835ed8f Time: 92960.021 ms SELECT md5_total(foo.*::text) FROM foo; 02faea7fafee4d253fc94cfae031afc43c03479c Time: 96190.343 ms Unlike md5_agg(), it is no longer a true MD5 sum (for one thing, its result is longer) but it seems like it would be very useful for quickly comparing data in SQL, since its value is not dependent on the row-order making it easier to use and better performing if there is no usable index for ordering. Note, however, that if there is an index that can be used for ordering, the performance is not necessarily better than md5_agg(), as this example shows. There is a small additional overhead per row for initialising the MD5 sums, and adding the results to the total, but I think the biggest factor is that md5_total() is processing more data. The reason is that MD5 works on 64-byte blocks, so the total amount of data going through the core MD5 algorithm is each row's size is rounded up to a multiple of 64. In this simple case it ends up processing around 1.5 times as much data: SELECT sum(length(foo.*::text)) AS md5_agg, sum(((length(foo.*::text)+63)/64)*64) AS md5_total FROM foo; md5_agg | md5_total +- 8103815438 | 12799909248 although of course that overhead won't be as large on wider tables, and even in this case the overall performance is still on a par with md5_agg(). ISTM that both aggregates are potentially useful in different situations. I would probably typically use md5_total() because of its simplicity/order-independence and consistent performance, but md5_agg() might also be useful when comparing with external data. Regards, Dean Performance review (skills needed: ability to time performance) Does the patch slow down simple tests? Yes. For an MD5 checksum of some random data, here are results from master: shackle@postgres:5493=# WITH t1 AS (SELECT string_agg(chr(floor(255*random()+1)::int),'') AS a FROM generate_series(1,1)), postgres-# t2 AS (SELECT a FROM t1 CROSS JOIN generate_series(1,1)) postgres-# select md5(a::text) FROM t2; shackle@postgres:5493=# \timing Timing is on. shackle@postgres:5493=# \g Time: 955.393 ms shackle@postgres:5493=# \g Time: 950.909 ms shackle@postgres:5493=# \g Time: 947.955 ms shackle@postgres:5493=# \g Time: 946.193 ms shackle@postgres:5493=# \g Time: 950.591 ms shackle@postgres:5493=# \g Time: 952.346 ms shackle@postgres:5493=# \g Time: 948.623 ms shackle@postgres:5493=# \g Time: 939.819 ms and here from master + the patch: WITH t1 AS (SELECT string_agg(chr(floor(255*random()+1)::int),'') AS a FROM generate_series(1,1)), t2 AS (SELECT a FROM t1 CROSS JOIN generate_series(1,1)) select md5(a::text) FROM t2; Time: 1129.178 ms shackle@postgres:5494=# \g Time: 1134.013 ms shackle@postgres:5494=# \g Time: 1124.387 ms shackle@postgres:5494=# \g Time: 1119.733 ms shackle@postgres:5494=# \g Time: 1104.906 ms shackle@postgres:5494=# \g Time: 1137.055 ms shackle@postgres:5494=# \g Time: 1128.977 ms shackle@postgres:5494=# \g
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
On Fri, Jun 21, 2013 at 2:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: More generally, what do we think the point is of sending SIGQUIT rather than SIGKILL in the first place, and why does that point cease to be valid after 5 seconds? Well, mostly it's about telling the client we're committing hara-kiri. Without that, there's no very good reason to run quickdie() at all. That's what I thought, too. It seems to me that if we think that's important, then it's important even if it takes more than 5 seconds for some reason. A practical issue with starting to send SIGKILL ourselves is that we will no longer be able to reflexively diagnose server process died on signal 9 as the linux OOM killer got you. I'm not at all convinced that the cases where SIGQUIT doesn't work are sufficiently common to justify losing that property. I'm not, either. Maybe this question will provoke many indignant responses, but who in their right mind even uses immediate shutdown on a production server with any regularity? The shutdown checkpoint is sometimes painfully long, but do you really want to run recovery just to avoid it? And in the rare case where an immediate shutdown fails to work, what's wrong will killall -9 postgres? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hardware donation
Who can be point of contact from the community to arrange shipping, etc? I can be. And I'll handle the tax credit once the servers are received. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Andres Freund escribió: On 2013-06-20 22:36:45 -0400, Alvaro Herrera wrote: If we leave postmaster running after SIGKILLing its children, the only thing we can do is have it continue to SIGKILL processes continuously every few seconds until they die (or just sit around doing nothing until they all die). I don't think this will have a different effect than postmaster going away trusting the first SIGKILL to do its job eventually. I think it should just wait till all its child processes are dead. No need to repeat sending the signals - as you say, that won't help. OK, I can buy that. So postmaster stays around waiting in ServerLoop until all children are gone; and if any persists for whatever reason, well, tough. What we could do to improve the robustness a bit - at least on linux - is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be killed if the postmaster goes away... This is an interesting idea (even if it has no relationship to the patch at hand). -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
The case where I wanted routine shutdown immediate (and I'm not sure I ever actually got it) was when we were using IBM HA/CMP, where I wanted a terminate with a fair bit of prejudice. If we know we want to switch right away now, immediate seemed pretty much right. I was fine with interrupting user sessions, and there wasn't as much going on in the way of system background stuff back then. I wasn't keen on waiting on much of anything. The background writer ought to be keeping things from being too desperately out of date. If there's stuff worth waiting a few seconds for, I'm all ears. But if I have to wait arbitrarily long, colour me unhappy. If I have to distinguish, myself, between a checkpoint nearly done flushing and a backend that's stuck waiting forlornly for filesystem access, I'm inclined to kill -9 and hope recovery doesn't take *too* long on the next node... If shutting a server down in an emergency situation requires a DBA to look in, as opposed to init.d doing its thing, I think that's pretty much the same problem too.
[HACKERS] Fixed Cardinality estimation with equality predicates between column of the same table
Hi all, I see a strange behavior ( for me ) on 9.2 (but seems the same on 9.1 and 9.3) of the optimizer on query like that : /* create a table with random data and 2 rows */ create table test1 ( id int not null primary key, state1 int not null default 0, state2 int not null default 0, state3 int not null default 0 ); insert into test1 (id, state1, state2, state3) select i, (random()*3)::int, (random())::int, (random()*100)::int from generate_series (1, 2) as gs(i) ; analyze test1 ; /* between same columns */ explain select * from test1 where state1=state1 ; QUERY PLAN -- Seq Scan on test1 (cost=0.00..359.00 rows=100 width=16) Filter: (state1 = state1) (2 rows) test3=# explain select * from test1 where state2=state2 ; QUERY PLAN -- Seq Scan on test1 (cost=0.00..359.00 rows=100 width=16) Filter: (state2 = state2) (2 rows) /* between different columns of same table */ test3=# explain select * from test1 where state1=state2 ; QUERY PLAN -- Seq Scan on test1 (cost=0.00..359.00 rows=100 width=16) Filter: (state1 = state2) (2 rows) === /* create a table with random data and 10 rows to verify */ create table test2 ( id int not null primary key, state1 int not null default 0, state2 int not null default 0, state3 int not null default 0 ); insert into test2 (id, state1, state2, state3) select i, (random()*3)::int, (random())::int, (random()*100)::int from generate_series (1, 10) as gs(i) ; test3=# analyze test2 ; ANALYZE test3=# explain select * from test2 where state1=state3;QUERY PLAN --- Seq Scan on test2 (cost=0.00..1791.00 rows=500 width=16) Filter: (state1 = state3) (2 rows) test3=# explain select * from test2 where state1=state2; QUERY PLAN --- Seq Scan on test2 (cost=0.00..1791.00 rows=500 width=16) Filter: (state1 = state2) (2 rows) test3=# explain select * from test2 where state1=state1; QUERY PLAN --- Seq Scan on test2 (cost=0.00..1791.00 rows=500 width=16) Filter: (state1 = state1) (2 rows) It's seems always 0.5% of the rows , and it seems indipendent of the type of data you have in row : /*add a column where costant value named c3 */ alter table test1 add c3 int default 1 ; ALTER TABLE analyze test1 ; ANALYZE explain select * from test1 where state1=c3; QUERY PLAN -- Seq Scan on test1 (cost=0.00..378.00 rows=100 width=20) Filter: (state1 = c3) (2 rows) /*add a column where costant value named c3 */ alter table test2 add c3 int default 1 ; ALTER TABLE analyze test2 ; ANALYZE explain select * from test2 where state1=c3; QUERY PLAN --- Seq Scan on test2 (cost=0.00..1887.00 rows=500 width=20) Filter: (state1 = c3) (2 rows) /* add another constant column */ test3=# alter table test2 add c4 int default 1 ; ALTER TABLE test3=# analyze test2 ; ANALYZE test3=# explain select * from test2 where c3=c4 ; QUERY PLAN --- Seq Scan on test2 (cost=0.00..1887.00 rows=500 width=24) Filter: (c3 = c4) obviously the statistics are ok : Always 0.5%. Greetings Matteo
Re: [HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Alvaro Herrera alvhe...@2ndquadrant.com writes: Andres Freund escribió: What we could do to improve the robustness a bit - at least on linux - is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be killed if the postmaster goes away... This is an interesting idea (even if it has no relationship to the patch at hand). The traditional theory has been that that would be less robust, not more so. Child backends are (mostly) able to carry out queries whether or not the postmaster is around. True, you can't make new connections, but how does killing the existing children make that better? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unaccent performance
On 21 June 2013 19:04, Thom Brown t...@linux.com wrote: Hi, The unaccent extension is great, especially with its customisability, but it's not always easy to recommend. I witnessed a customer using no less than 56 nested replace functions in an SQL function. I looked to see how much this can be mitigated by unaccent. It turns out that not all the characters they were replacing can be replaced by unaccent, either because they replace more than 1 character at a time, or the character they're replacing, for some reason, isn't processed by unaccent, even with a custom rules file. So there were 20 characters I could identify that they were replacing. I made a custom rules file and compared its performance to the difficult-to-manage set of nested replace calls. CREATE OR REPLACE FUNCTION public.myunaccent(sometext text) RETURNS text LANGUAGE sql IMMUTABLE AS $function$ SELECT replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(sometext,'ą','a'),'Ą','A'),'ă','a'),'Ă','A'),'ā','a'),'Ā','A'),'æ','a'),'å','a'),'ä','a'),'ã','a'),'â','a'),'á','a'),'à','a'),'Æ','A'),'Å','A'),'Ä','A'),'Ã','A'),'Â','A'),'Á','A'),'À','A') ; $function$ postgres=# SELECT myunaccent(sometext::text) FROM (SELECT 'ÀÁÂÃÄÅÆàáâãäåæĀāĂ㥹' sometext FROM generate_series(1,100)) x OFFSET 99 LIMIT 1; myunaccent -- AAAaaaAaAaAa (1 row) Time: 726.282 ms postgres=# SELECT unaccent(sometext::text) FROM (SELECT 'ÀÁÂÃÄÅÆàáâãäåæĀāĂ㥹' sometext FROM generate_series(1,100)) x OFFSET 99 LIMIT 1; unaccent -- AAAaaaAaAaAa (1 row) Time: 3305.252 ms The timings are actually pretty much the same even if I introduce 187 nested replace calls for every line in the unaccent.rules file for 187 characters. But the same character set with unaccent increases to 7418.526 ms with the same type of query as above. That's 10 times more expensive. Is there a way to boost the performance to make its adoption more palatable? Another test passing in a string of 10 characters gives the following timings: unaccent: 240619.395 ms myunaccent: 785.505 ms I guess this must indicate that unaccent is processing all rows, and myunaccent is only being run on the 1 select row? I can't account for myunaccent always being almost the same duration regardless of string length otherwise. This is probably an incorrect assessment of performance. Another test inserting long text strings into a text column of a table 100,000 times, then updating another column to have that unaccented value using both methods: unaccent: 3867.306 ms myunaccent: 43611.732 ms So I guess this complaint about performance is all just noise. However, pushing that pointless complaint to one side, I would like to have the ability to have unaccent support more characters that it doesn't currently seem to support, such as bullet points, ellipses etc., and also more than 1 character being replaced. Naturally these aren't appropriate to fall under the unaccent function itself, but the rules file is good starting point. It would be a bit like translate, except it would use a rules file instead of providing strings of single characters to convert. So say we wanted (trademark) to be converted into ™ just as an example, or ; to .. We can't do that with unaccent, but in order to avoid a huge list of replace functions, a function like unaccent, with a few adaptations, would solve the problem. e.g.: SELECT transform(my_custom_dictionary, 'Commodore Amiga(trademark);') would return Commodore Amiga™. This would ideally somehow cater for replacing tabs and spaces too. -- Thom
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
Hello all I've been examining PostgreSQL to gain a greater understanding of RDBMS. (Thanks for a nice, very educational system!) In the process I've been looking into a few problems and the complications of this patch appeared relatively uninvolved, so I tried to look for a solution. I found the following: The grammar conflict appears to be because of ambiguities in: 1. table_ref (used exclusively in FROM clauses) 2. index_elem (used exclusively in INDEX creation statements). Now, this doesn't seem to make much sense, as AFAICT window functions are explicitly disallowed in these contexts (transformWindowFuncCall will yield errors, and I can't really wrap my head around what a window function call would mean there). I therefore propose a simple rearrangement of the grammar, syntactically disallowing window functions in the outer part of those contexts (a_expr's inside can't and shouldn't be done much about) which will allow both RESPECT and IGNORE to become unreserved keywords, without doing any lexer hacking or abusing the grammar. I've attached a patch which will add RESPECT NULLS and IGNORE NULLS to the grammar in the right place. Also the window frame options are set but nothing more, so this patch needs to be merged with Nicholas White's original patch. One problem I see with this approach to the grammar is that the error messages will change when putting window functions in any of the forbidden places. The new error messages are I think worse and less specific than the old ones. I suppose that can be fixed though, or maybe the problem isn't so severe. Example of old error message: window functions are not allowed in functions in FROM New error message: syntax error at or near OVER in addition I think the original patch as it stands has a few problems that I haven't seen discussed: 1. The result of the current patch using lead create table test_table ( id serial, val integer); insert into test_table (val) select * from unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]); select val, lead(val, 2) ignore nulls over (order by id) from test_table; val | lead -+-- 1 |3 2 |4 3 |4 4 |4 |4 |5 |6 5 |7 6 |7 7 |7 (10 rows) I would expect it to output: select val, lead(val, 2) ignore nulls over (order by id) from test_table; val | lead -+-- 1 |3 2 |4 3 |5 4 |6 |6 |6 |6 5 |7 6 | 7 | (10 rows) That is: skip two rows forward not counting null rows. The lag behavior works well as far as I can see. 2. It would be nice if an error was given when ignore nulls was used on a function for which it had no effect. Perhaps this should be up to the different window function themselves to do though. Apart from those points I think the original patch is nice and provides a functionality that's definitely nice to have. Kind Regards Troels Nielsen On Fri, Jun 21, 2013 at 8:11 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jun 21, 2013 at 11:33 AM, Jeff Davis pg...@j-davis.com wrote: Regardless of what syntax we settle on, we should also make sure that the conflict is intrinsic to the grammar and can't be factored out, as Tom suggested upthread. It's not obvious to me what the actual ambiguity is here. If you've seen select lag(num,0) and the lookahead token is respect, what's the problem? It sort of looks like it could be a column label, but not even unreserved keywords can be column labels, so that's not it. Probably deserves a bit more investigation... I think the problem is when the function is used as a table function; e.g.: SELECT * FROM generate_series(1,10) respect; Ah ha. Well, there's probably not much help for that. Disallowing keywords as table aliases would be a cure worse than the disease, I think. I suppose the good news is that there probably aren't many people using RESPECT as a column name, but I have a feeling that we're almost certain to get complaints about reserving IGNORE. I think that will have to be quoted as a PL/pgsql variable name as well. :-( We could just add additional, optional Boolean argument to the existing functions. It's non-standard, but we avoid adding keywords. I thought about that, but it is awkward because the argument would have to be constant (or, if not, it would be quite strange). True... but e.g. string_agg() has the same issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers respect_nulls_and_ignore_nulls_grammar.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixed Cardinality estimation with equality predicates between column of the same table
On 06/21/2013 02:33 PM, desmodemone wrote: Hi all, I see a strange behavior ( for me ) on 9.2 (but seems the same on 9.1 and 9.3) of the optimizer on query like that : Matteo, I just posted this on -performance. See Tom's answer. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
On Fri, Jun 21, 2013 at 5:44 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: Andres Freund escribió: What we could do to improve the robustness a bit - at least on linux - is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be killed if the postmaster goes away... This is an interesting idea (even if it has no relationship to the patch at hand). The traditional theory has been that that would be less robust, not more so. Child backends are (mostly) able to carry out queries whether or not the postmaster is around. I think that's the Tom Lane theory. The Robert Haas theory is that if the postmaster has died, there's no reason to suppose that it hasn't corrupted shared memory on the way down, or that the system isn't otherwise heavily fuxxored in some way. True, you can't make new connections, but how does killing the existing children make that better? It allows you to start a new postmaster in a timely fashion, instead of waiting for an idle connection that may not ever terminate without operator intervention. Even if it were possible to start a new postmaster that attached to the existing shared memory segment and began spawning new children, I think I'd be heavily in favor of killing the old ones off first and doing a full system reset just for safety. But it isn't, so what you get is a crippled system that never recovers without operator intervention. And note that I'm not talking about pg_ctl restart; that fails because the children have the shmem segment still attached and the postmaster, which is the only thing listed in the PID file, is already dead. I'm talking about killall -9 postgres, at least. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Implementing incremental backup
On 20/06/2013 03:25, Tatsuo Ishii wrote: On Wed, Jun 19, 2013 at 8:40 PM, Tatsuo Ishii is...@postgresql.org wrote: On Wed, Jun 19, 2013 at 6:20 PM, Stephen Frost sfr...@snowman.net wrote: * Claudio Freire (klaussfre...@gmail.com) wrote: [...] The only bottleneck here, is WAL archiving. This assumes you can afford WAL archiving at least to a local filesystem, and that the WAL compressor is able to cope with WAL bandwidth. But I have no reason to think you'd be able to cope with dirty-map updates anyway if you were saturating the WAL compressor, as the compressor is more efficient on amortized cost per transaction than the dirty-map approach. Thank you for detailed explanation. I will think more about this. Just for the record, I was mulling over this idea since a bunch of month. I even talked about that with Dimitri Fontaine some weeks ago with some beers :) My idea came from a customer during a training explaining me the difference between differential and incremental backup in Oracle. My approach would have been to create a standalone tool (say pg_walaggregate) which takes a bunch of WAL from archives and merge them in a single big file, keeping only the very last version of each page after aggregating all their changes. The resulting file, aggregating all the changes from given WAL files would be the differential backup. A differential backup resulting from a bunch of WAL between W1 and Wn would help to recover much faster to the time of Wn than replaying all the WALs between W1 and Wn and saves a lot of space. I was hoping to find some time to dig around this idea, but as the subject rose here, then here are my 2¢! Cheers, -- Jehan-Guillaume (ioguix) de Rorthais -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
OK let's finalize this patch first. I'll try to send an updated patch within today. On Fri, Jun 21, 2013 at 10:47 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-06-21 20:54:34 +0900, Michael Paquier wrote: On Fri, Jun 21, 2013 at 6:19 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-06-19 09:55:24 +0900, Michael Paquier wrote: @@ -1529,12 +1570,13 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap, Is it actually possible to get here with multiple toast indexes? Actually it is possible. finish_heap_swap is also called for example in ALTER TABLE where rewriting the table (phase 3), so I think it is better to protect this code path this way. But why would we copy invalid toast indexes over to the new relation? Shouldn't the new relation have been freshly built in the previous steps? What do you think about that? Using only the first valid index would be enough? What I am thinking about is the following: When we rewrite a relation, we build a completely new toast relation. Which will only have one index, right? So I don't see how this could could be correct if we deal with multiple indexes. In fact, the current patch's swap_relation_files throws an error if there are multiple ones around. Yes, OK. Let me have a look at the code of CLUSTER more in details before giving a precise answer, but I'll try to remove that renaming part. Btw, I'd like to add an assertion in the code at least to prevent wrong use of this code path. diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h index 8ac2549..31309ed 100644 --- a/src/include/utils/relcache.h +++ b/src/include/utils/relcache.h @@ -29,6 +29,16 @@ typedef struct RelationData *Relation; typedef Relation *RelationPtr; /* + * RelationGetIndexListIfValid + * Get index list of relation without recomputing it. + */ +#define RelationGetIndexListIfValid(rel) \ +do { \ + if (rel-rd_indexvalid == 0) \ + RelationGetIndexList(rel); \ +} while(0) Isn't this function misnamed and should be RelationGetIndexListIfInValid? When naming that; I had more in mind: get the list of indexes if it is already there. It looks more intuitive to my mind. I can't follow. RelationGetIndexListIfValid() doesn't return anything. And it doesn't do anything if the list is already valid. It only does something iff the list currently is invalid. In this case RelationGetIndexListIfInvalid? Yep. Suggested that above ;). Maybe RelationFetchIndexListIfInvalid()? Hm. Looking at how this is currently used - I am afraid it's not correct... the reason RelationGetIndexList() returns a copy is that cache invalidations will throw away that list. And you do index_open() while iterating over it which will accept invalidation messages. Mybe it's better to try using RelationGetIndexList directly and measure whether that has a measurable impact= Yes, I was wondering about potential memory leak that list_copy could introduce in tuptoaster.c when doing a bulk insert, that's the only reason why I added this macro. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] problem with commitfest redirection
When ever I try to see the patch from this commit it never loads: https://commitfest.postgresql.org/action/patch_view?id=1129 Some problem there? I can see other patches, from other commits. -- Martín Marquéshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
From: Robert Haas robertmh...@gmail.com On Thu, Jun 20, 2013 at 12:33 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I will go with 5 seconds, then. I'm uncomfortable with this whole concept, and particularly with such a short timeout. On a very busy system, things can take a LOT longer than they think we should; it can take 30 seconds or more just to get a prompt back from a shell command. 5 seconds is the blink of an eye. I'm comfortable with 5 seconds. We are talking about the interval between sending SIGQUIT to the children and then sending SIGKILL to them. In most situations, the backends should terminate immediately. However, as I said a few months ago, ereport() call in quickdie() can deadlock indefinitely. This is a PostgreSQL's bug. In addition, Tom san was concerned that some PLs (PL/Perl or PL/Python?) block SIGQUIT while executing the UDF, so they may not be able to respond to the immediate shutdown request. What DBAs want from pg_ctl stop -mi is to shutdown the database server as immediately as possible. So I think 5 second is reasonable. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
From: Robert Haas robertmh...@gmail.com On Fri, Jun 21, 2013 at 2:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: More generally, what do we think the point is of sending SIGQUIT rather than SIGKILL in the first place, and why does that point cease to be valid after 5 seconds? Well, mostly it's about telling the client we're committing hara-kiri. Without that, there's no very good reason to run quickdie() at all. That's what I thought, too. It seems to me that if we think that's important, then it's important even if it takes more than 5 seconds for some reason. I guess Tom san is saying we should be as kind as possible to the client, so try to notify the client of the shutdown. Not complete kindness. Because the DBA requested immediate shutdown by running pg_ctl stop -mi, the top priority is to shutdown the database server as immediately as possible. The idea here is to try to be friendly to the client as long as the DBA can stand. That's tthe 5 second. A practical issue with starting to send SIGKILL ourselves is that we will no longer be able to reflexively diagnose server process died on signal 9 as the linux OOM killer got you. I'm not at all convinced that the cases where SIGQUIT doesn't work are sufficiently common to justify losing that property. I'm not, either. Maybe this question will provoke many indignant responses, but who in their right mind even uses immediate shutdown on a production server with any regularity? The shutdown checkpoint is sometimes painfully long, but do you really want to run recovery just to avoid it? And in the rare case where an immediate shutdown fails to work, what's wrong will killall -9 postgres? Checkpoint is irrelevant here because we are discussing immediate shutdown. Some problems with killall -9 postgres are: 1. It's not available on Windows. 2. It may kill other database server instances running on the same machine. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] problem with commitfest redirection
On Fri, Jun 21, 2013 at 8:56 PM, Martín Marqués mar...@2ndquadrant.com wrote: When ever I try to see the patch from this commit it never loads: https://commitfest.postgresql.org/action/patch_view?id=1129 Some problem there? I can see other patches, from other commits. Yes, the URL is wrong. right URL is http://www.postgresql.org/message-id/CAFjNrYuh=4Vwnv=2n7cj0jjuwc4hool1epxsoflj6s19u02...@mail.gmail.com -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] problem with commitfest redirection
El 21/06/13 23:47, Jaime Casanova escribió: On Fri, Jun 21, 2013 at 8:56 PM, Martín Marqués mar...@2ndquadrant.com wrote: When ever I try to see the patch from this commit it never loads: https://commitfest.postgresql.org/action/patch_view?id=1129 Some problem there? I can see other patches, from other commits. Yes, the URL is wrong. right URL is http://www.postgresql.org/message-id/CAFjNrYuh=4Vwnv=2n7cj0jjuwc4hool1epxsoflj6s19u02...@mail.gmail.com Yes, found out that later. Could there be other URL's like that? -- Martín Marquéshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Robert Haas robertmh...@gmail.com writes: On Fri, Jun 21, 2013 at 5:44 PM, Tom Lane t...@sss.pgh.pa.us wrote: The traditional theory has been that that would be less robust, not more so. Child backends are (mostly) able to carry out queries whether or not the postmaster is around. I think that's the Tom Lane theory. The Robert Haas theory is that if the postmaster has died, there's no reason to suppose that it hasn't corrupted shared memory on the way down, or that the system isn't otherwise heavily fuxxored in some way. Eh? The postmaster does its level best never to touch shared memory (after initialization anyway). True, you can't make new connections, but how does killing the existing children make that better? It allows you to start a new postmaster in a timely fashion, instead of waiting for an idle connection that may not ever terminate without operator intervention. There may be something in that argument, but I find the other one completely bogus. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
On 06/22/2013 03:30 AM, ian link wrote: Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Consider RANGE 4.5 PRECEDING'. You need to be able to test whether, for the current row 'b', any given row 'a' is within the range (b - 4.5) a = b . Not 100% sure about the vs = boundaries, but that's irrelevant for the example. To test that, you have to be able to do two things: you have to be able to test whether one value is greater than another, and you have to be able to add or subtract a constant from one of the values. Right now, the b-tree access method provides information on the ordering operators = = = , which provides half the answer. But these don't give any concept of *distance* - you can test ordinality but not cardinality. To implement the different by 4.5 part, you have to be able to add 4.5 to one value or subtract it from the other. The obvious way to do that is to look up the function that implements the '+' or '-' operator, and do: ((OPERATOR(+))(a, 4.5)) b AND (a = b) or ((OPERATOR(-))(b, 4.5)) a AND (a = b); The problem outlined by Tom in prior discussion about this is that PostgreSQL tries really hard not to assume that particular operator names mean particular things. Rather than knowing that + is always an operator that adds two values together; is transitive, symmetric and reflexive, PostgreSQL requires that you define an *operator class* that names the operator that has those properties. Or at least, it does for less-than, less-than-or-equals, equals, greater-than-or-equals, greater-than, and not-equals as part of the b-tree operator class, which *usually* defines these operators as = = = , but you could use any operator names you wanted if you really liked. Right now (as far as I know) there's no operator class that lets you identify operators for addition and subtraction in a similar way. So it's necessary to either add such an operator class (in which case support has to be added for it for every type), extend the existing b-tree operator class to provide the info, or blindly assume that + and - are always addition and subtraction. For an example of why such assumptions are a bad idea, consider matrix multiplication. Normally, a * b = b * a, but this isn't true for multiplication of matrices. Similarly, if someone defined a + operator as an alias for string concatenation (||), we'd be totally wrong to assume we could use that for doing range-offset windowing. So. Yeah. Operator classes required, unless we're going to change the rules and make certain operator names special in PostgreSQL, so that if you implement them they *must* have certain properties. This seems like a pretty poor reason to add such a big change. I hope this explanation (a) is actually correct and (b) is helpful. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Fri, Jun 21, 2013 at 10:47 PM, Andres Freund and...@2ndquadrant.com wrote: Hm. Looking at how this is currently used - I am afraid it's not correct... the reason RelationGetIndexList() returns a copy is that cache invalidations will throw away that list. And you do index_open() while iterating over it which will accept invalidation messages. Mybe it's better to try using RelationGetIndexList directly and measure whether that has a measurable impact= By looking at the comments of RelationGetIndexList:relcache.c, actually the method of the patch is correct because in the event of a shared cache invalidation, rd_indexvalid is set to 0 when the index list is reset, so the index list would get recomputed even in the case of shared mem invalidation. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Friday, June 21, 2013 11:48 PM Robert Haas wrote: On Thu, Jun 20, 2013 at 12:18 AM, Amit Kapila amit.kap...@huawei.com wrote: Auto.conf- 1 Vote (Josh) System.auto.conf - 1 Vote (Josh) Postgresql.auto.conf - 2 Votes (Zoltan, Amit) Persistent.auto.conf - 0 Vote generated_by_server.conf - 1 Vote (Peter E) System.conf - 1 Vote (Magnus) alter_system.conf- 1 Vote (Alvaro) I had consolidated the list, so that it would be helpful for committer to decide among proposed names for this file. I'll also vote for postgresql.auto.conf. Thanks to all of you for suggesting meaningful names. I will change the name of file to postgresql.auto.conf. Kindly let me know if there is any objection to it. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]
On Friday, June 21, 2013 11:43 PM Robert Haas wrote: On Fri, Jun 21, 2013 at 12:56 PM, Fujii Masao masao.fu...@gmail.com wrote: On Sat, Jun 22, 2013 at 12:11 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 19, 2013 at 1:59 PM, Peter Eisentraut pete...@gmx.net wrote: On 6/7/13 12:14 AM, Amit Kapila wrote: I will change the patch as per below syntax if there are no objections: ALTER SYSTEM SET configuration_parameter {TO | =} {value, | 'value'}; I do like using ALTER SYSTEM in general, but now that I think about it, the 9.3 feature to create global settings in pg_db_role_setting would also have been a candidate for exactly that same syntax if it had been available. In fact, if we do add ALTER SYSTEM, it might make sense to recast that feature into that syntax. It might be clearer to do something like ALTER SYSTEM SET EXTERNAL FILE or something like that. It's only a small syntax change, so don't worry about it too much, but let's keep thinking about it. I think that anything we want to add should either go before SET or after the value, such as: ALTER SYSTEM SET configuration_parameter = 'value' IN FILE 'myconf.conf'; ALTER SYSTEM CONFIGURATION FILE SET configuration_parameter = 'file'; ALTER SYSTEM CONFIGURATION FILE 'myconf.conf' SET configuration_parameter = 'file'; I agree we shouldn't back ourselves into a syntactic corner. Maybe this idea has been already discussed, but why don't we just add the function like update_config_file(parameter, value)? We can commit the core of the patch with that function API first, and then we can discuss the syntax and add another API like ALTER SYSTEM later. We now have set_config() function to set the parameter, so adding the function for this feature should not be a surprise. If the name of the function is, for example, update_conf_file, most users would think that it's intuitive even if SIGHUP is not automatically sent immediately. We don't need to emit the message like 'This setting change will not be loaded until SIGHUP'. I could certainly support that plan. I'm satisfied with the ALTER SYSTEM syntax and feel that might find other plausible uses, but functional notation would be OK with me, too. I can update the patch to have a function as suggested by Fujii-san if it can be decided that for the first committed version it will be sufficient. OTOH already we already have consensus on ALTER SYSTEM syntax apart from few extra keywords to make it more meaningful/extendible. I think we can consider the current syntax (ALTER SYSTEM SET ..) and make that behavior as default for writing to auto file. In future we can extend it with other keywords depending on usage. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers