Re: [HACKERS] pg_dump --serializable-deferrable
Peter Eisentraut wrote: The name of this new option is a bit of a mouthful, and it mixes in an otherwise standardized term (deferrable, as in constraints) with transaction isolation. Wouldn't something like --wait-for-serializable be clearer (and shorter)?

I see it's not mentioned in the description of the pg_dump option, but the option name is based on the new (PostgreSQL-specific) DEFERRABLE transaction property which is used when the option is specified. See if it makes more sense after reading this page: http://developer.postgresql.org/pgdocs/postgres/sql-set-transaction.html

Personally, I think DEFERRABLE is a good name for the transaction property; it conveys the right semantics and avoids adding a new reserved word. The question of what to name it was first raised almost eight months ago, and it has been discussed many times since. http://search.postgresql.org/search?q=kevin+serializable+deferrable&m=1&l=1&d=365&s=d

I'm less concerned about the pg_dump name; if you think something else is clearer, that's fine, although this one isn't the longest pg_dump option we have. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
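For readers following the SET TRANSACTION link, a minimal sketch of the transaction mode the pg_dump option maps to (syntax per that page; the dump's actual queries are elided):

```sql
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
-- The DEFERRABLE property makes the transaction wait until it can start
-- with a snapshot guaranteed free of serialization anomalies; after that
-- it can read without taking SSI predicate locks or risking conflicts.
-- ... dump queries would run here ...
COMMIT;
```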
Re: [HACKERS] the big picture for index-only scans
On Wed, May 11, 2011 at 12:54 AM, Greg Stark gsst...@mit.edu wrote: On a separate note though, Simon, I don't know what you mean by we normally start with a problem. It's a free software project and people are free to work on whatever interests them, whether that's because it solves a problem they have, helps a client who's paying them, or just because it's of academic interest to them. We don't always take their patches if they aren't of general interest, but people propose all kinds of crazy experimental ideas all the time.

Completely agree, but why are you saying that to me? When Tom asks me why I suggest something, nobody tells him it's a free software project, etc. What is the difference here?

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Feature proposal: distinguish each PostgreSQL instance in the event log
On Tue, May 10, 2011 at 11:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: MauMau maumau...@gmail.com writes: I've encountered one problem on Windows. I need to support running all of my products on one host simultaneously. Plus, I need to log messages in syslog/event log. On Linux, I can distinguish the messages of one product from those of other products by setting syslog_ident in postgresql.conf. On the other hand, I can't do that on Windows: all of the PostgreSQL instances use the same fixed event source name PostgreSQL. So I'd like to propose a trivial feature which allows users to set the event source. BTW, what will this accomplish exactly that couldn't be accomplished by setting log_line_prefix to include the desired identifier?

Windows uses the event source field to show where events in the logs have come from. The Event Viewer and other network management tools expect to be able to use the field for sorting and filtering etc., so having a different value for different PG instances allows those tools to distinguish between them properly. It's also useful for ISVs who may want to make an embedded PG instance used by their apps identify itself differently from other instances.

-- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
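For reference, the Linux-side behaviour MauMau describes comes from a per-instance setting in postgresql.conf; a hedged sketch (the identifier names here are invented for illustration):

```
# Instance A's postgresql.conf (Linux) -- each instance gets its own identifier:
log_destination = 'syslog'
syslog_ident = 'pg_product_a'   # appears as the program name in syslog entries

# Instance B's postgresql.conf:
log_destination = 'syslog'
syslog_ident = 'pg_product_b'
```

The proposal is to add an analogous knob for the Windows event source name, so each instance registers its own source rather than the fixed "PostgreSQL".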
Re: [HACKERS] Foreign table permissions and cloning
(2011/04/26 5:42), Robert Haas wrote: OK. Turned out a little more cleanup was needed to make this all the way consistent with how we handle views; I have now done that.

I noticed that some fixes would be needed for consistency about foreign table privileges. The attached patch includes the fixes below:

1) psql doesn't complete FOREIGN TABLE after GRANT/REVOKE.
2) pg_dump uses GRANT .. ON TABLE for foreign tables, instead of ON FOREIGN TABLE.
3) GRANT document mentions that ALL TABLES includes foreign tables too.
4) Rows of information_schema.foreign_tables/foreign_table_options are visible to users who have any privileges on the foreign table.

Regards,
-- Shigeru Hanada

diff --git a/doc/src/sgml/ref/grant.sgml b/doc/src/sgml/ref/grant.sgml
index 93e8332..689aba5 100644
*** a/doc/src/sgml/ref/grant.sgml
--- b/doc/src/sgml/ref/grant.sgml
*** 101,107 ****
  There is also an option to grant privileges on all objects of the same type within one or more schemas. This functionality is currently supported only for tables, sequences, and functions (but note that ALL
! TABLES is considered to include views).
--- 101,107 ----
  There is also an option to grant privileges on all objects of the same type within one or more schemas. This functionality is currently supported only for tables, sequences, and functions (but note that ALL
! TABLES is considered to include views and foreign tables).

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index c623fb7..452a0ea 100644
*** a/src/backend/catalog/information_schema.sql
--- b/src/backend/catalog/information_schema.sql
*** CREATE VIEW _pg_foreign_tables AS
*** 2557,2564 ****
  WHERE w.oid = s.srvfdw AND u.oid = c.relowner
    AND (pg_has_role(c.relowner, 'USAGE')
!      OR has_table_privilege(c.oid, 'SELECT')
!      OR has_any_column_privilege(c.oid, 'SELECT'))
    AND n.oid = c.relnamespace AND c.oid = t.ftrelid AND c.relkind = 'f'
--- 2557,2564 ----
  WHERE w.oid = s.srvfdw AND u.oid = c.relowner
    AND (pg_has_role(c.relowner, 'USAGE')
!      OR has_table_privilege(c.oid, 'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER')
!      OR has_any_column_privilege(c.oid, 'SELECT, INSERT, UPDATE, REFERENCES'))
    AND n.oid = c.relnamespace AND c.oid = t.ftrelid AND c.relkind = 'f'

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 30366d2..e474a69 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*** dumpTable(Archive *fout, TableInfo *tbinfo)
*** 11804,11810 ****
  namecopy = strdup(fmtId(tbinfo->dobj.name));
  dumpACL(fout, tbinfo->dobj.catId, tbinfo->dobj.dumpId,
  (tbinfo->relkind == RELKIND_SEQUENCE) ? "SEQUENCE" :
- (tbinfo->relkind == RELKIND_FOREIGN_TABLE) ? "FOREIGN TABLE" : "TABLE",
  namecopy, NULL, tbinfo->dobj.name,
  tbinfo->dobj.namespace->dobj.name, tbinfo->rolname,
--- 11804,11809 ----

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 3ef2fa4..02f4aea 100644
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*** psql_completion(char *text, int start, i
*** 2234,2240 ****
  UNION SELECT 'DATABASE'
  UNION SELECT 'FOREIGN DATA WRAPPER'
  UNION SELECT 'FOREIGN SERVER'
- UNION SELECT 'FOREIGN TABLE'
  UNION SELECT 'FUNCTION'
  UNION SELECT 'LANGUAGE'
  UNION SELECT 'LARGE OBJECT'
--- 2234,2239 ----
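As a quick illustration of the information_schema change above: has_table_privilege() (and has_any_column_privilege()) accept a comma-separated list of privilege types, and to my understanding the test succeeds if any one of the listed privileges is held. A sketch with hypothetical object names:

```sql
-- 'some_user' and 'some_foreign_table' are placeholders for illustration.
-- True if the user holds ANY of the listed privileges on the relation:
SELECT has_table_privilege('some_user', 'some_foreign_table',
       'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER');
```

That is why the patched view lists every table-level privilege rather than just SELECT: a row becomes visible to anyone holding any privilege on the foreign table.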
Re: [HACKERS] the big picture for index-only scans
On Wed, May 11, 2011 at 1:47 AM, Bruce Momjian br...@momjian.us wrote: Greg Stark wrote: On a separate note though, Simon, I don't know what you mean by we normally start with a problem. It's a free software project and people are free to work on whatever interests them, whether that's because it solves a problem they have, helps a client who's paying them, or just because it's of academic interest to them. We don't always take their patches if they aren't of general interest, but people propose all kinds of crazy experimental ideas all the time. I am confused by Simon's questions too. Simon seems to regularly argue for adding features late in the development cycle and backpatch things no one else thinks should be backpatched, but he wants more research showing that index-only scans are going to improve things before they are implemented? The first is aggressive development, the second is very conservative development --- they don't match, so I now wonder what the motivation is since it isn't consistent.

Not really sure why reasonable technical skepticism should become personal commentary. You don't question Tom's motives if he is skeptical of an idea of mine. Why would you question my motivation? What is *your* motive for acting like that?

I'm not driven by one setting of conservatism, but I am interested in adding fully usable features that bring credit to the project. If I see a feature that can have minor things added to it to improve it, then I raise that during beta. If I see things being worked out that sound dubious, I mention that in early development.

I don't think this work will materially improve the speed of count(*) in the majority of cases. This introduces extra overhead into the code path and that can be a net loss. The only time it will help is when you have a large table that is not cached and also not recently updated. Is count(*) run very often against such tables? Do we really care enough to optimise that use case with lots of special purpose code?
The very fact that Kevin and yourself bring up different reasons for why we need this feature makes me nervous. The analysis has not been done yet, and all I have done is request that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] the big picture for index-only scans
On Wed, May 11, 2011 at 2:34 AM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: So, what do we need in order to find our way to index-only scans? 1. The visibility map needs to be crash-safe. The basic idea of index-only scans is that, instead of checking the heap to find out whether each tuple is visible, we first check the visibility map. If the visibility map bit is set, then we know all tuples on the page are visible to all transactions, and therefore the tuple of interest is visible to our transaction. Assuming that a significant number of visibility map bits are set, this should enable us to avoid a fair amount of I/O, especially on large tables, because the visibility map is roughly 8000 times smaller than the heap, and therefore far more practical to keep in cache. However, before we can rely on the

FYI, because the visibility map is only one _bit_ per page, it is 8000 * 8 or 64k times smaller than the heap, e.g. one 8k page covers 64MB of heap pages. This is important because we rely on this compactness in hope that the WAL logging of this information will not be burdensome. We would need to issue one WAL record per bit, not per page.

I'm concerned about the path length added by VM visits and the potential contention that concentration of information will bring. Those aren't things to be dismissed without calculation and analysis. There might not be an issue there, but it's worth checking.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] potential bug in trigger with boolean params
Hi, I was trying to create a trigger with parameters. I've found a potential bug when the param is boolean. Here is code replicating the bug:

CREATE TABLE x(x TEXT);

CREATE OR REPLACE FUNCTION trigger_x() RETURNS TRIGGER AS $$
BEGIN
    RETURN NEW;
END;
$$ LANGUAGE PLPGSQL;

CREATE TRIGGER trig_x_text BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('text');
CREATE TRIGGER trig_x_int BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(10);
CREATE TRIGGER trig_x_float BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(42.0);
CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true);
ERROR: syntax error at or near "true"
LINE 1: ... INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true);

I've already checked that on: 'PostgreSQL 9.0.1, compiled by Visual C++ build 1500, 32-bit' and 'PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit'.

If this is intended behavior, then the documentation doesn't say anything about that. The only information is that: TG_ARGV[] Data type array of text; the arguments from the CREATE TRIGGER statement. But the below line throws the same error:

CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x( true::text );

while this obviously works:

SELECT true::text;

and this works as well:

CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('true');

regards Szymon
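A possible workaround, building on the report's last (working) form: since trigger arguments always arrive as text in TG_ARGV[], pass the value as a string literal and cast it inside the function. This is a sketch; the variable name is illustrative:

```sql
CREATE OR REPLACE FUNCTION trigger_x() RETURNS TRIGGER AS $$
DECLARE
    flag boolean;
BEGIN
    flag := TG_ARGV[0]::boolean;  -- TG_ARGV[] elements are always text
    -- ... use flag ...
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trig_x_bool BEFORE INSERT ON x
    FOR EACH ROW EXECUTE PROCEDURE trigger_x('true');
```

The cast accepts the usual boolean literals ('true', 't', 'on', '1', ...), so the restriction is only in the CREATE TRIGGER grammar, not in what the function can recover.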
Re: [HACKERS] Process wakeups when idle and power consumption
On 9 May 2011 11:19, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: In the child, spawn a thread.

How exactly should I go about this? The one place in the code that I knew to use multiple threads, pgbench, falls back on emulation with fork() on some platforms. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] the big picture for index-only scans
On 10.05.2011 20:15, Simon Riggs wrote: On Tue, May 10, 2011 at 5:17 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Simon Riggs si...@2ndquadrant.com wrote: This topic has been discussed many times, yet I have never seen an assessment that explains WHY we would want to do index-only scans. In databases with this feature, it's not too unusual for a query which uses just an index to run one or more orders of magnitude faster than a query which has to randomly access the heap for each index entry. That seems like enough evidence of its possible value in PostgreSQL to proceed to the point where benchmarks become possible. I'm assuming that, like all other features added as performance optimizations, it won't be committed until there are benchmarks showing the net benefit. As a thought experiment, picture the relative costs of scanning a portion of an index in index sequence, and being done, versus scanning a portion of an index in index sequence and jumping to a random heap access for each index entry as you go. I can picture that. Regrettably, I can also picture the accesses to the visibility map, the maintenance operations on the VM that are needed for this and the contention that both of those will cause.

Note that we already have the visibility map, and the accesses needed to update it are already there. Granted, we'll have to change the logic slightly to make it crash safe, but I don't expect that to add any meaningful overhead - the changes are going to be where the bits are set, i.e. vacuum, not when the bits are cleared. Granted, we might also want to set the bits more aggressively once they're used by index-only-scans. But done correctly, just taking advantage of the VM that's already there shouldn't add overhead to other operations.

I agree that we need to do tests to demonstrate that there's a gain from the patch, once we have a patch to test. I would be very surprised if there isn't, but that just means the testing is going to be easy.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Process wakeups when idle and power consumption
On Wed, May 11, 2011 at 10:52, Peter Geoghegan pe...@2ndquadrant.com wrote: On 9 May 2011 11:19, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: In the child, spawn a thread. How exactly should I go about this? The one place in the code that I knew to use multiple threads, pgbench, falls back on emulation with fork() on some platforms.

If you're doing this Win32-specific, take a look at src/backend/port/win32/signal.c for an example. If you're not doing this win32-specific, I doubt we really want threads to be involved... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] potential bug in trigger with boolean params
Hi, I was trying to create a trigger with parameters. I've found a potential bug when the param is boolean. Here is code replicating the bug: CREATE TABLE x(x TEXT); CREATE OR REPLACE FUNCTION trigger_x() RETURNS TRIGGER AS $$ BEGIN RETURN NEW; END; $$ LANGUAGE PLPGSQL; CREATE TRIGGER trig_x_text BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('text'); CREATE TRIGGER trig_x_int BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(10); CREATE TRIGGER trig_x_float BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(42.0); CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); ERROR: syntax error at or near true LINE 1: ... INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); The docs clearly state what the valid values are and the literal 'true' is not one of them (TRUE is). See this: http://www.postgresql.org/docs/9.0/interactive/datatype-boolean.html regards Tomas
Re: [HACKERS] potential bug in trigger with boolean params
On 11 May 2011 10:56, t...@fuzzy.cz wrote: Hi, I was trying to create a trigger with parameters. I've found a potential bug when the param is boolean. Here is code replicating the bug: CREATE TABLE x(x TEXT); CREATE OR REPLACE FUNCTION trigger_x() RETURNS TRIGGER AS $$ BEGIN RETURN NEW; END; $$ LANGUAGE PLPGSQL; CREATE TRIGGER trig_x_text BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('text'); CREATE TRIGGER trig_x_int BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(10); CREATE TRIGGER trig_x_float BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(42.0); CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); ERROR: syntax error at or near true LINE 1: ... INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); The docs clearly state what the valid values are and the literal 'true' is not one of them (TRUE is). See this: http://www.postgresql.org/docs/9.0/interactive/datatype-boolean.html regards Tomas

Well... no. In the link you've provided there is something different: Valid literal values for the true state are: TRUE, 't', 'true', 'y', 'yes', 'on', '1' -- so I could use 'true'... and this doesn't work. And SQL is not case sensitive... but I will check it for you anyway:

CREATE TRIGGER trig_x_2 BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(TRUE);
ERROR: syntax error at or near "TRUE"
LINE 1: ... INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(TRUE);

regards Szymon
Re: [HACKERS] potential bug in trigger with boolean params
På onsdag 11. mai 2011 kl 10:56:19 skrev t...@fuzzy.cz: Hi, I was trying to create a trigger with parameters. I've found a potential bug when the param is boolean. Here is code replicating the bug: CREATE TABLE x(x TEXT); CREATE OR REPLACE FUNCTION trigger_x() RETURNS TRIGGER AS $$ BEGIN RETURN NEW; END; $$ LANGUAGE PLPGSQL; CREATE TRIGGER trig_x_text BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('text'); CREATE TRIGGER trig_x_int BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(10); CREATE TRIGGER trig_x_float BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(42.0); CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); ERROR: syntax error at or near true LINE 1: ... INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); The docs clearly state what the valid values are and the literal 'true' is not one of them (TRUE is). See this: http://www.postgresql.org/docs/9.0/interactive/datatype-boolean.html

What are you trying to accomplish? CREATE OR REPLACE FUNCTION trigger_x() does not declare any formal-parameters, so calling it with arguments doesn't make sense. I'm surprised creating the other triggers didn't produce an error stating No function defined with the name trigger_ix and the given argument-type.

-- Andreas Joseph Krogh andr...@officenet.no, Senior Software Developer / CTO
Public key: http://home.officenet.no/~andreak/public_key.asc
OfficeNet AS, Rosenholmveien 25, 1414 Trollåsen, NORWAY. Org.nr: NO 981 479 076
Tlf: +47 24 15 38 90, Fax: +47 24 15 38 91, Mobile: +47 909 56 963
"The most difficult thing in the world is to know how to do a thing and to watch somebody else doing it wrong, without comment."
Re: [HACKERS] potential bug in trigger with boolean params
On Wednesday, May 11, 2011 11:01:56 AM Andreas Joseph Krogh wrote: På onsdag 11. mai 2011 kl 10:56:19 skrev t...@fuzzy.cz: CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); The docs clearly state what the valid values are and the literal 'true' is not one of them (TRUE is). See this: http://www.postgresql.org/docs/9.0/interactive/datatype-boolean.html What are you trying to accomplish? CREATE OR REPLACE FUNCTION trigger_x() does not declare any formal-parameters, so calling it with arguments doesn't make sense. I'm surprised creating the other triggers didn't produce an error stating No function defined with the name trigger_ix and the given argument-type.

Read the docs. Parameters for triggers are not passed as normal function parameters. That's why you access them via TG_ARGV in plpgsql. The grammar accepts only a very limited set of parameters there:
Re: [HACKERS] time-delayed standbys
On 07.05.2011 16:48, Robert Haas wrote: I was able to reproduce something very like this in unpatched master, just by letting recovery pause at a named restore point, and then resuming it.

LOG: recovery stopping at restore point "stop", time 2011-05-07 09:28:01.652958-04
LOG: recovery has paused
HINT: Execute pg_xlog_replay_resume() to continue.
(at this point I did pg_xlog_replay_resume())
LOG: redo done at 0/520
PANIC: wal receiver still active
LOG: startup process (PID 38762) was terminated by signal 6: Abort trap
LOG: terminating any other active server processes

I'm thinking that this code is wrong:

if (recoveryPauseAtTarget && standbyState == STANDBY_SNAPSHOT_READY)
{
    SetRecoveryPause(true);
    recoveryPausesHere();
}
reachedStopPoint = true;    /* see below */
recoveryContinue = false;

I think that recoveryContinue = false assignment should not happen if we decide to pause. That is, we should say:

if (recoveryPauseAtTarget && standbyState == STANDBY_SNAPSHOT_READY)
{
    /* same as now */
}
else
    recoveryContinue = false;

No, recovery stops at that point whether or not you pause. Resuming after stopping at the recovery target doesn't mean that you resume recovery; it means that you resume to end recovery and start up the server (see the 2nd to last paragraph at http://www.postgresql.org/docs/9.1/static/recovery-target-settings.html). It would probably be more useful to allow a new stopping target to be set and continue recovery, but the current pause/resume functions don't allow that.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
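For context, a hedged sketch of the recovery.conf that reproduces the scenario described above (the archive path is illustrative; recovery_target_name and pause_at_recovery_target are the 9.1-era setting names):

```
restore_command = 'cp /path/to/archive/%f %p'
recovery_target_name = 'stop'       # restore point created with pg_create_restore_point('stop')
pause_at_recovery_target = 'true'   # pause at the target; operator resumes with pg_xlog_replay_resume()
```

With this configuration, replay halts at the named restore point, and calling pg_xlog_replay_resume() ends recovery and starts the server rather than continuing replay, which is the semantics Heikki points out.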
Re: [HACKERS] potential bug in trigger with boolean params
On Wednesday, May 11, 2011 11:21:34 AM Andres Freund wrote: On Wednesday, May 11, 2011 11:01:56 AM Andreas Joseph Krogh wrote: På onsdag 11. mai 2011 kl 10:56:19 skrev t...@fuzzy.cz: CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); The docs clearly state what the valid values are and the literal 'true' is not one of them (TRUE is). See this: http://www.postgresql.org/docs/9.0/interactive/datatype-boolean.html What are you trying to accomplish? CREATE OR REPLACE FUNCTION trigger_x() does not declare any formal-parameters, so calling it with arguments doesn't make sense. I'm surprised creating the other triggers didn't produce an error stating No function defined with the name trigger_ix and the given argument-type. Read the docs. Parameters for triggers are not passed as normal function parameters. That's why you access them via TG_ARGV in plpgsql. The grammar accepts only a very limited set of parameters there:

Err...

TriggerFuncArg:
    Iconst
        {
            char buf[64];

            snprintf(buf, sizeof(buf), "%d", $1);
            $$ = makeString(pstrdup(buf));
        }
    | FCONST    { $$ = makeString($1); }
    | Sconst    { $$ = makeString($1); }
    | BCONST    { $$ = makeString($1); }
    | XCONST    { $$ = makeString($1); }
    | ColId     { $$ = makeString($1); }

That is: integers, floating point numbers, strings, bit strings, hex strings and column references (???). How that exact list came to exist I do not know.

Andres
Re: [HACKERS] potential bug in trigger with boolean params
On 11 May 2011 11:01, Andreas Joseph Krogh andr...@officenet.no wrote: På onsdag 11. mai 2011 kl 10:56:19 skrev t...@fuzzy.cz: Hi, I was trying to create a trigger with parameters. I've found a potential bug when the param is boolean. Here is code replicating the bug: CREATE TABLE x(x TEXT); CREATE OR REPLACE FUNCTION trigger_x() RETURNS TRIGGER AS $$ BEGIN RETURN NEW; END; $$ LANGUAGE PLPGSQL; CREATE TRIGGER trig_x_text BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('text'); CREATE TRIGGER trig_x_int BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(10); CREATE TRIGGER trig_x_float BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(42.0); CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); ERROR: syntax error at or near true LINE 1: ... INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); The docs clearly state what the valid values are and the literal 'true' is not one of them (TRUE is). See this: http://www.postgresql.org/docs/9.0/interactive/datatype-boolean.html What are you trying to accomplish? CREATE OR REPLACE FUNCTION trigger_x() does not declare any formal-parameters, so calling it with arguments doesn't make sense. I'm surprised creating the other triggers didn't produce an error stating No function defined with the name trigger_ix and the given argument-type.

That's how you define a trigger function. Later you can use the params when defining the trigger. regards Szymon
[HACKERS] Re: 9.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays
On Mon, May 09, 2011 at 11:32:28PM -0400, Tom Lane wrote: To be concrete, consider the function array_append(anyarray, anyelement) yielding anyarray. Suppose we have a domain D over int[] and the call array_append(var_of_type_D, 42). If we automatically downcast the variable to int[], should the result of the function be considered to be of type D, or type int[]? This isn't a trivial distinction because choosing to consider it of type D means we have to re-check D's domain constraints, which might or might not be satisfied by the modified array. Previous releases considered the result to be of type D, *without* rechecking the domain constraints, which was flat out wrong. So we basically had three alternatives to make it better: * downcast to the array type, which would possibly silently break applications that were relying on the function result being considered of the domain type * re-apply domain checks on the function result, which would be a performance hit and possibly again result in unobvious breakage * explicitly break it by throwing a parse error until you downcast (and then upcast the function result if you want) I realize that #3 is a bit unpleasant, but are either of the other two better? At least #3 shows you where you need to check for problems. Though I've never used a domain over an array type, I'd strongly prefer #2. As best I can tell, the only functions that could need to do this are those that have an anyarray argument and also an anyarray return type. In core, that's array_append, array_prepend, array_cat, array_larger, array_smaller, max, and min. The last four always return an input unchanged, so only the first three would actually check anything. I'm not seeing, offhand, any need to add new validation to PLs. The parse-time breakage of #3 is a nice tool during your upgrade QA, but the long-term semantics are considerably worse for wear. I see nothing to recommend #1. 
There is another issue that wasn't really mentioned in the previous thread, which is that if we are matching a domain-over-array to a function's ANYARRAY argument, what exactly should be allowed to match to ANYELEMENT --- or if the function returns ANYELEMENT, what should the imputed result type be? What else but the type seen by subscripting a datum of the ANYARRAY type? AFAICS it's impossible to give an answer to that without effectively deciding that function argument matching smashes the domain to its base type (the array type). The domain and its base type share an element type, but how does that necessitate any further removal of domain-ness? It's not very clear what's the point of a domain type if every operation on it is going to neglect its domain-ness. Agreed. nm
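To make the three alternatives concrete, a sketch with a hypothetical domain (the name and constraint are invented for illustration):

```sql
CREATE DOMAIN int_triple AS int[] CHECK (array_length(VALUE, 1) <= 3);

-- Given a value v of type int_triple currently holding '{1,2,3}',
-- consider: array_append(v, 42)
--
--   option #1: result is plain int[]; the domain-ness silently disappears.
--   option #2: result stays int_triple and the CHECK is re-evaluated, so
--              this call would raise a domain violation (four elements).
--   option #3: the call itself is a parse error until v is explicitly
--              cast: array_append(v::int[], 42).
```

Under option #2, which the poster favors, only functions with both an anyarray argument and an anyarray result (array_append, array_prepend, array_cat in core) would ever need to re-check anything.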
Re: [HACKERS] the big picture for index-only scans
2011/5/11, Bruce Momjian br...@momjian.us: FYI, because the visibility map is only one _bit_ per page, it is 8000 * 8 or 64k times smaller than the heap, e.g. one 8k page covers 64MB of heap pages.

Actually, that would be one 8kB block covers 512MB of heap: 1 block of visibility map (8kB) = 64k visibility bits = covers 64k blocks = covers 512MB of heap. The cost of keeping the visibility map in cache is therefore totally negligible; only the cost of WAL logging changes to it is of interest.

This is important because we rely on this compactness in hope that the WAL logging of this information will not be burdensome. The size of an entry in the map (1 bit) is not very related to the WAL overhead required per change of such a bit (i.e., the log record for a 1-bit change will certainly be way more than 1 bit).

Nicolas -- A. Because it breaks the logical sequence of discussion. Q. Why is top posting bad?
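Nicolas's figure can be double-checked with a quick bit of arithmetic (assuming the default 8 kB block size):

```sql
-- 8192 bytes/block * 8 bits/byte = 65536 visibility bits per VM block,
-- i.e. 65536 heap blocks tracked; 65536 * 8192 bytes = 512 MB of heap.
SELECT (8192 * 8) * 8192 / (1024 * 1024) AS heap_mb_per_vm_block;  -- 512
```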
Re: [HACKERS] potential bug in trigger with boolean params
På onsdag 11. mai 2011 kl 11:30:51 skrev Szymon Guz mabew...@gmail.com: On 11 May 2011 11:01, Andreas Joseph Krogh andr...@officenet.no wrote: På onsdag 11. mai 2011 kl 10:56:19 skrev t...@fuzzy.cz: Hi, I was trying to create a trigger with parameters. I've found a potential bug when the param is boolean. Here is code replicating the bug: CREATE TABLE x(x TEXT); CREATE OR REPLACE FUNCTION trigger_x() RETURNS TRIGGER AS $$ BEGIN RETURN NEW; END; $$ LANGUAGE PLPGSQL; CREATE TRIGGER trig_x_text BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('text'); CREATE TRIGGER trig_x_int BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(10); CREATE TRIGGER trig_x_float BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(42.0); CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); ERROR: syntax error at or near true LINE 1: ... INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); The docs clearly state what the valid values are and the literal 'true' is not one of them (TRUE is). See this: http://www.postgresql.org/docs/9.0/interactive/datatype-boolean.html What are you trying to accomplish? CREATE OR REPLACE FUNCTION trigger_x() does not declare any formal-parameters, so calling it with arguments doesn't make sense. I'm surprised creating the other triggers didn't produce an error stating No function defined with the name trigger_ix and the given argument-type. That's how you define trigger function. Later you can use params when defining trigger.

Pardon my ignorance:-)

-- Andreas Joseph Krogh andr...@officenet.no, Senior Software Developer / CTO
Re: [HACKERS] time-delayed standbys
On 11.05.2011 08:29, Fujii Masao wrote: On Sat, May 7, 2011 at 10:48 PM, Robert Haas robertmh...@gmail.com wrote: I was able to reproduce something very like this in unpatched master, just by letting recovery pause at a named restore point, and then resuming it. I was able to reproduce the same problem even in 9.0. When the standby reaches the recovery target, it always tries to end the recovery even though walreceiver is still running, which causes the problem. This seems to be an oversight in streaming replication. I should have considered how the standby should work when recovery_target is specified. What about the attached patch? It stops walreceiver, instead of emitting a PANIC there, only if we've reached the recovery target. I think we can just always call ShutdownWalRcv(). It should be gone if the server was promoted while streaming, but that's just an implementation detail of what the promotion code does. There's no hard reason why it shouldn't be running at that point anymore, as long as we kill it before going any further. Committed a patch to do that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] the big picture for index-only scans
2011/5/11 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: On 10.05.2011 20:15, Simon Riggs wrote: On Tue, May 10, 2011 at 5:17 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Simon Riggs si...@2ndquadrant.com wrote: This topic has been discussed many times, yet I have never seen an assessment that explains WHY we would want to do index-only scans. In databases with this feature, it's not too unusual for a query which uses just an index to run one or more orders of magnitude faster than a query which has to randomly access the heap for each index entry. That seems like enough evidence of its possible value in PostgreSQL to proceed to the point where benchmarks become possible. I'm assuming that, like all other features added as performance optimizations, it won't be committed until there are benchmarks showing the net benefit. As a thought experiment, picture the relative costs of scanning a portion of an index in index sequence, and being done, versus scanning a portion of an index in index sequence and jumping to a random heap access for each index entry as you go. I can picture that. Regrettably, I can also picture the accesses to the visibility map, the maintenance operations on the VM that are needed for this and the contention that both of those will cause. Note that we already have the visibility map, and the accesses needed to update it are already there. Granted, we'll have to change the logic slightly to make it crash safe, but I don't expect that to add any meaningful overhead - the changes are going to be where the bits are set, ie. vacuum, not when the bits are cleared. Granted, we might also want to set the bits more aggressively once they're used by index-only-scans. But done correctly, just taking advantage of the VM that's already there shouldn't add overhead to other operations. We won't be able to do a true index-only scan. We can do an index scan with some probability of not accessing the heap, maybe (hopefully) avoiding it completely in some cases.
If the visibility map is good enough to remove the need to access the heap (performance-wise and crash-safe), then, for the cost part: currently, cost_index materializes the cost to access each heap page as a random_page_cost. I believe we should be able to change that to subtract the estimated number of heap pages we don't need to access (which can be 100% or 0.1%). And, as Simon suggested, there may also be a path to improve the bitmap heap scan. Bitmap heap scans already have some workarounds to be sure index scans look cheaper in some cases; just keep that and apply the same logic as for cost_index. This keeps the same rule PostgreSQL has: let the planner decide the best solution instead of allowing special index declarations (that hasn't been proposed yet, I think, but, well, just in case it pops into the mind of someone). I agree that we need to do tests to demonstrate that there's a gain from the patch, once we have a patch to test. I would be very surprised if there isn't, but that just means the testing is going to be easy. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
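The costing change Cédric sketches above can be illustrated with a toy model. This is not the actual cost_index() code; the function name, the all_visible_frac statistic, and the default random_page_cost value here are illustrative assumptions only:

```python
def index_scan_heap_cost(heap_pages_fetched, all_visible_frac,
                         random_page_cost=4.0):
    """Toy model: discount heap fetches the visibility map lets us skip.

    all_visible_frac is a hypothetical planner estimate of the fraction
    of heap pages whose all-visible bit is set; with it, only the
    remaining pages are charged a random_page_cost each.
    """
    pages_actually_fetched = heap_pages_fetched * (1.0 - all_visible_frac)
    return pages_actually_fetched * random_page_cost

# With 90% of pages all-visible, 1000 candidate heap fetches cost about
# a tenth of what they would cost with an empty visibility map.
print(index_scan_heap_cost(1000, 0.9))
print(index_scan_heap_cost(1000, 0.0))
```

The same discount could, as suggested, be applied to the bitmap heap scan cost so the planner keeps choosing between paths on equal footing.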
Re: [HACKERS] potential bug in trigger with boolean params
On Wednesday, May 11, 2011 11:50:35 AM Szymon Guz wrote: On 11 May 2011 11:29, Andres Freund and...@anarazel.de wrote: On Wednesday, May 11, 2011 11:21:34 AM Andres Freund wrote: On Wednesday, May 11, 2011 11:01:56 AM Andreas Joseph Krogh wrote: On Wednesday 11 May 2011 at 10:56:19, t...@fuzzy.cz wrote: CREATE TRIGGER trig_x_bool BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(true); The docs clearly state what the valid values are and the literal 'true' is not one of them (TRUE is). See this: http://www.postgresql.org/docs/9.0/interactive/datatype-boolean.html What are you trying to accomplish? CREATE OR REPLACE FUNCTION trigger_x() does not declare any formal parameters, so calling it with arguments doesn't make sense. I'm surprised creating the other triggers didn't produce an error stating "No function defined with the name trigger_x and the given argument type". Read the docs. Parameters for triggers are not passed as normal function parameters. That's why you access them via TG_ARGV in plpgsql. The grammar accepts only a very limited set of parameters there. Err:

TriggerFuncArg:
    Iconst
        {
            char buf[64];
            snprintf(buf, sizeof(buf), "%d", $1);
            $$ = makeString(pstrdup(buf));
        }
    | FCONST { $$ = makeString($1); }
    | Sconst { $$ = makeString($1); }
    | BCONST { $$ = makeString($1); }
    | XCONST { $$ = makeString($1); }
    | ColId  { $$ = makeString($1); }

That is: integers, floating point, strings, bitstrings, hexstrings and column references (???). How that exact list came to exist I do not know. My two thoughts on that: 1. This list should be improved to allow booleans, and maybe other types. 2. Why then is it like this: it works: CREATE TRIGGER trig_x_10 BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('true'); it does not: CREATE TRIGGER trig_x_11 BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('true'::text); the error is: ERROR: syntax error at or near :: I think there is something wrong. The grammar doesn't allow any form of expression.
It only allows the above listed types of literals directly. I am not really sure why it was done that way, but it's been that way for a long time (only insignificant changes since 1997... bitstrings and hex strings were added after that, though). Why do you want to use a boolean directly if you can't use it as the type itself anyway? Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] the big picture for index-only scans
2011/5/10 Kevin Grittner kevin.gritt...@wicourts.gov: Simon Riggs si...@2ndquadrant.com wrote: The typical speed up for non-covered indexes will come when we access a very large table (not in cache) via an index scan that is smaller than a bitmapindex scan. Will we be able to gauge selectivities sufficiently accurately to be able to pinpoint that during optimization? How will we know that the table is not in cache? Or is this an optimisation in the executor for a bitmapheap scan? I would continue to object to using current cache contents for plan choice because of plan instability and the fact that an odd initial cache load could skew plans in a bad direction indefinitely. I do agree (and have already posted) that I think the hardest part of this might be developing a good cost model. I doubt that's an insoluble problem, especially since it is something we can refine over time as we gain experience with the edge cases. You will have the same possible instability in planning with the index(-only?) scan, because we may need to access the heap anyway and that need is based on estimation, or am I missing something? I understood the idea was just to bypass the heap access *if* we can for *this* heap page. In reality, I am not really scared by plan instability caused by a possible PG/OS cache estimation. The percentages remain stable in my observations ... I don't know yet how it will go for the visibility map. And we already have plan instability currently, which is *good*: at some point a seq scan is better than a bitmap heap scan, because the relation size changes and because ANALYZE re-estimates the distribution of the data. I would be very happy to issue ANALYZE CACHE, as I have to ANALYZE temp tables for large queries, if it allows the planner to provide me the best plan in some scenarios... but this is another topic, sorry for the digression.
-- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] potential bug in trigger with boolean params
On 11 May 2011 12:06, Andres Freund and...@anarazel.de wrote: Why do you want to use a boolean directly if you can't use it as the type itself anyway? Yep, and this is a really good point :) I wanted to have a consistent API, so I used true when I have a boolean value. I will use 'true' and add some info on that to the procedure documentation. regards Szymon
Re: [HACKERS] Process wakeups when idle and power consumption
On 11 May 2011 09:54, Magnus Hagander mag...@hagander.net wrote: If you're doing this Win32 specific, take a look at src/backend/port/win32/signal.c for an example. If you're not doing this win32-specific, I doubt we really want threads to be involved... Well, that seems to be the traditional wisdom. It seems sensible to me that each process should look out for postmaster death itself though. Tom described potential race conditions in looking at ps output...do we really want to double the number of auxiliary processes in a single release of Postgres? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Process wakeups when idle and power consumption
On 11.05.2011 13:34, Peter Geoghegan wrote: On 11 May 2011 09:54, Magnus Hagander mag...@hagander.net wrote: If you're doing this Win32 specific, take a look at src/backend/port/win32/signal.c for an example. If you're not doing this win32-specific, I doubt we really want threads to be involved... Well, that seems to be the traditional wisdom. It seems sensible to me that each process should look out for postmaster death itself though. Tom described potential race conditions in looking at ps output...do we really want to double the number of auxiliary processes in a single release of Postgres? Uh, no, you don't want any new processes on Unix. You want each process to check for postmaster death every once in a while, like they do today. The pipe trick is to make sure the processes wake up promptly to notice the death when the postmaster dies. You just need to add the postmaster pipe to the select() calls we already do. I'm not sure whether on Windows you can similarly just add the postmaster pipe to the WaitForMultipleObjects() calls we already do. Then you won't need new threads on Windows either. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
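The pipe trick Heikki describes can be shown in a few lines. This is a toy, single-process Python sketch (not the C implementation); it simulates postmaster death by closing the write end of a pipe, and shows why a child sleeping in select() wakes up promptly: the read end becomes readable, with read() returning EOF, as soon as the other end is gone.

```python
import os
import select

# The postmaster would hold the write end open and never write to it;
# every child includes the read end in its select() sets.
read_fd, write_fd = os.pipe()

# While the write end is open and empty, select() sees nothing: the
# child's select() would simply time out and continue its normal loop.
readable, _, _ = select.select([read_fd], [], [], 0)
assert readable == []

os.close(write_fd)  # simulate postmaster death closing its end

# Now the read end is immediately readable, and read() yields EOF (b""),
# which is the child's cue that the postmaster is gone.
readable, _, _ = select.select([read_fd], [], [], 0)
assert readable == [read_fd]
eof = os.read(read_fd, 1)
assert eof == b""
os.close(read_fd)
print("postmaster death detected")
```

On Windows the analogous move, as the message notes, would be adding a comparable handle to the existing WaitForMultipleObjects() calls rather than spawning watcher threads.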
Re: [HACKERS] the big picture for index-only scans
On Tue, May 10, 2011 at 9:34 PM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: So, what do we need in order to find our way to index-only scans? 1. The visibility map needs to be crash-safe. The basic idea of index-only scans is that, instead of checking the heap to find out whether each tuple is visible, we first check the visibility map. If the visibility map bit is set, then we know all tuples on the page are visible to all transactions, and therefore the tuple of interest is visible to our transaction. Assuming that a significant number of visibility map bits are set, this should enable us to avoid a fair amount of I/O, especially on large tables, because the visibility map is roughly 8000 times smaller than the heap, and therefore far more practical to keep in cache. However, before we can rely on the FYI, because the visibility map is only one _bit_ per page, it is 8000 * 8 or 64k times smaller than the heap, e.g. one 8k page covers 64MB of heap pages. This is important because we rely on this compactness in hope that the WAL logging of this information will not be burdensome. I accuse you of bad math. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] the big picture for index-only scans
On Tue, May 10, 2011 at 10:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: That will be true only if you intentionally ignore the points Greg raised. If the table isn't entirely ALL_VISIBLE, then the choice of index will determine the ordering of the actual table probes that occur. There could be more or fewer page reads, in a more or less optimal order, depending on the index used. However, note that this wasn't one of the cases I said I was going to try to optimize in the first go-around anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] time-delayed standbys
On Wed, May 11, 2011 at 6:50 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I think we can just always call ShutdownWalRcv(). It should be gone if the server was promoted while streaming, but that's just an implementation detail of what the promotion code does. There's no hard reason why it shouldn't be running at that point anymore, as long as we kill it before going any further. Okay. But I'd like to add the following assertion check just before ShutdownWalRcv() which you added, in order to detect such a bug that we found this time, i.e., the bug which causes unexpected end of recovery. Thought? Assert(reachedStopPoint || !WalRcvInProgress()) Committed a patch to do that. Thanks. Should we backport it to 9.0? 9.0 has the same problem. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] the big picture for index-only scans
On Wed, May 11, 2011 at 3:17 AM, Simon Riggs si...@2ndquadrant.com wrote: Completely agree, but why are you saying that to me? When Tom asks me why I suggest something, nobody tells him its a free software project etc What is the difference here? We're now 40 emails in this thread, and there seems to be far more heat than light here. Here's an attempt at a summary: - Simon wants proof that the performance benefit of this feature is worth any downsides it may have, which is standard procedure, and isn't certain the feature will have a significant performance benefit. - Numerous other people think Simon's doubts about the feature are poorly justified (and some of them also think he's being a pain in the neck). - Various peripherally related topics, such as optimizing count(*), which is not part of the vision for the first go-round that I sketched out in my OP, and plan stability, which is another issue entirely, have been discussed. - Meanwhile, only one person has done any review of the actual code that's been written, which is posted on the crash-safe visibility map thread, which may be why multiple people seem confused about what it does. - And no one has done any benchmarking of that code. I think it would be really helpful if some more people would review the crash-safe visibility map patch, and if at least one person could benchmark it. It would be useful to know (a) whether that noticeably slows down the system during inserts, updates, and deletes, especially at very high concurrency; and (b) how much of an impact the additional WAL-logging has on VACUUM. On the other hand, arguing about whether index-only scans are going to result in a significant performance benefit is not useful. I am going to be both surprised and disappointed if they don't, but there's only one way to find out, and a theoretical argument isn't it. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feature proposal: distinguish each PostgreSQL instance in the event log
From: Dave Page dp...@pgadmin.org On Tue, May 10, 2011 at 11:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: BTW, what will this accomplish exactly that couldn't be accomplished by setting log_line_prefix to include the desired identifier? Windows uses the event source field to show where events in the logs have come from. The Event Viewer and other network management tools expect to be able to use the field for sorting and filtering etc, so having a different value for different PG instances allows those tools to distinguish between them properly. It's also useful for ISVs who may want to make an embedded PG instance used by their apps identify itself differently from other instances. Thank you for the help, Dave. The Windows Event Viewer can filter event log records by event source, computer, user, date/time, etc., but not by message body. So log_line_prefix cannot be used for filtering. Even if filtering by message body were possible, using the event source field to filter applications is more natural and intuitive. I would appreciate comments from all on the following items mentioned previously. If no comments, I'll do as follows and try to submit the patch a week later or so if I have enough time. 1. Documentation Move the event log registration/deregistration article to the following section. Add a link to this article from the log_destination description. 16.1.4. Cleaning and installing http://www.postgresql.org/docs/9.0/static/install-windows-full.html 2. pg_ctl's event logging Do not change anything - that is, use the fixed event source PostgreSQL. How do we explain the reason for not using the event_source parameter in postgresql.conf? postgresql.conf is for the server. pg_ctl uses the fixed event source. This is a specification. This is not a problem because starting/stopping the database/application is infrequent once the system is in steady operation. -- this may sound abrupt, though.
Regards, MauMau - Original Message - From: Dave Page dp...@pgadmin.org To: Tom Lane t...@sss.pgh.pa.us Cc: MauMau maumau...@gmail.com; pgsql-hackers@postgresql.org Sent: Wednesday, May 11, 2011 4:58 PM Subject: Re: [HACKERS] Feature proposal: distinguish each PostgreSQL instance in the event log On Tue, May 10, 2011 at 11:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: MauMau maumau...@gmail.com writes: MauMau maumau...@gmail.com writes: I've encountered one problem on Windows. I need to support running all of my products on one host simultaneously. Plus, I need to log messages in syslog/event log. On Linux, I can distinguish the messages of one product and those of other products by setting syslog_ident in postgresql.conf. On the other hand, I can't do that on Windows: all of the PostgreSQL instances use the same fixed event source name PostgreSQL. SO I'd like to propose a trivial feature which allows users to set event source. BTW, what will this accomplish exactly that couldn't be accomplished by setting log_line_prefix to include the desired identifier? Windows uses the event source field to show where events in the logs have come from. The Event Viewer and other network management tools expect to be able to use the field for sorting and filtering etc, so having a different value for different PG instances allows those tools to distinguish between them properly. It's also useful for ISVs who may want to make an embedded PG instance used by their apps identify itself differently from other instances. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] time-delayed standbys
On 11.05.2011 14:16, Fujii Masao wrote: On Wed, May 11, 2011 at 6:50 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I think we can just always call ShutdownWalRcv(). It should be gone if the server was promoted while streaming, but that's just an implementation detail of what the promotion code does. There's no hard reason why it shouldn't be running at that point anymore, as long as we kill it before going any further. Okay. But I'd like to add the following assertion check just before the ShutdownWalRcv() which you added, in order to detect the sort of bug we found this time, i.e., a bug which causes unexpected end of recovery. Thoughts? Assert(reachedStopPoint || !WalRcvInProgress()) There's no unexpected end of recovery here. The recovery ends when we reach the target, as it should. It was the assumption that WAL receiver can't be running at that point anymore that was wrong. That assertion would work, AFAICS, but I don't think it's something we need to assert. There isn't any harm done if WAL receiver is still running, as long as we shut it down at that point. Committed a patch to do that. Thanks. Should we backport it to 9.0? 9.0 has the same problem. Ah, thanks, missed that. Cherry-picked to 9.0 now as well. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VARIANT / ANYTYPE datatype
On Tue, May 10, 2011 at 10:29 PM, Joseph Adams joeyadams3.14...@gmail.com wrote: It seems to me a reasonable way to implement VARIANT would be to have a data type called VARIANT that stores an OID of the inner type at the beginning, followed by the binary data. That's likely to be how it gets implemented, but you seem to have missed the point of some of the discussion upthread: the big problem with that is that someone might type DROP TYPE foo, and when they do, you need an efficient way to figure out whether foo is in use inside an instance of the variant type anywhere in the system. The devil is in the details... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
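Joseph's proposed layout, a type identifier followed by the value's binary data, amounts to a simple tagged encoding. Here is a hypothetical Python sketch of that idea (not PostgreSQL's actual varlena format; the 4-byte little-endian OID header is an illustrative assumption), which also makes Robert's objection concrete:

```python
import struct

def pack_variant(type_oid: int, payload: bytes) -> bytes:
    """Toy encoding: a 4-byte type OID header, then the value's bytes."""
    return struct.pack("<I", type_oid) + payload

def unpack_variant(data: bytes):
    """Split a toy variant back into (type_oid, payload)."""
    (type_oid,) = struct.unpack_from("<I", data)
    return type_oid, data[4:]

TEXTOID = 25  # the pg_type OID of text in stock catalogs
blob = pack_variant(TEXTOID, b"hello")
oid, payload = unpack_variant(blob)
print(oid, payload)

# Robert's objection in one line: with the OID buried inside each stored
# value, DROP TYPE must either scan every variant value in the database
# or track dependencies some other way to learn whether the OID is in use.
```

The follow-up discussion of declared variant types is essentially about moving that dependency information somewhere the DROP TYPE machinery can find it cheaply.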
[HACKERS] Patch to allow domains over composite types
typecmds.c says: Domains over composite types might be made to work in the future, but not today. Attached is a patch that allows domains over composite types, together with test cases in domaincomp.sql. A domain over a composite type has typtype TYPTYPE_DOMAIN, but typrelid and typrelkind are empty: that information is only available in the pg_type record of the base type. The remainder of the patch follows from that choice. While parsing a record expression into a row type, an extra coercion node had to be inserted to ensure that the domain checks are called. All regression tests are ok; comments are highly appreciated. -- Yeb Havinga http://www.mgrid.net/ Mastering Medical Data

diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 4f4dd69..1c8dc64 100644
*** a/src/backend/access/heap/tuptoaster.c
--- b/src/backend/access/heap/tuptoaster.c
*** 36,41 ****
--- 36,42 ----
  #include "access/xact.h"
  #include "catalog/catalog.h"
  #include "utils/fmgroids.h"
+ #include "utils/lsyscache.h"
  #include "utils/pg_lzcompress.h"
  #include "utils/rel.h"
  #include "utils/typcache.h"
*** toast_flatten_tuple_attribute(Datum valu
*** 965,971 ****
  	 * Break down the tuple into fields.
  	 */
  	olddata = DatumGetHeapTupleHeader(value);
! 	Assert(typeId == HeapTupleHeaderGetTypeId(olddata));
  	Assert(typeMod == HeapTupleHeaderGetTypMod(olddata));
  	/* Build a temporary HeapTuple control structure */
  	tmptup.t_len = HeapTupleHeaderGetDatumLength(olddata);
--- 966,973 ----
  	 * Break down the tuple into fields.
  	 */
  	olddata = DatumGetHeapTupleHeader(value);
! 	Assert((typeId == HeapTupleHeaderGetTypeId(olddata))
! 		   || (getBaseType(typeId) == HeapTupleHeaderGetTypeId(olddata)));
  	Assert(typeMod == HeapTupleHeaderGetTypMod(olddata));
  	/* Build a temporary HeapTuple control structure */
  	tmptup.t_len = HeapTupleHeaderGetDatumLength(olddata);
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index f8eb5bc..c9aef53 100644
*** a/src/backend/commands/typecmds.c
--- b/src/backend/commands/typecmds.c
*** DefineDomain(CreateDomainStmt *stmt)
*** 814,827 ****
  	basetypeoid = HeapTupleGetOid(typeTup);

  	/*
! 	 * Base type must be a plain base type, another domain or an enum. Domains
! 	 * over pseudotypes would create a security hole. Domains over composite
! 	 * types might be made to work in the future, but not today.
  	 */
  	typtype = baseType->typtype;
  	if (typtype != TYPTYPE_BASE &&
  		typtype != TYPTYPE_DOMAIN &&
! 		typtype != TYPTYPE_ENUM)
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("\"%s\" is not a valid base type for a domain",
--- 814,827 ----
  	basetypeoid = HeapTupleGetOid(typeTup);

  	/*
! 	 * Base type must be a plain base type, composite type, another domain or
! 	 * an enum. Domains over pseudotypes would create a security hole.
  	 */
  	typtype = baseType->typtype;
  	if (typtype != TYPTYPE_BASE &&
  		typtype != TYPTYPE_DOMAIN &&
! 		typtype != TYPTYPE_ENUM &&
! 		typtype != TYPTYPE_COMPOSITE)
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("\"%s\" is not a valid base type for a domain",
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index 0418972..39a1835 100644
*** a/src/backend/parser/parse_coerce.c
--- b/src/backend/parser/parse_coerce.c
*** coerce_type(ParseState *pstate, Node *no
*** 369,382 ****
  		return result;
  	}
  	if (inputTypeId == RECORDOID &&
! 		ISCOMPLEX(targetTypeId))
  	{
  		/* Coerce a RECORD to a specific complex type */
  		return coerce_record_to_complex(pstate, node, targetTypeId,
  										ccontext, cformat, location);
  	}
  	if (targetTypeId == RECORDOID &&
! 		ISCOMPLEX(inputTypeId))
  	{
  		/* Coerce a specific complex type to RECORD */
  		/* NB: we do NOT want a RelabelType here */
--- 369,382 ----
  		return result;
  	}
  	if (inputTypeId == RECORDOID &&
! 		isComplex(targetTypeId))
  	{
  		/* Coerce a RECORD to a specific complex type */
  		return coerce_record_to_complex(pstate, node, targetTypeId,
  										ccontext, cformat, location);
  	}
  	if (targetTypeId == RECORDOID &&
! 		isComplex(inputTypeId))
  	{
  		/* Coerce a specific complex type to RECORD */
  		/* NB: we do NOT want a RelabelType here */
*** can_coerce_type(int nargs, Oid *input_ty
*** 478,491 ****
  		 * coerce (may need tighter checking here)
  		 */
  		if (inputTypeId == RECORDOID &&
! 			ISCOMPLEX(targetTypeId))
  			continue;

  		/*
  		 * If input is a composite type and target is RECORD, accept
  		 */
  		if (targetTypeId == RECORDOID &&
! 			ISCOMPLEX(inputTypeId))
  			continue;

  #ifdef NOT_USED	/* not implemented yet */
--- 478,491 ----
  		 * coerce (may need tighter checking here)
  		 */
  		if (inputTypeId == RECORDOID &&
! 			isComplex(targetTypeId))
  			continue;

  		/*
  		 * If input is a
Re: [HACKERS] crash-safe visibility map, take five
On Tue, May 10, 2011 at 7:38 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, May 10, 2011 at 9:45 AM, Merlin Moncure mmonc...@gmail.com wrote: I see: here's a comment that was throwing me off: + /* +* If we didn't get the lock and it turns out we need it, we'll have to +* unlock and re-lock, to avoid holding the buffer lock across an I/O. +* That's a bit unfortunate, but hopefully shouldn't happen often. +*/ I think that might be phrased as didn't get the pin and it turns out we need it because the bit can change after inspection. The visible bit isn't 'wrong' as suggested in the comments, it just can change so that it becomes wrong. Maybe a note of why it could change would be helpful. Oh, I see. I did write lock when I meant pin, and your other point is well-taken as well. Here's a revised version with some additional wordsmithing. Some trivial comments. Why do the set and clear functions need a pass-by-reference (Buffer *) argument? I don't see them modifying the argument at all. This patch adds the clear function, but the existing set function also suffers from that. There are several invocations of pin/clear/release combos. Maybe you would want a convenience routine for doing that in a single step, or just pass InvalidBuffer to clear(), in which case it would assume that the vm buffer is not pinned and do the needful. The comment at the top of visibilitymap_pin_ok says "On entry, *buf", but the function really takes just a buf. You could possibly fold visibilitymap_pin_ok() into a macro (and also name it slightly differently, like visibilitymap_is_pinned?). Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] VARIANT / ANYTYPE datatype
On 05/11/2011 07:53 AM, Robert Haas wrote: On Tue, May 10, 2011 at 10:29 PM, Joseph Adams joeyadams3.14...@gmail.com wrote: It seems to me a reasonable way to implement VARIANT would be to have a data type called VARIANT that stores an OID of the inner type at the beginning, followed by the binary data. That's likely to be how it gets implemented, but you seem to have missed the point of some of the discussion upthread: the big problem with that is that someone might type DROP TYPE foo, and when they do, you need an efficient way to figure out whether foo is in use inside an instance of the variant type anywhere in the system. The devil is in the details... Actually, I thought that was the variant (pun intended) Tom objected to, and wanted declared variant types instead, precisely so we could know where we'd need to look if a type was dropped. Using that you might still use the type oid at the beginning of the field, but there are other possibilities too (e.g. a single byte indicating which known variant is in use). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] the big picture for index-only scans
Nicolas Barbier wrote: 2011/5/11, Bruce Momjian br...@momjian.us: FYI, because the visibility map is only one _bit_ per page, it is 8000 * 8 or 64k times smaller than the heap, e.g. one 8k page covers 64MB of heap pages. Actually, that would be one 8kB block covers 512MB of heap: 1 block of visibility map (8kB) = 64k visibility bits = covers 64k blocks = covers 512MB of heap. The cost of keeping the visibility map in cache is therefore totally negligible, only the cost of WAL logging changes to it is of interest. Ah, yes, thanks, even better. This is important because we rely on this compactness in hope that the WAL logging of this information will not be burdensome. The size of one entry in the map (1 bit) is not very related to the WAL overhead required per change of such a bit (i.e., the log record for a 1 bit change will certainly be way more than 1 bit). True. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] the big picture for index-only scans
Simon Riggs wrote: On Wed, May 11, 2011 at 1:47 AM, Bruce Momjian br...@momjian.us wrote: Greg Stark wrote: On a separate note though, Simon, I don't know what you mean by we normally start with a problem. It's a free software project and people are free to work on whatever interests them whether that's because it solves a problem they have, helps a client who's paying them, or just because it's of academic interest to them. We don't always take their patches if they aren't of general interest but people propose all kinds of crazy experimental ideas all the time. I am confused by Simon's questions too. Simon seems to regularly argue for adding features late in the development cycle and backpatch things no one else thinks should be backpatched, but he wants more research that index-only scans are going to improve things before it is implemented? The first is aggressive development, the second is very conservative development --- they don't match, so I now wonder what the motivation is since it isn't consistent. Not really sure why reasonable technical skepticism should become personal commentary. You don't question Tom's motives if he is skeptical of an idea of mine. Why would you question my motivation? What is *your* motive for acting like that? Tom is consistent in his level of aggressive/conservative development suggestions. What I am seeing are many cases where you are consistently pushing for something even though you get almost-overwhelming rejection, and you keep going. And if it was consistent in one direction, I could understand because maybe you feel we are too conservative, but if it isn't consistent, I have no idea how to learn or adjust to your approach. We clearly have some people on one side of the conservative/aggressive spectrum, and some on the other side. Now, I am willing to admit I might be totally wrong, but it has risen to a level that I felt I should say something in case it is helpful.
I'm not driven by one setting of conservatism, but I am interested in adding fully usable features that bring credit to the project. If I see a feature that can have minor things added to it to improve them, then I raise that during beta. If I see things being worked out that sound dubious, I mention that in early development. Yes, that seems fine to me, as stated. I don't think this work will materially improve the speed of count(*) in the majority of cases. This introduces extra overhead into the code path and that can be a net loss. I think this is the only hope we have of improving count(*) in an active MVCC system. It might not work, but it has been our only hope of improvement of count(*) for a while. The only time it will help is when you have a large table that is not cached and also not recently updated. Is count(*) run very often against such tables? Do we really care enough to optimise that use case with lots of special purpose code? The very fact that Kevin and yourself bring up different reasons for why we need this feature makes me nervous. Yes, no question. For count(*), you don't care about the indexed values, only the count, while for Kevin's case you are reading values from the index. I assume (or hope) that one or both will be a win for this feature. The analysis has not been done yet, and all I have done is request that. I think we are going to have to write the code and see the performance hit and where it is a win. Ideally we could figure this out beforehand, but I don't think that is possible in this case. If you look at the research in reducing the load of updating the hint bits, again, it is so complex that only working code and testing is showing if there is possible improvement there. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true.
+
Re: [HACKERS] Re: 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays
Noah Misch n...@leadboat.com writes: On Mon, May 09, 2011 at 11:32:28PM -0400, Tom Lane wrote: So we basically had three alternatives to make it better: * downcast to the array type, which would possibly silently break applications that were relying on the function result being considered of the domain type * re-apply domain checks on the function result, which would be a performance hit and possibly again result in unobvious breakage * explicitly break it by throwing a parse error until you downcast (and then upcast the function result if you want) I realize that #3 is a bit unpleasant, but are either of the other two better? At least #3 shows you where you need to check for problems. Though I've never used a domain over an array type, I'd strongly prefer #2. Hmm. I hadn't seriously considered that alternative, but we could go in that direction. Logically, what this would probably imply is inserting CastToDomain whenever the result of a polymorphic function is deemed to be of a domain type, whether the base type is array or not. The reason I hadn't taken it very seriously is that I don't think it's actually going to end up being consistent. If we don't do #1 (downcast polymorphic arguments to a base type), but consider the arguments passed to the function to be of the domain type, then really we have to expect the polymorphic function to enforce domain constraints internally; we cannot fix it with something as localized as having the function call parser stick a CastToDomain on top. Here's a possibly rather silly example:

	create function negate(anyelement) returns anyelement
	  as $$ select - $1 $$ language sql;
	create domain pos as int check (value > 0);
	select negate(42::pos);

This negate() function will work for any type that has a unary minus operator. But the result of the unary minus operation cannot sanely be considered to be of this domain type.
In this simplified example you might feel it doesn't matter, since with an external CastToDomain we'd throw error anyway a moment later, as soon as control comes back from the function. But what if the function does further operations with the value, such as passing it to another polymorphic function? So really, if you go down this path, you end up concluding that PLs supporting polymorphic arguments had better be prepared to enforce domain constraints all the way through, and thus there should be no need for an external CastToDomain --- what comes back from the function ought to be checked already. Unfortunately, even if the PLs do that (SQL functions might get it right, but I'm not real sure whether plpgsql is water-tight on this, and I don't trust the other PLs for it at all), there's no way that built-in polymorphic functions like array_append are going to. So on the whole, #2 looks like an implementation quagmire to me: it's not clear what to check, or where, or how you know when you're done. I'm not willing to volunteer my own time to make it work that way. If somebody else who uses domains a lot wants to step up and take responsibility, go for it. regards, tom lane
Re: [HACKERS] the big picture for index-only scans
Cédric Villemain wrote: 2011/5/10 Kevin Grittner kevin.gritt...@wicourts.gov: Simon Riggs si...@2ndquadrant.com wrote: The typical speed up for non-covered indexes will come when we access a very large table (not in cache) via an index scan that is smaller than a bitmap index scan. Will we be able to gauge selectivities sufficiently accurately to be able to pinpoint that during optimization? How will we know that the table is not in cache? Or is this an optimisation in the executor for a bitmap heap scan? I would continue to object to using current cache contents for plan choice because of plan instability and the fact that an odd initial cache load could skew plans in a bad direction indefinitely. I do agree (and have already posted) that I think the hardest part of this might be developing a good cost model. I doubt that's an insoluble problem, especially since it is something we can refine over time as we gain experience with the edge cases. you will have the same possible instability in planning with the index(-only?) scan because we may need to access heap anyway and this need is based on estimation, or I miss something ? I understood the idea was just to bypass the heap access *if* we can for *this* heap-page. In reality, I am not really scared by plan instability because of a possible PG/OS cache estimation. The percentages remain stable in my observations ... I don't know yet how it will go for vis map. And, we already have plan instability currently, which is *good* : at some point a seq scan is better than a bitmap heap scan. Because the relation size change and because ANALYZE re-estimate the distribution of the data. I will be very happy to issue ANALYZE CACHE as I have to ANALYZE temp table for large query if it allows the planner to provide me the best plan in some scenario...but this is another topic, sorry for the digression.. Good point --- we would be making plan decisions based on the visibility map coverage.
The big question is whether visibility map changes are more dynamic than the values we already plan against, like rows in the table, table size, and value distributions. I don't know the answer. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] PGC_S_DEFAULT is inadequate
On Wed, May 11, 2011 at 3:20 AM, Tom Lane t...@sss.pgh.pa.us wrote: So this is fine if the current value was from the file or was the boot_val, but if we'd overridden the boot value with a replacement default value using PGC_S_DEFAULT, that code would cause the value to revert to the boot_val not the replacement value. Not desirable. Doesn't this mean if you had a setting of timezone in your config file and remove it and reload you'll get a different value than you would have if you had actually booted without the line originally? My conclusion about all this is that we really need to invent another GucSource value falling between PGC_S_DEFAULT and PGC_S_ENV_VAR, called perhaps PGC_S_DYNAMIC_DEFAULT LGTM -- greg
Re: [HACKERS] the big picture for index-only scans
Bruce Momjian br...@momjian.us wrote: The very fact that Kevin and yourself bring up different reasons for why we need this feature makes me nervous. Yes, no question. For count(*), you don't care about the indexed values, only the count, while for Kevin's case you are reading values from the index. [sigh] I'm reluctant to draw out this digression further, but there is a possibly-useful point to be made here: these are not two different things. A covering index can be considered whenever the set of columns referenced in the query is contained inside the set of columns in the index. The fact that the set of columns needed by count(*) is the empty set merely means that it is covered by any index, since the empty set is contained in every set. Now, this special case may make for an easy initial target in implementation, or allow early benchmarking. If so, all the better to go there first. I'm not sure why anyone would stop there, though; if it pays off for that simple case it is likely to pay off for the more general case, too. -Kevin
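Kevin's point can be restated as a simple set-containment check; a small illustrative sketch (the helper name is hypothetical, not actual planner code):

```python
def index_covers(query_columns, index_columns):
    """True when every column the query references is present in the index,
    i.e. the index could satisfy the query without touching the heap
    (visibility checks aside)."""
    return set(query_columns) <= set(index_columns)

# A query touching (id, name) is covered by an index on (id, name, created_at):
print(index_covers({"id", "name"}, {"id", "name", "created_at"}))  # True

# count(*) references no columns; the empty set is a subset of every set,
# so any index at all covers it:
print(index_covers(set(), {"id"}))  # True
```

This is why count(*) is not a separate feature but the degenerate case of index coverage.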
Re: [HACKERS] the big picture for index-only scans
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 10.05.2011 20:15, Simon Riggs wrote: I can picture that. Regrettably, I can also picture the accesses to the visibility map, the maintenance operations on the VM that are needed for this and the contention that both of those will cause. I agree that we need to do tests to demonstrate that there's a gain from the patch, once we have a patch to test. I would be very surprised if there isn't, but that just means the testing is going to be easy. I think Simon's point is that showing a gain on specific test cases isn't a sufficient argument. What we need to know about this sort of change is what is the distributed overhead that is going to be paid by *everybody*, whether their queries benefit from the optimization or not. And what fraction of real-world queries really do benefit, and to what extent. Isolated test cases (undoubtedly chosen to show off the optimization) are not adequate to form a picture of the overall cost and benefit. regards, tom lane
[HACKERS] Fix for bug in ldapServiceLookup in libpq
I have found a small but annoying bug in libpq where connection parameters are resolved via LDAP. There is a write past the end of a malloc'ed string which causes memory corruption. The code and the bug are originally by me :^( The attached patch fixes the problem in HEAD. This should be backpatched to 8.2 where the code was introduced. Yours, Laurenz Albe ldapServiceLookup.patch
Re: [HACKERS] the big picture for index-only scans
Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 10.05.2011 20:15, Simon Riggs wrote: I can picture that. Regrettably, I can also picture the accesses to the visibility map, the maintenance operations on the VM that are needed for this and the contention that both of those will cause. I agree that we need to do tests to demonstrate that there's a gain from the patch, once we have a patch to test. I would be very surprised if there isn't, but that just means the testing is going to be easy. I think Simon's point is that showing a gain on specific test cases isn't a sufficient argument. What we need to know about this sort of change is what is the distributed overhead that is going to be paid by *everybody*, whether their queries benefit from the optimization or not. And what fraction of real-world queries really do benefit, and to what extent. Isolated test cases (undoubtedly chosen to show off the optimization) are not adequate to form a picture of the overall cost and benefit. Yes, I assume we are going to need the same kind of tests we did for other invasive patches like serializable isolation level and hot standby. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] PGC_S_DEFAULT is inadequate
Greg Stark gsst...@mit.edu writes: On Wed, May 11, 2011 at 3:20 AM, Tom Lane t...@sss.pgh.pa.us wrote: So this is fine if the current value was from the file or was the boot_val, but if we'd overridden the boot value with a replacement default value using PGC_S_DEFAULT, that code would cause the value to revert to the boot_val not the replacement value. Not desirable. Doesn't this mean if you had a setting of timezone in your config file and remove it and reload you'll get a different value than you would have if you had actually booted without the line originally? Yes, except it's timezone_abbreviations (and client_encoding) that are at risk. I think the latter is probably observably broken in existing releases, though I've not tried to test. Given the minuscule use-case for setting client_encoding in postgresql.conf, I'm not concerned about back-patching a fix, but I'd like to make it work better going forward. regards, tom lane
Re: [HACKERS] hint bit cache v5
On Tue, May 10, 2011 at 11:59 AM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, May 9, 2011 at 5:12 PM, Merlin Moncure mmonc...@gmail.com wrote: I'd like to know if this is a strategy that merits further work...If anybody has time/interest that is. It's getting close to the point where I can just post it to the commit fest for review. In particular, I'm concerned if Tom's earlier objections can be satisfied. If not, it's back to the drawing board... I'm interested in what you're doing here. From here, there's quite a lot of tuning possibilities. It would be very useful to be able to define some metrics we are interested in reducing and working out how to measure them. Following are results that are fairly typical of the benefits you might see when the optimization kicks in. The attached benchmark just creates a bunch of records in a random table and scans it. This is more or less the scenario that causes people to gripe about hint bit i/o, especially in systems that are already under moderate to heavy i/o stress. I'm gonna call it for 20%, although it could be less if you have an i/o system that spanks the test (try cranking -c and the creation # records in bench.sql in that case). Anecdotal reports of extreme duress caused by hint bit i/o suggest problematic or mixed use (OLTP + OLAP) workloads might see even more benefit. One thing I need to test is how much benefit you'll see with wider records. I think I'm gonna revert the change to cache invalid bits. I just don't see hint bits as a major contributor to dead tuples following epic rollbacks (really, the solution for that case is simply to try and not get in that scenario if you can). This will put the code back into the cheaper and simpler bit per transaction addressing. What I do plan to do though, is to check and set xmax commit bits in the cache...that way deleted tuples will see cache benefits.
[hbcache]
merlin@mmoncure-ubuntu:~$ time pgbench -c 4 -n -T 200 -f bench.sql
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 1
duration: 200 s
number of transactions actually processed: 8
tps = 0.037167 (including connections establishing)
tps = 0.037171 (excluding connections establishing)

real    3m35.549s
user    0m0.008s
sys     0m0.004s

[HEAD]
merlin@mmoncure-ubuntu:~$ time pgbench -c 4 -n -T 200 -f bench.sql
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 1
duration: 200 s
number of transactions actually processed: 8
tps = 0.030313 (including connections establishing)
tps = 0.030317 (excluding connections establishing)

real    4m24.216s
user    0m0.000s
sys     0m0.012s

bench.sql, bench_setup.sql
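For reference, the improvement implied by those two runs (same figures as above) works out to roughly the 20% Merlin quotes:

```python
# tps figures from the two runs (excluding connection establishment)
tps_hbcache = 0.037171
tps_head = 0.030317

speedup_pct = (tps_hbcache / tps_head - 1) * 100
print(f"throughput gain: {speedup_pct:.1f}%")   # ~22.6%

# wall-clock times: 3m35.549s (hbcache) vs 4m24.216s (HEAD)
t_hbcache = 3 * 60 + 35.549
t_head = 4 * 60 + 24.216
print(f"elapsed-time reduction: {(1 - t_hbcache / t_head) * 100:.1f}%")  # ~18.4%
```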
Re: [HACKERS] VARIANT / ANYTYPE datatype
On Wed, May 11, 2011 at 7:53 AM, Robert Haas robertmh...@gmail.com wrote: That's likely to be how it gets implemented, but you seem to have missed the point of some of the discussion upthread: the big problem with that is that someone might type DROP TYPE foo, and when they do, you need an efficient way to figure out whether foo is in use inside an instance of the variant type anywhere in the system. The devil is in the details... Sorry, I missed that. That in mind, I think I would lean more toward the union proposal as well. Can anyone think of a case where VARIANT would be more useful? As for using one or two bytes to store the type of a UNION, that creates a problem when you want to extend the union in the future. That is, if a UNION is simply a collection of possible types values of the UNION type can hold. If UNION is implemented more like a tagged union: CREATE TYPE token AS TAGGED UNION (identifier TEXT, keyword TEXT, number INT); Then the problem of altering it is much like the problem of altering an ENUM. On Tue, May 10, 2011 at 5:19 PM, Darren Duncan dar...@darrenduncan.net wrote: Examples of open union types could be number, which all the numeric types compose, and so you can know say that you can use the generic numeric operators on values you have simply if their types compose the number union type, and it still works if more numeric types appear later. Likewise, the string open union could include both text and blob, as both support catenation and substring matches or extraction, for example. This would aid to operator overloading in a generic way, letting you use the same syntax for different types, but allowing types to mix is optional; eg, you could support add(int,int) and add(real,real) without supporting add(int,real) etc but the syntax add(x,y) is shared, and you do this while still having a strong type system; allowing the mixing is optional case-by-case. 
Coming from a Haskell perspective, this is a great idea, but I don't think the union feature should be used to implement it. Closed unions correspond to algebraic data types in Haskell, e.g.:

	data Ordering = LT | EQ | GT

while open unions are better-suited to type classes:

	(+) :: (Num a) => a -> a -> a

I, for one, would like to see PostgreSQL steal some features from Haskell's type system. PostgreSQL seems to implement a subset of Haskell's system, without type classes and where functions can have only one type variable (anyelement). To express the (+) example in PostgreSQL, it would be tempting to simply say:

	add(real, real) returns real

However, what if each real is a different type (e.g. INT and FLOAT)? Is that allowed? In the Haskell example above, (+) constrains both of its arguments to the same type. In ad-hoc syntax, it would look like this in PostgreSQL:

	real anyelement => add(anyelement, anyelement) returns anyelement

Another thing to consider: attempting to use a type class as a column type, e.g.:

	CREATE TABLE foo (n real);

Normally in Haskell, type information is passed implicitly as parameters (hence the term parametric polymorphism), rather than carried alongside values (like in object-oriented languages). In the case above, the type information would have to be carried with each value. Haskell actually supports this, but under a somewhat-weird extension called Existential types (see http://www.haskell.org/haskellwiki/Existential_type#Examples for an example). It isn't terribly useful in Haskell, and I don't think it will be in PostgreSQL either. Joey Adams
Re: [HACKERS] Fix for bug in ldapServiceLookup in libpq
Albe Laurenz laurenz.a...@wien.gv.at writes: I have found a small but annoying bug in libpq where connection parameters are resolved via LDAP. There is a write past the end of a malloc'ed string which causes memory corruption. The code and the bug are originally by me :^( Hmm ... that's a bug all right, but why have the null-termination inside the loop at all? Seems like it should look like

	for (p = result, i = 0; values[i] != NULL; ++i)
	{
		strncpy(p, values[i]->bv_val, values[i]->bv_len);
		p += values[i]->bv_len;
		*(p++) = '\n';
	}
	*p = '\0';

This should be backpatched to 8.2 where the code was introduced. Yes, will do. regards, tom lane
Re: [HACKERS] the big picture for index-only scans
Tom Lane t...@sss.pgh.pa.us wrote: I think Simon's point is that showing a gain on specific test cases isn't a sufficient argument. Ah, if that's what he's been trying to get at, I'm curious who disagrees with that. I wouldn't have thought anyone on this list would. What we need to know about this sort of change is what is the distributed overhead that is going to be paid by *everybody*, whether their queries benefit from the optimization or not. Certainly we need to test whether Heikki is right in the previously non-quoted part of his post on this thread: Note that we already have the visibility map, and the accesses needed to update it are already there. Granted, we'll have to change the logic slightly to make it crash safe, but I don't expect that to add any meaningful overhead - the changes are going to be where the bits are set, ie. vacuum, not when the bits are cleared. Granted, we might also want to set the bits more aggressively once they're used by index-only-scans. But done correctly, just taking advantage of the VM that's already there shouldn't add overhead to other operations. Isolated test cases (undoubtedly chosen to show off the optimization) are not adequate to form a picture of the overall cost and benefit. Well, first, that hardly seems fair. I have many times seen people make an effort to synthesize *worst* case benchmarks. Certainly any regular on this list would know it is pointless to show only a best case benchmark. Second, we really need to make development of a performance testing farm a priority at PGCon next week. The need for it just keeps coming up over and over. Third, Dan Ports has been working a great deal with DBT-2 running PostgreSQL for the SSI patch, both as a stress tool to flush out bugs and to get benchmark numbers conforming to the published requirements of that benchmark. I know from off-list emails that it took a fair amount of work to get it running correctly with PostgreSQL in his environment.
We should probably try to draw on that experience. (Of course that shouldn't be the *only* test in a performance testing farm, but it's a good one to include.) -Kevin
Re: [HACKERS] VARIANT / ANYTYPE datatype
Excerpts from Simon Riggs's message of mar may 10 17:57:20 -0400 2011: On Wed, May 4, 2011 at 5:58 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote: both Oracle and MS-SQL have it Do they? What types are they called? ANYTYPE -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] VARIANT / ANYTYPE datatype
On Wed, May 11, 2011 at 11:43 AM, Joseph Adams joeyadams3.14...@gmail.com wrote: On Tue, May 10, 2011 at 5:19 PM, Darren Duncan dar...@darrenduncan.net wrote: Examples of open union types could be number, which all the numeric types compose, and so you can know say that you can use the generic numeric operators on values you have simply if their types compose the number union type, and it still works if more numeric types appear later. Likewise, the string open union could include both text and blob, as both support catenation and substring matches or extraction, for example. This would aid to operator overloading in a generic way, letting you use the same syntax for different types, but allowing types to mix is optional; eg, you could support add(int,int) and add(real,real) without supporting add(int,real) etc but the syntax add(x,y) is shared, and you do this while still having a strong type system; allowing the mixing is optional case-by-case. Coming from a Haskell perspective, this is a great idea, but I don't think the union feature should be used to implement it. I'm unclear what the point of such a feature would be. A union of all the common numeric types is not much different from the existing type numeric. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Fix for bug in ldapServiceLookup in libpq
... btw, shouldn't this function free the result string when it's done with it? AFAICS that string is not returned to the caller, it's just being leaked. (I'll refrain from asking why it's creating the string in the first place rather than parsing ldap_get_values_len's output as-is ...) regards, tom lane
Re: [HACKERS] hint bit cache v5
On Wed, May 11, 2011 at 10:38 AM, Merlin Moncure mmonc...@gmail.com wrote: One thing I need to test is how much benefit you'll see with wider records. The results are a bit better, around 25% using a similar methodology on ~ 1k wide records. I think I'm gonna revert the change to cache invalid bits. I just don't see hint bits as a major contributor to dead tuples following epic rollbacks what I meant to say here was, I don't see hint bit i/o following rollbacks as a major issue. Point being, I don't see much use in optimizing management of INVALID tuple bits beyond what is already done. Anyways, demonstrating a 'good' case is obviously not the whole story. But what are the downsides? There are basically two: 1) tiny cpu penalty on every heap fetch 2) extremely widely dispersed (in terms of transaction id) unhinted tuples can force the cache to refresh every 100 tuples in the absolute worst case. A cache refresh is a 100 int sort and a loop. For '1', the absolute worst case I can come up with, cpu bound scans of extremely narrow records, the overall cpu usage goes up around 1%. '2' seems just impossible to see in the real world -- and if it does, you are also paying for lots of clog lookups all the way through the slru, and you are having i/o and other issues on top of it. Even if all the stars align and it does happen, all the tuples get hinted and dirtied anyways so it will only happen at most once on that particular set of data. merlin
[HACKERS] Standbys which don't synch to disk?
Fujii, Simon, For 9.1, both master and replica in a sync replication relationship are required to be fsync'ing to disk. I understand why we had to do that for our first cut at synch rep. Do you think, though, that it might become possible to replicate without synch-to-disk for 9.2? The use case I have is cloud hosting, where I'd rather have two or three synchronous standbys than synch to disk. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Standbys which don't synch to disk?
On Wed, May 11, 2011 at 1:12 PM, Josh Berkus j...@agliodbs.com wrote: For 9.1, both master and replica in a sync replication relationship are required to be fsync'ing to disk. I understand why we had to do that for our first cut at synch rep. Do you think, though, that it might become possible to replicate without synch-to-disk for 9.2? The use case I have is cloud hosting, where I'd rather have two or three synchronous standbys than synch to disk. It's already possible to set fsync=off on the standby if you want. If there is an OS-level crash you'll need to rebuild the standby, but in some cases that may be acceptable. And Simon has already written a patch to add a receive mode to sync rep, which I expect will get committed to 9.2. In that mode, the standby can acknowledge the WAL records as soon as they are received, and write them to disk just after. I think we do need some benchmarking there, to figure out whether any changes to the timing of replies are needed in that case. But the basic principle seems sound. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] potential bug in trigger with boolean params
Andres Freund and...@anarazel.de writes: The grammar accepts only a very limited amount of parameters there: Err

TriggerFuncArg:
    Iconst   { char buf[64];
               snprintf(buf, sizeof(buf), "%d", $1);
               $$ = makeString(pstrdup(buf)); }
    | FCONST { $$ = makeString($1); }
    | Sconst { $$ = makeString($1); }
    | BCONST { $$ = makeString($1); }
    | XCONST { $$ = makeString($1); }
    | ColId  { $$ = makeString($1); }

That is integers, floating point, strings, bitstrings, hexstrings and column references (???). How that exact list came to exist I do not know. The documentation for CREATE FUNCTION says arguments: An optional comma-separated list of arguments to be provided to the function when the trigger is executed. The arguments are literal string constants. Simple names and numeric constants can be written here, too, but they will all be converted to strings. The ColId case is meant to cover the simple names proviso, but of course it fails to cover reserved words. We could trivially fix that by writing ColLabel instead of ColId. My initial expectation was that this would bloat the parser, but it seems not to: the backend gram.o is exactly the same size after making the change, and ecpg's preproc.o actually gets smaller (more opportunity to share states?). So I'm inclined to do it, rather than having to document that simple names excludes reserved words. A possibly more interesting question is why BCONST and XCONST were added there. The documentation certainly does not say or suggest that those are legal options, and what's more the behavior could be considered surprising:

regression=# CREATE TRIGGER trig_x_bconst BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(b'1011');
CREATE TRIGGER
regression=# CREATE TRIGGER trig_x_xconst BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(x'1234abcd');
CREATE TRIGGER
regression=# \d+ x
...
Triggers:
    trig_x_bconst BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('b1011')
    trig_x_xconst BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('x1234abcd')

I'm inclined to take those out, because (1) I find it shrinks the generated grammar a tad (these productions *do* add to the size of the state tables), and (2) if we don't, we ought to document this behavior, and I don't want to do that either. I see this as just a change to make in HEAD, it's not appropriate for a back-patch. Objections anyone? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Standbys which don't synch to disk?
It's already possible to set fsync=off on the standby if you want. If there is an OS-level crash you'll need to rebuild the standby, but in some cases that may be acceptable. Yes, generally if there's an OS-level crash on cloud hosting, you've lost the instance anyway. And Simon has already written a patch to add a receive mode to sync rep, which I expected will get committed to 9.2. In that mode, the standby can acknowledge the WAL records as soon as they are received, and write them to disk just after. I think we do need some benchmarking there, to figure out whether any changes to the timing of replies are needed in that case. But the basic principal seems sound. Yes, that's what I'm looking for. The one other thing would be the ability not to fsync the master, which would come out of the whole stream from buffers patch which Fujii was working on. Fujii, is that still something you're working on? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_upgrade and PGPORT
pg_upgrade is a bit schizophrenic concerning the PGPORT environment variable. On the one hand, there is this code in option.c that wants to make use of it:

old_cluster.port = getenv("PGPORT") ? atoi(getenv("PGPORT")) : DEF_PGPORT;
new_cluster.port = getenv("PGPORT") ? atoi(getenv("PGPORT")) : DEF_PGPORT;

On the other hand, check.c will reject a set PGPORT because it's a libpq environment variable. Should we make an exception for PGPORT, like we did for PGCLIENTENCODING? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hint bit cache v5
On Wed, May 11, 2011 at 4:38 PM, Merlin Moncure mmonc...@gmail.com wrote: Following are results that are fairly typical of the benefits you might see when the optimization kicks in. The attached benchmark just [hbcache] real 3m35.549s [HEAD] real 4m24.216s These numbers look very good. Thanks for responding to my request. What people have said historically at this point is ah, but you've just deferred the pain from clog lookups. The best way to show this does what we hope is to run a normal-ish OLTP access to the table that would normally thrash the clog and show no ill effects there either. Run that immediately after the above tests so that the cache and hint bits are both primed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and PGPORT
Peter Eisentraut wrote: pg_upgrade is a bit schizophrenic concerning the PGPORT environment variable. On the one hand, there is this code in option.c that wants to make use of it:

old_cluster.port = getenv("PGPORT") ? atoi(getenv("PGPORT")) : DEF_PGPORT;
new_cluster.port = getenv("PGPORT") ? atoi(getenv("PGPORT")) : DEF_PGPORT;

On the other hand, check.c will reject a set PGPORT because it's a libpq environment variable. Should we make an exception for PGPORT, like we did for PGCLIENTENCODING? Wow, good question. Passing stuff in via libpq is certainly complex. I ran a test and it looks like the command-line flag overrides the PGPORT environment variable:

$ export PGPORT=
$ psql test
psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL."?
$ psql -p 5432 test
psql (9.1beta1)
Type "help" for help.
test=

I assume it is just like PGCLIENTENCODING. PGCLIENTENCODING was easier to ignore because we need it for error messages. Are there other cases we should allow too? A larger question is whether we should just disable all the checks for environment variables. The C comment says:

 * check_for_libpq_envvars()
 *
 * tests whether any libpq environment variables are set.
 * Since pg_upgrade connects to both the old and the new server,
 * it is potentially dangerous to have any of these set.
 *
 * If any are found, will log them and cancel.

I am not sure what to do. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] potential bug in trigger with boolean params
On Wednesday, May 11, 2011 07:25:58 PM Tom Lane wrote: Andres Freund and...@anarazel.de writes: The grammar accepts only a very limited amount of parameters there: Err

TriggerFuncArg:
    Iconst   { char buf[64];
               snprintf(buf, sizeof(buf), "%d", $1);
               $$ = makeString(pstrdup(buf)); }
    | FCONST { $$ = makeString($1); }
    | Sconst { $$ = makeString($1); }
    | BCONST { $$ = makeString($1); }
    | XCONST { $$ = makeString($1); }
    | ColId  { $$ = makeString($1); }

That is integers, floating point, strings, bitstrings, hexstrings and column references (???). How that exact list came to exist I do not know. The documentation for CREATE FUNCTION says arguments: An optional comma-separated list of arguments to be provided to the function when the trigger is executed. The arguments are literal string constants. Simple names and numeric constants can be written here, too, but they will all be converted to strings. The ColId case is meant to cover the simple names proviso, but of course it fails to cover reserved words. We could trivially fix that by writing ColLabel instead of ColId. My initial expectation was that this would bloat the parser, but it seems not to: the backend gram.o is exactly the same size after making the change, and ecpg's preproc.o actually gets smaller (more opportunity to share states?). So I'm inclined to do it, rather than having to document that simple names excludes reserved words. Good. A possibly more interesting question is why BCONST and XCONST were added there. The documentation certainly does not say or suggest that those are legal options, and what's more the behavior could be considered surprising: It seems to have originally been added there by Peter (as BITCONST) and then split by Thomas Lockhart.
See 73874a06 and eb121ba2.

regression=# CREATE TRIGGER trig_x_bconst BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(b'1011');
CREATE TRIGGER
regression=# CREATE TRIGGER trig_x_xconst BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x(x'1234abcd');
CREATE TRIGGER
regression=# \d+ x
...
Triggers:
    trig_x_bconst BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('b1011')
    trig_x_xconst BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE trigger_x('x1234abcd')

Err. Yes, that looks rather strange. And surprising. I'm inclined to take those out, because (1) I find it shrinks the generated grammar a tad (these productions *do* add to the size of the state tables), and (2) if we don't, we ought to document this behavior, and I don't want to do that either. I see this as just a change to make in HEAD, it's not appropriate for a back-patch. I would say the above behaviour is even a bug. But given that I haven't seen/found anybody complaining about it, fixing it properly looks pointless. So yes, HEAD only sounds fine. Objections anyone? Nope. Is there a special reason for not using the normal function calling mechanisms? It looks to me as if it was just done to have an easy way to store it in pg_trigger.tgargs. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Standbys which don't synch to disk?
Josh Berkus j...@agliodbs.com writes: It's already possible to set fsync=off on the standby if you want. If there is an OS-level crash you'll need to rebuild the standby, but in some cases that may be acceptable. ... The one other thing would be the ability not to fsync the master, which would come out of the whole stream from buffers patch which Fujii was working on. Fujii, is that still something you're working on? Huh? Surely you can just set fsync=off on the master if you feel like it. Data integrity not guaranteed, of course, but if you don't care... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] potential bug in trigger with boolean params
Andres Freund and...@anarazel.de writes: Is there a special reason for not using the normal function calling mechanisms? It looks to me as it was just done to have an easy way to store it in pg_trigger.tgargs. Well, this is all very historical, dating from Berkeley days AFAIK. If we had it to do over, I bet we'd do it differently --- but the pain of changing it seems to exceed any likely benefit. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and PGPORT
Bruce Momjian br...@momjian.us writes: A larger question is whether we should just disable all the checks for environment variables. The C comment says:

 * check_for_libpq_envvars()
 *
 * tests whether any libpq environment variables are set.
 * Since pg_upgrade connects to both the old and the new server,
 * it is potentially dangerous to have any of these set.
 *
 * If any are found, will log them and cancel.

I am not sure what to do. Well, the risk mentioned in that comment certainly seems real. An alternative solution that might be more user-friendly is to ensure that the connection strings pg_upgrade uses specify all important options, leaving nothing to be overridden by environment variables. Then you don't need to make the user adjust his environment. Or you could just unsetenv instead of complaining. I would like to think that eventually pg_upgrade won't start a postmaster at all, but connect using something more like a standalone backend. So someday the issue might go away --- but that someday isn't especially close. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and PGPORT
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: A larger question is whether we should just disable all the checks for environment variables. The C comment says:

 * check_for_libpq_envvars()
 *
 * tests whether any libpq environment variables are set.
 * Since pg_upgrade connects to both the old and the new server,
 * it is potentially dangerous to have any of these set.
 *
 * If any are found, will log them and cancel.

I am not sure what to do. Well, the risk mentioned in that comment certainly seems real. An alternative solution that might be more user-friendly is to ensure that the connection strings pg_upgrade uses specify all important options, leaving nothing to be overridden by environment variables. Then you don't need to make the user adjust his environment. Well, they can use the same port number for both servers. In fact I just use the compiled-in default of 5432 when I am testing. No reason they could not supply a value in an environment variable, but it would have to be the same for the old and new server (the two servers never run at the same time). Or you could just unsetenv instead of complaining. I think it is really PGDATA that we certainly can't inherit from an environment variable. I would like to think that eventually pg_upgrade won't start a postmaster at all, but connect using something more like a standalone backend. So someday the issue might go away --- but that someday isn't especially close. That standalone backend is going to have to understand pg_dump SQL output, with \connect, etc. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hint bit cache v5
On Wed, May 11, 2011 at 12:40 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, May 11, 2011 at 4:38 PM, Merlin Moncure mmonc...@gmail.com wrote: Following are results that are fairly typical of the benefits you might see when the optimization kicks in. The attached benchmark just [hbcache] real 3m35.549s [HEAD] real 4m24.216s These numbers look very good. Thanks for responding to my request. What people have said historically at this point is ah, but you've just deferred the pain from clog lookups. Deferred, or eliminated. If any tuple on the page gets updated, deleted, etc., or the table itself is dropped, then you've 'won'...the page with the hint-bit-only change was never booted out to the heap before another substantive change happened. This is exactly what happens in certain common workloads -- you insert a bunch of records, scan them with some batch process, then delete them. Let's say a million records were inserted under a single transaction and you are getting bounced out of the transam.c cache, you just made a million calls to TransactionIdIsCurrentTransactionId and (especially) TransactionIdIsInProgress for the *exact same* transaction_id, over and over. That stuff adds up even before looking at the i/o incurred. Put another way, the tuple hint bits have a lot of usefulness when the tuples on the page are coming from all kinds of differently aged transactions. When all the tuples have the same or similar xid, the information value is quite low, and the i/o price isn't worth it. The cache neatly haircuts the downside case. If the cache isn't helping (any tuple fetch on the page faults through it), the page is dirtied and the next time it's fetched all the bits will be set. The best way to show this does what we hope is to run a normal-ish OLTP access to the table that would normally thrash the clog and show no ill effects there either. Run that immediately after the above tests so that the cache and hint bits are both primed. yeah.
the only easy way I know of to do this is extremely long pgbench runs, and getting good results is harder than it sounds...if the tuple hint bits make it to disk (via vacuum or a cache fault), they stay there and that tuple is no longer interesting from the cache point of view. If you make the scale really large the test will just take forever just to get the tables primed (like a week). Keep in mind, autovacuum can roll around at any time and set the bits under you (you can of course disable it, but who really does that on OLTP?). Small scale oltp tests are not real world realistic because anybody sane would just let autovacuum loose on the table. clog thrashing systems are typically mature, high load oltp databases...not fun to test on your single 7200 rpm drive. I'm going to boldly predict that with all the i/o flying around in cases like that, the paltry cpu cycles spent dealing with the cache are the least of your problems. Not discounting the need to verify that though. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Tables cannot have INSTEAD OF triggers
Why not? Is there a fundamental problem, or just that no one wanted to make it work? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and PGPORT
On Wed, May 11, 2011 at 2:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: Or you could just unsetenv instead of complaining. +1 for that. I would like to think that eventually pg_upgrade won't start a postmaster at all, but connect using something more like a standalone backend. So someday the issue might go away --- but that someday isn't especially close. And +1 for that, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Standbys which don't synch to disk?
Robert, That WAL has effectively disappeared from the master, but is still present on the slave. Now the master comes up and starts processing read-write transactions again, and generates a new and different 1kB of WAL. Hilarity ensues, because the two machines are now out of step with each other. Yeah, you'd need some kind of instant failover and STONITH. That is, any interruption on the master would be a failover situation. While that seems conceivable for crashes, consider that a planned restart of the master might be an issue, and an OOM-kill would certainly be. You could possibly fix this by making provision for the master to connect to the slave on start-up and stream WAL backwards from slave to master. That'd be pretty spiffy. Ouch, now you're making my head hurt. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unfriendly handling of pg_hba SSL options with SSL off
On Tue, May 10, 2011 at 05:39, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: Late reply, but we are basically ignoring 'local' lines if the build doesn't support unix domain sockets (windows), but throwing an error for hostssl usage if ssl is not compiled in. Is the only logic here that 'local' is part of the default pg_hba.conf and hostssl is not? Is that good logic? I wouldn't have a problem with making the Windows port throw an error for local lines. We'd have to fix initdb to remove that line from the sample file (if it doesn't already), but that's surely not hard. It does already (that's what the @remove-line-for-nolocal@ markup in the sample file is for). So +1 for making it throw an error. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tables cannot have INSTEAD OF triggers
Peter Eisentraut pete...@gmx.net writes: Why not? Is there a fundamental problem, or just that no one wanted to make it work? I'm fairly sure there was a substantive issue, but memory fails as to what it was. You could try removing the error check and see what breaks ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VARIANT / ANYTYPE datatype
Robert Haas wrote: On Wed, May 11, 2011 at 11:43 AM, Joseph Adams joeyadams3.14...@gmail.com wrote: On Tue, May 10, 2011 at 5:19 PM, Darren Duncan dar...@darrenduncan.net wrote: Examples of open union types could be number, which all the numeric types compose, and so you can say that you can use the generic numeric operators on values you have simply if their types compose the number union type, and it still works if more numeric types appear later. Likewise, the string open union could include both text and blob, as both support catenation and substring matches or extraction, for example. This would aid operator overloading in a generic way, letting you use the same syntax for different types, but allowing types to mix is optional; eg, you could support add(int,int) and add(real,real) without supporting add(int,real) etc but the syntax add(x,y) is shared, and you do this while still having a strong type system; allowing the mixing is optional case-by-case. Coming from a Haskell perspective, this is a great idea, but I don't think the union feature should be used to implement it. I'm unclear what the point of such a feature would be. A union of all the common numeric types is not much different from the existing type numeric. In the case of the open union numeric, one point is that users or extensions could come up with new types that add themselves to the union, for example say a type for complex numbers (I didn't see such a built-in). But I'm just citing numeric as an example; there would be a lot more in practice, potentially one for every individual type, so for example if operators were defined for the open union rather than for the base type, then users/extensions could define their own types and easily declare you can use it like this type but it's different in some important way, which may just be an implementation difference.
Operations that don't care about the differences can just be written against the open union type where they just work and those that do care can be more specific. Joseph Adams said: Coming from a Haskell perspective, this is a great idea, but I don't think the union feature should be used to implement it. Closed unions correspond to algebraic data types in Haskell, e.g.:

data Ordering = LT | EQ | GT

while open unions are better-suited to type classes:

(+) :: (Num a) => a -> a -> a

While closed unions would often be used for your first example, I see they're still useful for type classes as well. Especially in the face of open unions being available, the closed unions let users say, no, I don't really want the meaning of this union type to change just because someone else declares a new type in scope (that adds itself to the union). For example, one could declare a system_numeric closed union type that only includes Pg built-in numerics and users of that can be confident that nothing about it will change later unless the definition of system_numeric itself or the types it unions are changed. But open unions would be preferred in places they wouldn't cause trouble, where you want to allow easier user extensibility. I, for one, would like to see PostgreSQL steal some features from Haskell's type system. PostgreSQL seems to implement a subset of Haskell's system, without type classes and where functions can have only one type variable (anyelement). I think that Haskell and other functional languages have a lot to teach relational DBMSs and I see them as being highly compatible. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays
On Wed, May 11, 2011 at 10:22:01AM -0400, Tom Lane wrote: Noah Misch n...@leadboat.com writes: On Mon, May 09, 2011 at 11:32:28PM -0400, Tom Lane wrote: So we basically had three alternatives to make it better:

* downcast to the array type, which would possibly silently break applications that were relying on the function result being considered of the domain type
* re-apply domain checks on the function result, which would be a performance hit and possibly again result in unobvious breakage
* explicitly break it by throwing a parse error until you downcast (and then upcast the function result if you want)

I realize that #3 is a bit unpleasant, but are either of the other two better? At least #3 shows you where you need to check for problems. Though I've never used a domain over an array type, I'd strongly prefer #2. Hmm. I hadn't seriously considered that alternative, but we could go in that direction. Logically, what this would probably imply is inserting CastToDomain whenever the result of a polymorphic function is deemed to be of a domain type, whether the base type is array or not. The reason I hadn't taken it very seriously is that I don't think it's actually going to end up being consistent. If we don't do #1 (downcast polymorphic arguments to a base type), but consider the arguments passed to the function to be of the domain type, then really we have to expect the polymorphic function to enforce domain constraints internally; we cannot fix it with something as localized as having the function call parser stick a CastToDomain on top. Here's a possibly rather silly example:

create function negate(anyelement) returns anyelement as
$$ select - $1 $$ language sql;
create domain pos as int check (value > 0);
select negate(42::pos);

This negate() function will work for any type that has a unary minus operator. But the result of the unary minus operation cannot sanely be considered to be of this domain type.
While simple, I think that example covers the salient features. The git master behavior is sound:

ERROR:  return type mismatch in function declared to return pos
DETAIL:  Actual return type is integer.

If you defined a unary minus operator for the domain type itself, the function implementing that operator would then gain responsibility for preserving any domain constraints. Now, perhaps it's unfortunate that the example can't be easily rewritten to actually work for arbitrary domain inputs (without rewriting it in C). I don't have any particular ideas for improving that. In this simplified example you might feel it doesn't matter, since with an external CastToDomain we'd throw an error anyway a moment later, as soon as control comes back from the function. But what if the function does further operations with the value, such as passing it to another polymorphic function? The SQL PL understands that the value of -($1::pos) has type integer, as does PL/pgSQL. I'm not seeing any problems offhand. So really, if you go down this path, you end up concluding that PLs supporting polymorphic arguments had better be prepared to enforce domain constraints all the way through, and thus there should be no need for an external CastToDomain --- what comes back from the function ought to be checked already. That was my conclusion. I'm not aware of any particular holes in this area, but I won't wager there are none. The code that produces the domain-typed datums is not usually part of the PL implementation, so the PL mostly needs to rigorously track the provenance of its datums. Unfortunately, even if the PLs do that (SQL functions might get it right, but I'm not real sure whether plpgsql is water-tight on this, and I don't trust the other PLs for it at all), there's no way that built-in polymorphic functions like array_append are going to. So on the whole, #2 looks like an implementation quagmire to me: it's not clear what to check, or where, or how you know when you're done.
Every C function is responsible for returning a datum consistent with its declared return type: we do nothing in particular to ensure that a function has done so. I see domain constraints as a natural extension of that rule. If a C function purports to return a domain-typed datum, perhaps through polymorphism, it's responsible for having checked the datum against that domain. Every PL accepts responsibility for enforcing this on behalf of its functions, much as it enforces general consistency between return types and actual returned datums. A function with a polymorphic argument returning the same polymorphic type is consequently responsible for applying domain checks when the concrete type is a domain. (A function with an anyelement argument that returns anyarray is not a problem, because the return type is never a domain. Likewise for a function with an anyarray argument that returns
Re: [HACKERS] VARIANT / ANYTYPE datatype
To follow-up my earlier comments ... I suspect for practical purposes we may want to limit the scope of some type features. For example, the greatest benefits for open union / mixin types is with routines/operators, not so much with tables. So, Pg could choose to support open unions but only for routines, where the declared types of table columns must still be other types. Then you don't have to worry about searching your data for where one might be used, but only search your system catalog for routines/operators that use it. But closed unions should be supported in tables too. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch to allow domains over composite types
On Wed, May 11, 2011 at 8:07 AM, Yeb Havinga yebhavi...@gmail.com wrote: typecmds.c says: Domains over composite types might be made to work in the future, but not today. Attached is a patch that allows domains over composite types, together with test cases in domaincomp.sql. A domain over a composite type has typtype TYPTYPE_DOMAIN, but typrelid and typrelkind are empty: that information is only available in the pg_type record of the base type. The remainder of the patch follows from that choice. While parsing a record expression into a row type, an extra coercion node had to be inserted to ensure that the domain checks are called. All regression tests are ok, comments are highly appreciated. I don't have time to look at this right now, but please add it to the currently-open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collation mega-cleanups
On mån, 2011-05-09 at 14:58 -0400, Bruce Momjian wrote: Tom this collation stuff has seen more post-feature-commit cleanups than I think any patch I remember. Is there anything we can learn from this? Don't do big patches? Seriously, it looks pretty bad, but this is one of the biggest feature patches in the last 5 years, it touches many places all over the system, and there is a reason why this topic has been on the TODO list for 10 years: it's overwhelming. I had aimed for a 75% solution: have something that supports useful cases, that doesn't break anything if you don't use it, and that can be expanded later. Now maybe I only reached 70%, and maybe the baseline should have been 80%, but what we now have is more like 107% and includes a handful of features I had explicitly excluded from the first round. The patch has been around for 10 months, it has been in every commit fest, it has tests and documentation, it has been reviewed a bunch of times, people evidently read (some of) the code, they gave feedback, adjustments have been made (some reverted during later cleanup, go figure), performance was questioned, performance tests were done, adjustments were made, people told me to commit it, so I did, if people had told me to revert it, I would have reverted it. What can we learn from that? The bigger your patch, the lonelier you are. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
On tis, 2011-05-10 at 18:05 -0400, Tom Lane wrote: The lack of initdb support for getting more-or-less-standard collation entries into pg_collation on Windows seems to be the major missing piece from here (dunno if Peter is aware of others). If we don't fix that before release, we're going to regret it anyway IMO, because of the inevitable tide of questions/complaints from Windows users trying to use the collation feature. We've already seen at least one such from a beta tester. Well, someone who wants it will have to do it. It's pretty simple, but not simple enough to code it blindly. If someone wants to do it, I can tell them exactly what to do. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collation mega-cleanups
Peter Eisentraut wrote: from that? The bigger your patch, the lonelier you are. I can attest to that. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collation mega-cleanups
Peter Eisentraut pete...@gmx.net writes: Seriously, it looks pretty bad, but this is one of the biggest feature patches in the last 5 years, it touches many places all over the system, and there is a reason why this topic has been on the TODO list for 10 years: it's overwhelming. Yeah. I did not want to press for reverting, because it seemed clear to me that there was no way that this feature would ever get in if we insisted that it be 100% right when committed. My idea of good enough kept moving the more I looked at the patch, though, and it's still moving --- now I think that we really need to fix the lack of preloaded pg_collation entries for Windows, and then get in a regression test that runs everywhere. So if you want to call that feature creep, go ahead. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
Peter Eisentraut pete...@gmx.net writes: On tis, 2011-05-10 at 18:05 -0400, Tom Lane wrote: The lack of initdb support for getting more-or-less-standard collation entries into pg_collation on Windows seems to be the major missing piece from here (dunno if Peter is aware of others). If we don't fix that before release, we're going to regret it anyway IMO, because of the inevitable tide of questions/complaints from Windows users trying to use the collation feature. We've already seen at least one such from a beta tester. Well, someone who wants it will have to do it. It's pretty simple, but not simple enough to code it blindly. If someone wants to do it, I can tell them exactly what to do. Hm, do you know how to enumerate the available locales on Windows? (Still not volunteering, since I couldn't test it, but that's the only missing piece of information AFAIK.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VARIANT / ANYTYPE datatype
Darren Duncan dar...@darrenduncan.net writes: But I'm just citing numeric as an example; there would be a lot more in practice, potentially one for every individual type, so for example if operators were defined for the open union rather than for the base type, then users/extensions could define their own types and easily declare you can use it like this type but its different in some important way, which may just be an implementation difference. Operations that don't care about the differences can just be written against the open union type where they just work and those that do care can be more specific. I'm just an old-school abstract data type hacker, but I don't see anything in what you're saying that doesn't work today in our existing type system: with overloaded and/or polymorphic operators and functions you can get all those effects. Maybe there would be some small gain in ability to share code for tasks that fall between single-data-type and works-for-anything cases, but it looks like a pretty marginal improvement from here; probably not worth the cost and compatibility implications of a major overhaul of the type system. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Packaging
On Apr 28, 2011, at 2:16 PM, David E. Wheeler wrote: So maybe it's half-assed. Maybe the version can be anything but the revision must be an integer. Maybe there's a `pg_extension_version($extension_name)` function that returns ARRAY[$version, $revision], and the revision is set in the control file but not included in the version or in the upgrade file names. I think I can live with that. But, hell, you're halfway to mandating the meaning by doing this. Will we have to go the rest of the way in the future? Okay, how about we add a revision key to the control file and extrevision to the pg_extension catalog. Its type can be TEXT and is optional for use by extensions. This would allow extension authors to identify not only the base version of an extension but also the revision. And the core doesn't have to care how it works or whether it's used, but it would allow users to know exactly what they have installed. Thoughts? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
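To make the proposal concrete, here is a sketch of what an optional revision key could look like in an extension control file and how a tool might read it back. This is illustrative Python, not the server-side C, and the revision key itself is exactly the hypothetical addition under discussion, not anything core currently defines:

```python
def parse_control(text):
    """Parse simple "key = 'value'" control-file lines into a dict.
    Naive sketch: assumes '#' never appears inside a quoted value."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split('#', 1)[0].strip()  # drop comments
        if not line:
            continue
        key, _, value = line.partition('=')
        settings[key.strip()] = value.strip().strip("'")
    return settings

# Illustrative control file carrying the proposed optional key
control = """\
# hypothetical extension control file
default_version = '1.0'
revision = '2'
"""

parsed = parse_control(control)
```

A user (or packaging tool) could then report `default_version` together with `revision` to show exactly which build of an extension is installed, without the core caring what the revision text means.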
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
On ons, 2011-05-11 at 16:47 -0400, Tom Lane wrote: Hm, do you know how to enumerate the available locales on Windows? EnumSystemLocalesEx() Reference: http://msdn.microsoft.com/en-us/library/dd317829(v=vs.85).aspx Example: http://msdn.microsoft.com/en-us/library/dd319091(v=vs.85).aspx As you can see in the example, this returns names like en-US and es-ES. I would imagine we normalize this to the usual en_US, es_ES (but we could also install the non-normalized names, just like we install en_US.utf8). But you need to rearrange the code in initdb a bit because this thing works with callbacks. There is an older interface EnumSystemLocales() which returns locale IDs, which you then have to look up and convert into a name manually. There is code for that in the old installer CVS on pgfoundry. But it's very ugly, so I'd rather skip that and just concentrate on supporting the newer interface. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
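The normalization step Peter describes is tiny: map the BCP-47-style names EnumSystemLocalesEx reports (en-US, es-ES) to the POSIX-style names initdb would insert into pg_collation (en_US, es_ES). A Python sketch for illustration only; the real code would be C in initdb, and the sample list of names is made up:

```python
def normalize_windows_locale(name):
    """Map a Windows locale name like 'en-US' to the usual 'en_US' form."""
    return name.replace('-', '_')

# Names as a hypothetical EnumSystemLocalesEx callback might deliver them
reported = ['en-US', 'es-ES', 'de-DE']
collations = sorted(normalize_windows_locale(n) for n in reported)
```

As Peter notes, both forms could even be installed side by side, the way en_US.utf8 is installed on Unix.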
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
Peter Eisentraut pete...@gmx.net writes: On ons, 2011-05-11 at 16:47 -0400, Tom Lane wrote: Hm, do you know how to enumerate the available locales on Windows? EnumSystemLocalesEx() Reference: http://msdn.microsoft.com/en-us/library/dd317829(v=vs.85).aspx Example: http://msdn.microsoft.com/en-us/library/dd319091(v=vs.85).aspx Doesn't look too bad ... There is an older interface EnumSystemLocales() which returns locale IDs, which you then have to look up and convert into a name manually. There is code for that in the old installer CVS on pgfoundry. But it's very ugly, so I'd rather skip that and just concentrate on supporting the newer interface. I guess the question is what happens on pre-Vista Windows if we use EnumSystemLocalesEx. I don't object to just not populating pg_collation in that case, but we probably don't want it to fail entirely. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] performance-test farm
Hi everyone, several members of this mailing list mentioned recently it'd be really useful to have a performance-test farm, and that it might improve the development process and make some changes easier. I've briefly discussed this with another CSPUG member, who represents a local company using PostgreSQL for a long time (and that supports CSPUG), and we've agreed to investigate this a bit further. I do have a rough idea what it might look like, but I've never built a performance-testing farm for such a distributed project. So I'd like to know what you would expect from such a beast. Especially 1) Is there something that might serve as a model? I've googled to search if there's some tool, but 'performance-test farm' gave me a lot of info about how to breed cows, pigs and goats on a farm, which is not very useful in this case, I guess. 2) How would you use it? What procedure would you expect? I mean this should produce regular performance tests of the current sources (and publish them on some website), but the whole point is to allow developers to do a performance test of their changes before committing to the main tree. How would you expect to deliver these changes to the farm? How would you define the job? How would you expect to get the results? etc. Just try to write down a list of steps. 3) Any other features expected? If you notice any interesting feature, write it down and note whether it's a 'must have' or a 'nice to have' feature. I really can't promise anything right now - I have just a very rough idea how much time/effort/money this might take. So let's see what is needed to build a 'minimal farm' and whether it's feasible with the resources we can get. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] performance-test farm
Tomas Vondra t...@fuzzy.cz wrote: 1) Is there something that might serve as a model? I've been assuming that we would use the PostgreSQL Buildfarm as a model. http://buildfarm.postgresql.org/ 2) How would you use it? What procedure would you expect? People who had suitable test environments could sign up to periodically build and performance test using the predetermined test suite, and report results back for a consolidated status display. That would spot regressions. It would be nice to have a feature where a proposed patch could be included for a one-time build-and-benchmark run, so that ideas could be tried before commit. It can be hard to anticipate all the differences between Intel and AMD, Linux and Windows, 32 bit and 64 bit, etc. 3) Any other features expected? Pretty graphs? :-) -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
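One way the consolidated status display could spot regressions is for each farm member to be compared against its own recent history. A rough Python sketch of that idea; the data shape, benchmark names, and the 5% threshold are all assumptions for illustration, not any agreed design:

```python
from statistics import median

def flag_regressions(history, latest, threshold=0.05):
    """Return benchmarks whose latest result fell more than `threshold`
    below the median of that benchmark's recent history (higher = better)."""
    flagged = []
    for bench, result in latest.items():
        baseline = median(history[bench])
        if result < baseline * (1 - threshold):
            flagged.append(bench)
    return flagged

# Illustrative tps numbers for one hypothetical farm member
history = {'pgbench-ro': [4900, 5000, 5100], 'pgbench-rw': [800, 820, 810]}
latest = {'pgbench-ro': 5050, 'pgbench-rw': 700}
regressed = flag_regressions(history, latest)  # the rw workload dropped well past 5%
```

Comparing each machine only against itself sidesteps the problem that absolute numbers across heterogeneous volunteer hardware are not comparable.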
Re: [HACKERS] Extension Packaging
On Wed, May 11, 2011 at 5:06 PM, David E. Wheeler da...@kineticode.com wrote: On Apr 28, 2011, at 2:16 PM, David E. Wheeler wrote: So maybe it's half-assed. Maybe the version can be anything but the revision must be an integer. Maybe there's a `pg_extension_version($extension_name)` function that returns ARRAY[$version, $revision], and the revision is set in the control file but not included in the version or in the upgrade file names. I think I can live with that. But, hell, you're halfway to mandating the meaning by doing this. Will we have to go the rest of the way in the future? Okay, how about we add a revision key to the control file and extrevision to the pg_extension catalog. Its type can be TEXT and is optional for use by extensions. This would allow extension authors to identify the base version of an extension but also the revision. And the core doesn't have to care how it works or if it's used, but it would allow users to know exactly what they have installed. Thoughts? How would pg_extension.extrevision be kept up to date? AFAICS, the whole point is that you might swap out the shared libraries without doing anything at the SQL level. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] performance-test farm
Dne 11.5.2011 23:41, Kevin Grittner napsal(a): Tomas Vondra t...@fuzzy.cz wrote: 1) Is there something that might serve as a model? I've been assuming that we would use the PostgreSQL Buildfarm as a model. http://buildfarm.postgresql.org/ Yes, I was thinking about that too, but 1) A buildfarm used for regular building / unit testing IMHO may not be the right place to do performance testing (not sure how isolated the benchmarks can be etc.). 2) Not sure how open this might be for the developers (if they could issue their own builds etc.). 3) If this should be part of the current buildfarm, then I'm afraid I can't do much about it. 2) How would you use it? What procedure would you expect? People who had suitable test environments could sign up to periodically build and performance test using the predetermined test suite, and report results back for a consolidated status display. That would spot regressions. So it would be a 'distributed farm'? Not sure if that's a good idea, as to get reliable benchmark results you need a proper environment (not influenced by other jobs, changes of hw etc.). It would be nice to have a feature where a proposed patch could be included for a one-time build-and-benchmark run, so that ideas could be tried before commit. It can be hard to anticipate all the differences between Intel and AMD, Linux and Windows, 32 bit and 64 bit, etc. Yes, that's one of the main goals - to allow developers to benchmark their patches under various workloads. I don't think we'll be able to get all those configurations, though. 3) Any other features expected? Pretty graphs? :-) Sure. And it will be Web 2.0 ready ;-) Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prefered Types
Excerpts from Tom Lane's message of dom may 08 23:00:27 -0400 2011: For example, if you start noticing an occasional integer overflow that didn't happen before, it might be pretty darn difficult to figure out that the problem is that an operation that was formerly resolved as int4 + int4 is now resolved as int2 + int2. About this particular example, I mentioned earlier that I am preliminarily considering the idea of changing the +(int2,int2) operator to return int4 instead of the current int2. It's not impossible that we will miss changing some operators, but my hope is that it won't be that horrible. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XML with invalid chars
On 05/09/2011 11:25 PM, Noah Misch wrote: I see you've gone with doing it unconditionally. I'd lean toward testing the library in pg_xml_init and setting a flag indicating whether we need the extra pass. However, a later patch can always optimize that. I wasn't terribly keen on the idea, but we can look at it again later. Please review and try to break. Here are the test cases I tried:

-- caught successfully
SELECT E'\x01'::xml;
SELECT xmlcomment(E'\x01');
SELECT xmlelement(name foo, xmlattributes(E'\x01' AS bar), '');
SELECT xmlelement(name foo, NULL, E'\x01');
SELECT xmlforest(E'\x01' AS foo);
SELECT xmlpi(name foo, E'\x01');
SELECT query_to_xml($$SELECT E'\x01'$$, true, false, '');

-- not caught
SELECT xmlroot('<root/>', version E'\x01');

That's an easy fix.

SELECT xmlcomment(E'\ufffe');

That's a bit harder. Do we want to extend these checks to cover surrogates and end of plane characters, which are the remaining forbidden chars? It certainly seems likely to be a somewhat slower test since I think we'd need to process the input strings a Unicode char at a time, but we do that in other places and it seems acceptable. What do people think?

-- not directly related, but also wrongly accepted
SELECT xmlroot('<root/>', version ' ');
SELECT xmlroot('<root/>', version 'foo');

Offhand, I don't find libxml2's handling of XML declarations particularly consistent. My copy's xmlCtxtReadDoc() API (used by xml_in when xmloption = document) accepts '<?xml version="foo"?>' but rejects '<?xml version=" "?>'. Its xmlParseBalancedChunkMemory() API (used by xml_in when xmloption = content) accepts anything, even control characters. The XML 1.0 standard is stricter: the version must match ^1\.[0-9]+$. We might want to tighten this at the same time. We can add some stuff to check the version strings. Doesn't seem terribly difficult. libxml2's error message for this case is "PCDATA invalid Char value 1" (assuming \x01). Mentioning PCDATA seems redundant, since no other context offers greater freedom. 
How about:

ereport(ERROR,
        (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
         errmsg("invalid XML 1.0 Char \\U%08x", char_val)));

That would also mean processing the string a Unicode char at a time. So maybe that's what we need to do. Thanks for the input. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
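For reference, the check being discussed is the XML 1.0 "Char" production: #x9, #xA, #xD, then [#x20-#xD7FF], [#xE000-#xFFFD], and [#x10000-#x10FFFF], which excludes most C0 controls, the surrogate range, and U+FFFE/U+FFFF. A Python sketch for illustration only (the backend would do this in C over its own string representation):

```python
import re

def is_valid_xml_char(cp):
    """XML 1.0 'Char' production: rejects most C0 controls, the
    surrogate range D800-DFFF, and the noncharacters FFFE/FFFF."""
    return (cp in (0x09, 0x0A, 0x0D)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_valid_xml_version(version):
    """XML 1.0 VersionNum: must match 1\\.[0-9]+ exactly,
    so ' ' and 'foo' are both rejected."""
    return re.fullmatch(r'1\.[0-9]+', version) is not None

def first_invalid_char(s):
    """Return the code point of the first invalid character, or None."""
    for ch in s:
        if not is_valid_xml_char(ord(ch)):
            return ord(ch)
    return None
```

The char-at-a-time scan in first_invalid_char is the extra pass the thread worries about; the range tests themselves are cheap, so the cost is dominated by decoding the string one character at a time.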
Re: [HACKERS] performance-test farm
Tomas Vondra t...@fuzzy.cz wrote: Dne 11.5.2011 23:41, Kevin Grittner napsal(a): Tomas Vondra t...@fuzzy.cz wrote: 1) Is there something that might serve as a model? I've been assuming that we would use the PostgreSQL Buildfarm as a model. http://buildfarm.postgresql.org/ Yes, I was thinking about that too, but 1) A buildfarm used for regular building / unit testing IMHO may not be the right place to do performance testing (not sure how isolated the benchmarks can be etc.). I'm not saying that we should use the existing buildfarm, or expect current buildfarm machines to support this; just that the pattern of people volunteering hardware in a similar way would be good. 2) Not sure how open this might be for the developers (if they could issue their own builds etc.). I haven't done it, but I understand that you can create a local buildfarm instance which isn't reporting its results. Again, something similar might be good. 3) If this should be part of the current buildfarm, then I'm afraid I can't do much about it. Not part of the current buildfarm; just using a similar overall pattern. Others may have different ideas; I'm just speaking for myself here about what seems like a good idea to me. 2) How would you use it? What procedure would you expect? People who had suitable test environments could sign up to periodically build and performance test using the predetermined test suite, and report results back for a consolidated status display. That would spot regressions. So it would be a 'distributed farm'? Not sure it that's a good idea, as to get reliable benchmark results you need a proper environment (not influenced by other jobs, changes of hw etc.). Yeah, accurate benchmarking is not easy. We would have to make sure people understood that the machine should be dedicated to the benchmark while it is running, which is not a requirement for the buildfarm. Maybe provide some way to annotate HW or OS changes? 
So if one machine goes to a new kernel and performance changes radically, but other machines which didn't change their kernel continue on a level graph, we'd know to suspect the kernel rather than some change in PostgreSQL code. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and PGPORT
Robert Haas wrote: On Wed, May 11, 2011 at 2:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: Or you could just unsetenv instead of complaining. +1 for that. OK, the attached patch does this, but allows PGCLIENTENCODING to be passed in. The new output looks like:

Performing Consistency Checks
-----------------------------
ignoring libpq environment variable PGPORT
Checking old data directory (/u/pgsql.old/data)     ok
Checking old bin directory (/u/pgsql.old/bin)       ok
Checking new data directory (/u/pgsql/data)         ok
Checking new bin directory (/u/pgsql/bin)           ok

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

diff --git a/contrib/pg_upgrade/server.c b/contrib/pg_upgrade/server.c
new file mode 100644
index 8fce305..bf30dcd
*** a/contrib/pg_upgrade/server.c
--- b/contrib/pg_upgrade/server.c
*************** check_for_libpq_envvars(void)
*** 254,260 ****
  {
  	PQconninfoOption *option;
  	PQconninfoOption *start;
- 	bool		found = false;

  	/* Get valid libpq env vars from the PQconndefaults function */
--- 254,259 ----
*************** check_for_libpq_envvars(void)
*** 264,290 ****
  	{
  		if (option->envvar)
  		{
! 			const char *value;

  			/* This allows us to see error messages in the local encoding */
  			if (strcmp(option->envvar, "PGCLIENTENCODING") == 0)
  				continue;
- 			value = getenv(option->envvar);

  			if (value && strlen(value) > 0)
  			{
! 				found = true;
! 				pg_log(PG_WARNING,
! 					   "libpq env var %-20s is currently set to: %s\n",
  					   option->envvar, value);
  			}
  		}
  	}

  	/* Free the memory that libpq allocated on our behalf */
  	PQconninfoFree(start);
-
- 	if (found)
- 		pg_log(PG_FATAL,
- 			   "libpq env vars have been found and listed above, please unset them for pg_upgrade\n");
  }
--- 263,287 ----
  	{
  		if (option->envvar)
  		{
! 			const char *value = getenv(option->envvar);

  			/* This allows us to see error messages in the local encoding */
  			if (strcmp(option->envvar, "PGCLIENTENCODING") == 0)
  				continue;

  			if (value && strlen(value) > 0)
  			{
! #ifndef WIN32
! 				unsetenv(option->envvar);
! #else
! 				SetEnvironmentVariableA(option->envvar, "");
! #endif
  				pg_log(PG_WARNING,
  					   "ignoring libpq environment variable %s\n",
  					   option->envvar);
  			}
  		}
  	}

  	/* Free the memory that libpq allocated on our behalf */
  	PQconninfoFree(start);
  }

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] performance-test farm
Dne 12.5.2011 00:21, Kevin Grittner napsal(a): Tomas Vondra t...@fuzzy.cz wrote: Dne 11.5.2011 23:41, Kevin Grittner napsal(a): Tomas Vondra t...@fuzzy.cz wrote: 1) Is there something that might serve as a model? I've been assuming that we would use the PostgreSQL Buildfarm as a model. http://buildfarm.postgresql.org/ Yes, I was thinking about that too, but 1) A buildfarm used for regular building / unit testing IMHO may not be the right place to do performance testing (not sure how isolated the benchmarks can be etc.). I'm not saying that we should use the existing buildfarm, or expect current buildfarm machines to support this; just that the pattern of people volunteering hardware in a similar way would be good. Good point. Actually I was not aware of how the buildfarm works, all I knew was there's something like that because some of the hackers mention a failed build on the mailing list occasionally. So I guess this is a good opportunity to investigate it a bit ;-) Anyway I'm not sure this would give us the kind of environment we need to do benchmarks ... but it's worth to think of. 2) Not sure how open this might be for the developers (if they could issue their own builds etc.). I haven't done it, but I understand that you can create a local buildfarm instance which isn't reporting its results. Again, something similar might be good. Well, yeah. So the developers would get a local 'copy' of all the benchmarks / workloads and could run them? 3) If this should be part of the current buildfarm, then I'm afraid I can't do much about it. Not part of the current buildfarm; just using a similar overall pattern. Others may have different ideas; I'm just speaking for myself here about what seems like a good idea to me. OK, got it. 2) How would you use it? What procedure would you expect? 
People who had suitable test environments could sign up to periodically build and performance test using the predetermined test suite, and report results back for a consolidated status display. That would spot regressions. So it would be a 'distributed farm'? Not sure it that's a good idea, as to get reliable benchmark results you need a proper environment (not influenced by other jobs, changes of hw etc.). Yeah, accurate benchmarking is not easy. We would have to make sure people understood that the machine should be dedicated to the benchmark while it is running, which is not a requirement for the buildfarm. Maybe provide some way to annotate HW or OS changes? So if one machine goes to a new kernel and performance changes radically, but other machines which didn't change their kernel continue on a level graph, we'd know to suspect the kernel rather than some change in PostgreSQL code. I guess we could run a script that collects all those important parameters and then detect changes. Anyway we still need some 'really stable' machines that are not changed at all, to get a long-term baseline. But I guess that could be done by running some dedicated machines ourselves. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
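The parameter-collection script Tomas mentions could start as small as fingerprinting the host before each run and diffing successive fingerprints, so kernel or hardware changes get annotated on the graphs automatically. A Python sketch; the choice of fields is an assumption, and a real farm would record far more (CPU model, RAM, filesystem, postgresql.conf):

```python
import platform

def environment_fingerprint():
    """Collect parameters whose changes we'd want annotated on the graphs."""
    u = platform.uname()
    return {'system': u.system, 'release': u.release, 'machine': u.machine}

def environment_changed(prev, curr):
    """Return the set of fingerprint fields that differ between two runs."""
    return {k for k in curr if prev.get(k) != curr[k]}

# A kernel upgrade between runs would show up as a change in 'release'
before = {'system': 'Linux', 'release': '2.6.32', 'machine': 'x86_64'}
after = {'system': 'Linux', 'release': '2.6.38', 'machine': 'x86_64'}
changed = environment_changed(before, after)
```

Runs whose fingerprint differs from the previous one would be flagged on the status display, while the dedicated 'really stable' baseline machines should show an empty diff for every run.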
Re: [HACKERS] performance-test farm
On 05/11/2011 06:21 PM, Kevin Grittner wrote: Tomas Vondra t...@fuzzy.cz wrote: Dne 11.5.2011 23:41, Kevin Grittner napsal(a): Tomas Vondra t...@fuzzy.cz wrote: First up, you guys should be aware that Greg Smith at least is working on this. Let's not duplicate effort. 1) Is there something that might serve as a model? I've been assuming that we would use the PostgreSQL Buildfarm as a model. http://buildfarm.postgresql.org/ Yes, I was thinking about that too, but 1) A buildfarm used for regular building / unit testing IMHO may not be the right place to do performance testing (not sure how isolated the benchmarks can be etc.). I'm not saying that we should use the existing buildfarm, or expect current buildfarm machines to support this; just that the pattern of people volunteering hardware in a similar way would be good. Some buildfarm members might well be suitable for it. I recently added support for running optional steps, and made the SCM module totally generic. Soon I'm hoping to provide for more radical extensibility by having addon modules, which will register themselves with the framework and then have their tests run. I'm currently working on an API for such modules. This was inspired by Mike Fowler's work on a module to test JDBC builds, which his buildfarm member is currently doing: See http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=piapiac&dt=2011-05-11%2000%3A00%3A02 for example. Obvious candidate modules might be other client libraries (e.g. perl DBD::Pg), non-committed patches, non-standard tests, and performance testing. 2) Not sure how open this might be for the developers (if they could issue their own builds etc.). I haven't done it, but I understand that you can create a local buildfarm instance which isn't reporting its results. Again, something similar might be good. You can certainly create a client that doesn't report its results (just run it in --test mode). 
And you can create your own private server (that's been done by at least two organizations I know of). But to test your own stuff, what we really need is a module to run non-committed patches, I think (see above). The buildfarm client does have a mode (--from-source) that lets you test your own stuff and doesn't report on it if you do, but I don't see that it would be useful here. 3) If this should be part of the current buildfarm, then I'm afraid I can't do much about it. Sure you can. Contribute to the efforts mentioned above. Not part of the current buildfarm; just using a similar overall pattern. Others may have different ideas; I'm just speaking for myself here about what seems like a good idea to me. The buildfarm server is a pretty generic reporting framework. Sure we can build another. But it seems a bit redundant. 2) How would you use it? What procedure would you expect? People who had suitable test environments could sign up to periodically build and performance test using the predetermined test suite, and report results back for a consolidated status display. That would spot regressions. So it would be a 'distributed farm'? Not sure if that's a good idea, as to get reliable benchmark results you need a proper environment (not influenced by other jobs, changes of hw etc.). You are not going to get a useful performance farm except in a distributed way. We don't own any labs, nor have we any way of assembling the dozens or hundreds of machines to represent the spectrum of platforms that we want tested in one spot. Knowing that we have suddenly caused a performance regression on, say, FreeBSD 8.1 running on AMD64, is a critical requirement. Yeah, accurate benchmarking is not easy. We would have to make sure people understood that the machine should be dedicated to the benchmark while it is running, which is not a requirement for the buildfarm. Maybe provide some way to annotate HW or OS changes? 
So if one machine goes to a new kernel and performance changes radically, but other machines which didn't change their kernel continue on a level graph, we'd know to suspect the kernel rather than some change in PostgreSQL code. Indeed, there are lots of moving pieces. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XML with invalid chars
On Wed, May 11, 2011 at 06:17:07PM -0400, Andrew Dunstan wrote: On 05/09/2011 11:25 PM, Noah Misch wrote: SELECT xmlcomment(E'\ufffe'); That's a bit harder. Do we want to extend these checks to cover surrogates and end of plane characters, which are the remaining forbidden chars? It certainly seems likely to be a somewhat slower test since I think we'd need to process the input strings a Unicode char at a time, but we do that in other places and it seems acceptable. What do people think? My thinking was that we should only make this flag day for xml-type users if we're going to fix it all the way. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Help: regarding patch development
Hello, Sir, I want to develop a service for PostgreSQL related to query processing, but I don't know how to develop it. Please guide me so that I can take the first step. I will be really thankful to you. Regards Emman
Re: [HACKERS] XML with invalid chars
On 05/11/2011 07:00 PM, Noah Misch wrote: On Wed, May 11, 2011 at 06:17:07PM -0400, Andrew Dunstan wrote: On 05/09/2011 11:25 PM, Noah Misch wrote: SELECT xmlcomment(E'\ufffe'); That's a bit harder. Do we want to extend these checks to cover surrogates and end of plane characters, which are the remaining forbidden chars? It certainly seems likely to be a somewhat slower test since I think we'd need to process the input strings a Unicode char at a time, but we do that in other places and it seems acceptable. What do people think? My thinking was that we should only make this flag day for xml-type users if we're going to fix it all the way. Fair enough. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] performance-test farm
* Andrew Dunstan (and...@dunslane.net) wrote: First up, you guys should be aware that Greg Smith at least is working on this. Let's not duplicate effort. Indeed. I'm also interested in making this happen and have worked with Greg in the past on it. There's even some code out there that we developed to add it on to the buildfarm, though that needs to be reworked to fit with Andrew's latest changes (which are all good changes). We need a bit of hardware, but more, we need someone to clean up the code, get it all integrated, and make it all work and report useful information. My feeling is if you build it, they will come with regard to the hardware/performance machines. Thanks, Stephen signature.asc Description: Digital signature