[HACKERS] Buffer statistics for pg_stat_statements
We have infrastructure to count numbers buffer access in 8.5 Alpha 3. I'd like to add per-query buffer usage into contrib/pg_stat_statements. The pg_stat_statements view will have the same contents with struct BufferUsage. Fields named shared_blks_{hit|read|written}, local_blks_{hit|read|written}, and temp_blks_{read|written} will be added to the view. We can determine slow queries not only based on durations but also based on I/O or memory access count. Also, queries with non-zero temp_blks_read means DBA need to consider increasing work_mem. Those information would be useful to find where the server's bottleneck is. Additional management costs cannot be avoided, but I think it should be not so high because we accumulate buffer usage only once per query, while EXPLAIN BUFFERS is slow because we need per-tuple calculation. I'll submit this pg_stat_statements enhancement to the next commit fest. Comments welcome. Regards, --- Takahiro Itagaki NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
(2009/12/18 15:48), Takahiro Itagaki wrote: Robert Haasrobertmh...@gmail.com wrote: In both cases, I'm lost. Help? They might be contrasted with the comments for myLargeObjectExists. Since we use MVCC visibility in loread(), metadata for large object also should be visible in MVCC rule. If I understand them, they say: * pg_largeobject_aclmask_snapshot requires a snapshot which will be used in loread(). * Don't use LargeObjectExists if you need MVCC visibility. Yes, correct. In acldefault(), there is this comment: /* Grant SELECT,UPDATE by default, for now */ This doesn't seem to match what the code is doing, so I think we should remove it. Ah, ACL_NO_RIGHTS is the default. Oops, it reflects very early phase design, but fixed later. I also notice that dumpBlobComments() is now misnamed, but it seems we've chosen to add a comment mentioning that fact rather than fixing it. Hmmm, now it dumps not only comments but also ownership of large objects. Should we rename it dumpBlobMetadata() or so? It seems to me quite natural. The attached patch fixes them. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei kai...@ak.jp.nec.com *** base/src/backend/utils/adt/acl.c (revision 2503) --- base/src/backend/utils/adt/acl.c (working copy) *** *** 765,771 owner_default = ACL_ALL_RIGHTS_LANGUAGE; break; case ACL_OBJECT_LARGEOBJECT: - /* Grant SELECT,UPDATE by default, for now */ world_default = ACL_NO_RIGHTS; owner_default = ACL_ALL_RIGHTS_LARGEOBJECT; break; --- 765,770 *** base/src/bin/pg_dump/pg_dump.h (revision 2503) --- base/src/bin/pg_dump/pg_dump.h (working copy) *** *** 116,122 DO_FOREIGN_SERVER, DO_DEFAULT_ACL, DO_BLOBS, ! DO_BLOB_COMMENTS } DumpableObjectType; typedef struct _dumpableObject --- 116,122 DO_FOREIGN_SERVER, DO_DEFAULT_ACL, DO_BLOBS, ! DO_BLOB_METADATA } DumpableObjectType; typedef struct _dumpableObject *** base/src/bin/pg_dump/pg_dump_sort.c (revision 2503) --- base/src/bin/pg_dump/pg_dump_sort.c (working copy) *** *** 56,62 4, /* DO_FOREIGN_SERVER */ 17, /* DO_DEFAULT_ACL */ 10, /* DO_BLOBS */ ! 11 /* DO_BLOB_COMMENTS */ }; /* --- 56,62 4, /* DO_FOREIGN_SERVER */ 17, /* DO_DEFAULT_ACL */ 10, /* DO_BLOBS */ ! 11 /* DO_BLOB_METADATA */ }; /* *** *** 93,99 15, /* DO_FOREIGN_SERVER */ 27, /* DO_DEFAULT_ACL */ 20, /* DO_BLOBS */ ! 21 /* DO_BLOB_COMMENTS */ }; --- 93,99 15, /* DO_FOREIGN_SERVER */ 27, /* DO_DEFAULT_ACL */ 20, /* DO_BLOBS */ ! 21 /* DO_BLOB_METADATA */ }; *** *** 1151,1159 BLOBS (ID %d), obj-dumpId); return; ! case DO_BLOB_COMMENTS: snprintf(buf, bufsize, ! BLOB COMMENTS (ID %d), obj-dumpId); return; } --- 1151,1159 BLOBS (ID %d), obj-dumpId); return; ! case DO_BLOB_METADATA: snprintf(buf, bufsize, ! BLOB METADATA (ID %d), obj-dumpId); return; } *** base/src/bin/pg_dump/pg_dump.c (revision 2503) --- base/src/bin/pg_dump/pg_dump.c (working copy) *** *** 191,197 static const char *fmtQualifiedId(const char *schema, const char *id); static bool hasBlobs(Archive *AH); static int dumpBlobs(Archive *AH, void *arg); ! static int dumpBlobComments(Archive *AH, void *arg); static void dumpDatabase(Archive *AH); static void dumpEncoding(Archive *AH); static void dumpStdStrings(Archive *AH); --- 191,197 static const char *fmtQualifiedId(const char *schema, const char *id); static bool hasBlobs(Archive *AH); static int dumpBlobs(Archive *AH, void *arg); ! static int dumpBlobMetadata(Archive *AH, void *arg); static void dumpDatabase(Archive *AH); static void dumpEncoding(Archive *AH); static void dumpStdStrings(Archive *AH); *** *** 707,716 blobobj-name = strdup(BLOBS); blobcobj = (DumpableObject *) malloc(sizeof(DumpableObject)); ! blobcobj-objType = DO_BLOB_COMMENTS; blobcobj-catId = nilCatalogId; AssignDumpId(blobcobj); ! blobcobj-name = strdup(BLOB COMMENTS); addObjectDependency(blobcobj, blobobj-dumpId); } --- 707,716 blobobj-name = strdup(BLOBS); blobcobj = (DumpableObject *) malloc(sizeof(DumpableObject)); ! blobcobj-objType = DO_BLOB_METADATA; blobcobj-catId = nilCatalogId; AssignDumpId(blobcobj); ! blobcobj-name = strdup(BLOB METADATA); addObjectDependency(blobcobj, blobobj-dumpId); } *** *** 2048,2064 } /* ! * dumpBlobComments ! * dump all blob properties. ! * It has BLOB COMMENTS tag due to the historical reason, but note ! * that it is the routine to dump all the properties of blobs. * * Since we don't provide any way to
Re: [HACKERS] Buffer statistics for pg_stat_statements
Le vendredi 18 décembre 2009 09:44:40, Takahiro Itagaki a écrit : We have infrastructure to count numbers buffer access in 8.5 Alpha 3. I'd like to add per-query buffer usage into contrib/pg_stat_statements. The pg_stat_statements view will have the same contents with struct BufferUsage. Fields named shared_blks_{hit|read|written}, local_blks_{hit|read|written}, and temp_blks_{read|written} will be added to the view. We can determine slow queries not only based on durations but also based on I/O or memory access count. Also, queries with non-zero temp_blks_read means DBA need to consider increasing work_mem. Those information would be useful to find where the server's bottleneck is. Additional management costs cannot be avoided, but I think it should be not so high because we accumulate buffer usage only once per query, while EXPLAIN BUFFERS is slow because we need per-tuple calculation. I'll submit this pg_stat_statements enhancement to the next commit fest. Comments welcome. Very good idea. Perhaps it can be usefull to have the percentage for hit/read ratio computed in the view ? Regards, --- Takahiro Itagaki NTT Open Source Software Center -- Cédric Villemain Administrateur de Base de Données Cel: +33 (0)6 74 15 56 53 http://dalibo.com - http://dalibo.org signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] An example of bugs for Hot Standby
This way we only cancel direct deadlocks. It doesn't solve general problem of buffer waits, but they may be solvable by different mechanism. Following question may be redundant. Just a confirmation. Deadlock example is catstrophic while it's rather a rare event. On the other hand, LockBufferForCleanup() can cause another problem. * One idle pin-holder backend can freeze startup process(). This problem is not catstrophic, but it seems a similar problem which StandbyAcquireAccessExclusiveLock() tries to avoid. ...Is this the problem you call general problem above ? regards, -- Hiroyuki YAMADA Kokolink Corporation yam...@kokolink.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On ons, 2009-12-16 at 11:28 -0800, David E. Wheeler wrote: I just realized that this was easy to do, and despite my complete lack of C skillz was able to throw this together in a couple of hours. It might be handy to some, though the possible downsides are: * No json_to_hstore(). * Leads to requests for hstore_to_yaml(), hstore_to_xml(), etc. * Andrew Gierth said “no” when I suggested it. But it's kind of handy, too. Thoughts? Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast or immediate shutdown
On ons, 2009-12-16 at 15:42 +, Simon Riggs wrote: I suggest we say smoothed when checkpoint option is !immediate. So we will remove the word immediate from the message. The opposite of smooth could be sharp. :) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY IN as SELECT target
Pavel Stehule wrote: 2009/12/17 Andrew Dunstan and...@dunslane.net: Recently there was discussion about allowing a COPY statement to be a SELECT target, returning a text array, although the syntax wasn't really nailed down that I recall. I was thinking that we might have COPY RETURNING ARRAY FROM ... instead of COPY tablename opt_column_list FROM ... the we possibly could do things like: SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY RETURNING ARRAY FROM STDIN CSV) as t; Thoughts? In this case copy doesn't return array - so RETURNING ARRAY is little bit strange. what SELECT .. FROM (COPY VALUES [(colums)] FROM ) You are misunderstanding what I want to provide, which is that it *does* return an array of text for each line. That was what the previous discussion arrived at, and is illustrated in the example I showed above. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
2009/12/18 KaiGai Kohei kai...@ak.jp.nec.com: (2009/12/18 15:48), Takahiro Itagaki wrote: Robert Haasrobertmh...@gmail.com wrote: In both cases, I'm lost. Help? They might be contrasted with the comments for myLargeObjectExists. Since we use MVCC visibility in loread(), metadata for large object also should be visible in MVCC rule. If I understand them, they say: * pg_largeobject_aclmask_snapshot requires a snapshot which will be used in loread(). * Don't use LargeObjectExists if you need MVCC visibility. Yes, correct. In acldefault(), there is this comment: /* Grant SELECT,UPDATE by default, for now */ This doesn't seem to match what the code is doing, so I think we should remove it. Ah, ACL_NO_RIGHTS is the default. Oops, it reflects very early phase design, but fixed later. I also notice that dumpBlobComments() is now misnamed, but it seems we've chosen to add a comment mentioning that fact rather than fixing it. Hmmm, now it dumps not only comments but also ownership of large objects. Should we rename it dumpBlobMetadata() or so? It seems to me quite natural. The attached patch fixes them. I think we might want to go with dumpBlobProperties(), because dumpBlobMetadata() might lead you to think that all of the properties being dumped are stored in pg_largeobject_metadata, which is not the case. I do also wonder why we are calling these blobs in this code rather than large objects, but that problem predates this patch and I think we might as well leave it alone for now. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on true serializable techniques in MVCC
* Florian Pflug: On 16.12.09 16:40 , Kevin Grittner wrote: Nicolas Barbiernicolas.barb...@gmail.com wrote: I am not sure whether the serialization failures that it may cause are dependent on the plan used. They are. But so are failures due to deadlocks even today, no? They are detected. In this context, serialization failure means that PostgreSQL generates a history which lacks one-copy serializability, without reporting any errors. (In the general case, the unique constraint violation which bugs me personally is a different beast and does result in an error.) -- Florian Weimerfwei...@bfk.de BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
Fujii Masao wrote: pg_stop_backup deletes the previous backup history file from pg_xlog. So replication of a backup history file would fail if just one new online-backup is caused after the base-backup for the standby is taken. This is too aggressive deletion policy for Streaming Replication, I think. So I'd like to change pg_stop_backup so as to delete only backup history files of four or more generations ago (four is enough?). This is essentially the same problem we have with WAL files and checkpoints. If the standby falls behind too much, without having on open connection to the master all the time, the master will delete old files that are still needed in the standby. I don't think it's worthwhile to modify pg_stop_backup() like that. We should address the general problem. At the moment, you're fine if you also configure WAL archiving and log file shipping, but it would be nice to have some simpler mechanism to avoid the problem. For example, a GUC in master to retain all log files (including backup history files) for X days. Or some way for to register the standby with the master so that the master knows it's out there, and still needs the logs, even when it's not connected. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
On Fri, Dec 18, 2009 at 11:03 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Or some way for to register the standby with the master so that the master knows it's out there, and still needs the logs, even when it's not connected. That is the real answer, I think. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote: Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text? json_data_type++ D -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on true serializable techniques in MVCC
2009/12/18 Florian Weimer fwei...@bfk.de: * Florian Pflug: On 16.12.09 16:40 , Kevin Grittner wrote: Nicolas Barbiernicolas.barb...@gmail.com wrote: I am not sure whether the serialization failures that it may cause are dependent on the plan used. They are. But so are failures due to deadlocks even today, no? They are detected. In this context, serialization failure means that PostgreSQL generates a history which lacks one-copy serializability, without reporting any errors. (In the general case, the unique constraint violation which bugs me personally is a different beast and does result in an error.) FYI (hoping to avoid confusion): When I used the term serialization failure above, I surely meant the kind of failures that would be detected by the new optimistic algorithm. I would guess that currently, whether deadlocks can be triggered by a set of transactions (i.e., sequences of SQL statements) depends on the plan only marginally*. E.g., if two plans happen to use the same index, rows may always get locked in the same order by FOR UPDATE, thus preventing certain deadlocks; if the plans were those deadlocks might become possible. Therefore, I don't think that it is currently very typical for plan-changes to trigger a massive change in the number of deadlocks that happen. The new method might change that property. This instability problem is often seen on DBMSs that use 2PL on blocks read or rows inspected as their main concurrency control mechanism (e.g., MS-SQL). It is mostly not seen on DBMSs that use MVCC (because no locks are taken that depend on the peculiarities of the plan; see caveat above at [*]), and would also not occur when one would use the most literal implementation of predicate locking (because the locks taken only depend on the SQL statements' conditions and not on the plan). Nicolas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Distinguish view and table problem
Hi! I just want to distinguish a view and a table while postgres execute exec_simple_query(). In the systable of pg_class, a view and a table has different relkind ('r' 'v'). But when I print the parsetree and the rewrite parsetree, I found that a view and a table has no character to distinguish because the structure Relation has no attribute called relkind. Maybe I should read systable to confirm that we are select from a view or table? But there's still has problem. How could I get the relkind of a table(view) by its name from pg_class? Another question is that does postgres save the user's original query_string at anywhere(systable etc.)? If I want to save the sourceText in the systable, I could add a column to pg_class called query_string. How could I insert a line to pg_class or read a line from it? Thank you very much! 2009-12-18 suzhiyang
Re: [HACKERS] Distinguish view and table problem
2009/12/18 suzhiyang suzhiy...@gmail.com How could I get the relkind of a table(view) by its name from pg_class? pg_class is (quite logically) UNIQUE on (relname, relnamespace) SELECT c.relkind from pg_class c, pg_namespace n where c.relnamespace = n.oid and c.relname = 'thetable' and n.nspname = 'theschema' -- Filip Rembiałkowski JID,mailto:filip.rembialkow...@gmail.com http://filip.rembialkowski.net/
Re: [HACKERS] PATCH: Add hstore_to_json()
On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler da...@kineticode.com wrote: On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote: Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text? json_data_type++ What would that do for us? I'm not opposed to it, but it seems like the more important thing would be to provide functions or operators that can do things like extract an array, extract a hash key, identify whether something is a hash, list, or scalar, etc. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on true serializable techniques in MVCC
On 18.12.09 16:42 , Florian Weimer wrote: * Florian Pflug: On 16.12.09 16:40 , Kevin Grittner wrote: Nicolas Barbiernicolas.barb...@gmail.com wrote: I am not sure whether the serialization failures that it may cause are dependent on the plan used. They are. But so are failures due to deadlocks even today, no? They are detected. In this context, serialization failure means that PostgreSQL generates a history which lacks one-copy serializability, without reporting any errors. No, the whole point of this SIREAD-lock technique is to prevent that once and for all, and make SERIALIZABLE transaction really serializable (or fail with a serialization error). (In the general case, the unique constraint violation which bugs me personally is a different beast and does result in an error.) I'm not sure I understand what you are referring to here. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On Dec 18, 2009, at 8:51 AM, Robert Haas wrote: What would that do for us? I'm not opposed to it, but it seems like the more important thing would be to provide functions or operators that can do things like extract an array, extract a hash key, identify whether something is a hash, list, or scalar, etc. Such things would be included with such a data type, no? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] question about implementing XA-ish functions
I'm trying to implement a function that has some XA like properties. Is it possible to write a postgres extension function that fires when called within a pg transaction... however, the actions it takes need to be later committed or rolled back based on the containing transactions commital or not. Not having looked to deeply into this, I'm wondering if this is possible. Naively, my first hookpoint would be something like: allocate something in the transactions memory context and register a cleanup do my work. when the transaction memory context is cleaned up, my cleanup handler fires, I detect whether the txn was committed or rolledback and rightly mark my work as committed or rolled back. Thoughts? -- Theo Schlossnagle http://omniti.com/is/theo-schlossnagle p: +1.443.325.1357 x201 f: +1.410.872.4911 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Time to run initdb is mostly figure-out-the-timezone work
On current Fedora 11, there is a huge difference in initdb time if you have TZ set versus if you don't: I get about 18 seconds versus less than four. $ time initdb ... blah blah blah ... real0m17.953s user0m6.490s sys 0m10.935s $ rm -rf $PGDATA $ export TZ=GMT $ time initdb ... blah blah blah ... real0m3.767s user0m2.997s sys 0m0.784s $ The reason for this is that initdb launches the postmaster many times (at least 14) and each one of those launches results in a search of every file in the timezone database, if we don't have a TZ value to let us identify the timezone immediately. Now this hardly matters to end users who seldom do initdb, but from a developer's perspective it would be awfully nice if initdb took less time. If other people can reproduce similar behavior, I think it would be worth the trouble to have initdb forcibly set the TZ or PGTZ variable while it runs. AFAIK it does not matter what timezone environment postgres sees during initdb; we don't put that into the config file. It'd be about a one-line addition ... Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on true serializable techniques in MVCC
On 18.12.09 17:33 , Nicolas Barbier wrote: I would guess that currently, whether deadlocks can be triggered by a set of transactions (i.e., sequences of SQL statements) depends on the plan only marginally*. E.g., if two plans happen to use the same index, rows may always get locked in the same order by FOR UPDATE, thus preventing certain deadlocks; if the plans were those deadlocks might become possible. Therefore, I don't think that it is currently very typical for plan-changes to trigger a massive change in the number of deadlocks that happen. The new method might change that property. Hm, I think that's true if you assume that most application issue pretty complex SELECT statements but only either pretty simple UPDATEs/DELETEs, or complex ones but only seldomly. Once you start hitting a table with a lot of concurrent UPDATEs/DELETES involving joins and subselects, the picture changes considerably I think. I must admit, however, that it's hard to imagine a real application that actually does this, though... But actually, now that I think about this, I fail to see why the false-positive serialization error the SIREAD-lock approach generates would depend on the plan. The existance or non-existance of rw dependencies does *not* depend on whether the read or write *physically* happens first, only on their logical ordering (T1 read an item that T2 changed, but T2 did not commit before T1 took it's snapshot). Plus, the way I read the thesis, the false positives of the SIREAD-lock approach has nothing to do with the SIREAD locks per se. They are introduced by the approximate way in which circles contained in the transaction's dependency graph are detected (the inConflict, outConflict business). best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work
Le 18/12/2009 18:07, Tom Lane a écrit : On current Fedora 11, there is a huge difference in initdb time if you have TZ set versus if you don't: I get about 18 seconds versus less than four. $ time initdb ... blah blah blah ... real0m17.953s user0m6.490s sys 0m10.935s $ rm -rf $PGDATA $ export TZ=GMT $ time initdb ... blah blah blah ... real0m3.767s user0m2.997s sys 0m0.784s $ The reason for this is that initdb launches the postmaster many times (at least 14) and each one of those launches results in a search of every file in the timezone database, if we don't have a TZ value to let us identify the timezone immediately. Now this hardly matters to end users who seldom do initdb, but from a developer's perspective it would be awfully nice if initdb took less time. If other people can reproduce similar behavior, I think it would be worth the trouble to have initdb forcibly set the TZ or PGTZ variable while it runs. I have the exact same issue: guilla...@laptop:~$ time initdb Les fichiers de ce cluster appartiendront à l'utilisateur « guillaume ». [...] real0m7.972s user0m3.588s sys 0m3.444s guilla...@laptop:~$ export TZ=GMT guilla...@laptop:~$ rm -rf t1 guilla...@laptop:~$ time initdb [...] real0m1.828s user0m1.436s sys 0m0.368s This is on Ubuntu 9.10. Quite impressive. I think I'll add an alias (alias initdb=TZ=GMT initdb). -- Guillaume. http://www.postgresqlfr.org http://dalibo.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
On 18.12.09 17:05 , Robert Haas wrote: On Fri, Dec 18, 2009 at 11:03 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Or some way for to register the standby with the master so that the master knows it's out there, and still needs the logs, even when it's not connected. That is the real answer, I think. It'd prefer if the slave could automatically fetch a new base backup if it falls behind too far to catch up with the available logs. That way, old logs don't start piling up on the server if a slave goes offline for a long time. The slave could for example run a configurable shell script in that case, for example. You could then use that to rsync the data directory from the server (after a pg_start_backup() of course). best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
On Fri, Dec 18, 2009 at 12:22 PM, Florian Pflug fgp.phlo@gmail.com wrote: On 18.12.09 17:05 , Robert Haas wrote: On Fri, Dec 18, 2009 at 11:03 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Or some way for to register the standby with the master so that the master knows it's out there, and still needs the logs, even when it's not connected. That is the real answer, I think. It'd prefer if the slave could automatically fetch a new base backup if it falls behind too far to catch up with the available logs. That way, old logs don't start piling up on the server if a slave goes offline for a long time. The slave could for example run a configurable shell script in that case, for example. You could then use that to rsync the data directory from the server (after a pg_start_backup() of course). That would be nice to have too, but it's almost certainly much harder to implement. In particular, there's no hard and fast rule for figuring out when you've dropped so far behind that resnapping the whole thing is faster than replaying the WAL bit by bit. And, of course, you'll have to take the standby down if you go that route, whereas trying to catch up the WAL lets it stay up throughout the process. I think (as I did/do with Hot Standby) that the most important thing here is to get to a point where we have a reasonably good feature that is of some use, and commit it. It will probably have some annoying limitations; we can remove those later. I have a feel that what we have right now is going to be non-robust in the face of network breaks, but that is a problem that can be fixed by a future patch. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Distinguish view and table problem
On Fri, Dec 18, 2009 at 10:57 AM, suzhiyang suzhiy...@gmail.com wrote: Another question is that does postgres save the user's original query_string at anywhere(systable etc.)? If I want to save the sourceText in the systable, I could add a column to pg_class called query_string. How could I insert a line to pg_class or read a line from it? pg_class wouldn't make much sense for this. But you might be interested in pg_stat_activity. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work
On Fri, Dec 18, 2009 at 06:20:39PM +0100, Guillaume Lelarge wrote: Le 18/12/2009 18:07, Tom Lane a écrit : On current Fedora 11, there is a huge difference in initdb time if you have TZ set versus if you don't: I get about 18 seconds versus less than four. I have the exact same issue: For whatever it's worth, I get it too, on Ubuntu 9.04... ~4s without TZ vs. ~1.8s with TZ. -- Joshua Tolley / eggyknap End Point Corporation http://www.endpoint.com signature.asc Description: Digital signature
Re: [HACKERS] Distinguish view and table problem
suzhiyang suzhiy...@gmail.com writes: I just want to distinguish a view and a table while postgres execute exec_simple_query(). In the systable of pg_class, a view and a table has different relkind ('r' 'v'). But when I print the parsetree and the rewrite parsetree, I found that a view and a table has no character to distinguish because the structure Relation has no attribute called relkind. See rel-rd_rel-relkind ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work
Tom Lane wrote: On current Fedora 11, there is a huge difference in initdb time if you have TZ set versus if you don't: I get about 18 seconds versus less than four. Wow, I can reproduce this (11-12 secs when no TZ versus 5 when TZ is defined). I'd never noticed because I normally have TZ set; but yes I agree that this is worthwhile. I notice that most of the difference is system time ... I imagine we do a lot of syscalls to guess the timezone. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on true serializable techniques in MVCC
Florian Weimer fwei...@bfk.de wrote: * Florian Pflug: On 16.12.09 16:40 , Kevin Grittner wrote: Nicolas Barbiernicolas.barb...@gmail.com wrote: I am not sure whether the serialization failures that it may cause are dependent on the plan used. They are. But so are failures due to deadlocks even today, no? They are detected. In this context, serialization failure means that PostgreSQL generates a history which lacks one-copy serializability, without reporting any errors. (In the general case, the unique constraint violation which bugs me personally is a different beast and does result in an error.) I don't understand what you're saying here. Could you rephrase or expand on this? Thanks, -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work
Joshua Tolley eggyk...@gmail.com writes: On Fri, Dec 18, 2009 at 06:20:39PM +0100, Guillaume Lelarge wrote: Le 18/12/2009 18:07, Tom Lane a écrit : On current Fedora 11, there is a huge difference in initdb time if you have TZ set versus if you don't: I get about 18 seconds versus less than four. I have the exact same issue: For whatever it's worth, I get it too, on Ubuntu 9.04... ~4s without TZ vs. ~1.8s with TZ. BTW, I just realized that it makes a difference that I customarily use the configure option --with-system-tzdata=/usr/share/zoneinfo on that machine. I do it mainly because it saves a few seconds during make install, but also because Red Hat's PG packages use that option so I want to test it regularly. The impact of this is that the TZ search also has to scan through a bunch of leap-second-aware timezone files, which are not present in a default PG build's timezone tree. So that probably explains why I see a 4x slowdown while you get more like 2x. Still, it seems worth doing something about, if it's as easy as a one-line addition. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work
Alvaro Herrera alvhe...@commandprompt.com writes: I notice that most of the difference is system time ... I imagine we do a lot of syscalls to guess the timezone. Yeah, it seems to be mostly the cost of searching the timezone directory tree and reading all those small files. I was led to notice this because Red Hat's latest devel kernels seem to have a bit of a performance regression in this area: https://bugzilla.redhat.com/show_bug.cgi?id=548403 Obviously there's something there for the kernel guys to fix, but even with a non-borked kernel it's an expensive thing to do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
Robert Haas wrote: On Fri, Dec 18, 2009 at 12:22 PM, Florian Pflug fgp.phlo@gmail.com wrote: On 18.12.09 17:05 , Robert Haas wrote: On Fri, Dec 18, 2009 at 11:03 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Or some way for to register the standby with the master so that the master knows it's out there, and still needs the logs, even when it's not connected. That is the real answer, I think. It'd prefer if the slave could automatically fetch a new base backup if it falls behind too far to catch up with the available logs. That way, old logs don't start piling up on the server if a slave goes offline for a long time. The slave could for example run a configurable shell script in that case, for example. You could then use that to rsync the data directory from the server (after a pg_start_backup() of course). That would be nice to have too, Yeah, for small databases, it's probably a better tradeoff. The problem with keeping WAL around in the master indefinitely is that you will eventually run out of disk space if the standby disappears for too long. but it's almost certainly much harder to implement. In particular, there's no hard and fast rule for figuring out when you've dropped so far behind that resnapping the whole thing is faster than replaying the WAL bit by bit. I'd imagine that you take a new base backup only if you have to, ie. the old WAL files the slave needs have already been deleted from the master. And, of course, you'll have to take the standby down if you go that route, whereas trying to catch up the WAL lets it stay up throughout the process. Good point. I think (as I did/do with Hot Standby) that the most important thing here is to get to a point where we have a reasonably good feature that is of some use, and commit it. It will probably have some annoying limitations; we can remove those later. I have a feel that what we have right now is going to be non-robust in the face of network breaks, but that is a problem that can be fixed by a future patch. Agreed. About a year ago, I was vocal about not relying on the file based shipping, but I don't have a problem with relying on it as an intermediate step, until we add the other options. It's robust as it is, if you set up WAL archiving. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] question about implementing XA-ish functions
Theo Schlossnagle wrote: I'm trying to implement a function that has some XA like properties. Is it possible to write a postgres extension function that fires when called within a pg transaction... however, the actions it takes need to be later committed or rolled back based on the containing transactions commital or not. Not having looked to deeply into this, I'm wondering if this is possible. Naively, my first hookpoint would be something like: allocate something in the transactions memory context and register a cleanup do my work. when the transaction memory context is cleaned up, my cleanup handler fires, I detect whether the txn was committed or rolledback and rightly mark my work as committed or rolled back. See RegisterXactCallback(). And then there's the ResourceOwners, that you can use to register custom resources for cleanup. Of course, you'll never be able to make it atomic without 2PC. The callbacks are executed very soon after after the commit record has been flushed to disk, so the window is small but it's there. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Robert Haas wrote: On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler da...@kineticode.com wrote: On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote: Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text? json_data_type++ What would that do for us? I'm not opposed to it, but it seems like the more important thing would be to provide functions or operators that can do things like extract an array, extract a hash key, identify whether something is a hash, list, or scalar, etc. In principle it's not a bad idea to have a JSON type for several reasons. First, it's a better match than hstore for serializing an arbitrary tuple, because unlike hstore it can have nested arrays and composites, just as tuples can. Second, it might well be very useful if we could easily return results as JSON to AJAX applications, which are increasingly becoming the norm. And similarly we might be able to reduce application load if Postgres could perform operations on JSON, rather than having to return it all to the client to process. I think it would be useful if someone produced a JSON module as, say, a pgFoundry project, to start with, and we would then be better able to assess its usefulness. An interesting question would be how one might sanely index such things. You're correct that we don't necessarily need a new type, we could just make it text and have a bunch of operations, but that seems to violate the principle of data type abstraction a bit. If the operations can be sure that the object is valid JSON they could skip a bunch of sanity checks that they would otherwise need to do if just handed an arbitrary piece of text. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Closing out CommitFest 2009-11
On Tue, Dec 15, 2009 Greg Smith wrote: Sounds like we just are waiting for Simon to finish up, which is expected to happen by tomorrow, and for Tom to wrap up working on the ProcessUtility_hook. That makes the first reasonable date to consider alpha3 packaging Thursday 12/17 I think. Update: I just nagged Simon about this some more, as I know everyone is waiting for this and he's too deep in the code to be real talkative. It sounds like triage around the issues raised in the An example of bugs for Hot Standby thread is finished at this point, and he's still hopeful to get this wrapped up and committed this week--which given this is Friday means real soon now. -- Greg Smith2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com
Re: [HACKERS] question about implementing XA-ish functions
This is perfect. It fires on both commit and rollback? And I can determine which? The system I'm interfacing with has 2PC so it should be a pretty tight fit. Thanks a ton Heikki! -- Theo Schlossnagle (mobile) http://omniti.com/is/theo-schlossnagle On Dec 18, 2009, at 10:34 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Theo Schlossnagle wrote: I'm trying to implement a function that has some XA like properties. Is it possible to write a postgres extension function that fires when called within a pg transaction... however, the actions it takes need to be later committed or rolled back based on the containing transactions commital or not. Not having looked to deeply into this, I'm wondering if this is possible. Naively, my first hookpoint would be something like: allocate something in the transactions memory context and register a cleanup do my work. when the transaction memory context is cleaned up, my cleanup handler fires, I detect whether the txn was committed or rolledback and rightly mark my work as committed or rolled back. See RegisterXactCallback(). And then there's the ResourceOwners, that you can use to register custom resources for cleanup. Of course, you'll never be able to make it atomic without 2PC. The callbacks are executed very soon after after the commit record has been flushed to disk, so the window is small but it's there. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Andrew Dunstan and...@dunslane.net writes: You're correct that we don't necessarily need a new type, we could just make it text and have a bunch of operations, but that seems to violate the principle of data type abstraction a bit. I think the relevant precedent is that we have an xml type. While I surely don't want to follow the SQL committee's precedent of inventing a ton of special syntax for xml support, it might be useful to look at that for suggestions of what functionality would be useful for a json type. [ I can already hear somebody insisting on a yaml type :-( ] regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Removing pg_migrator limitations
There are several pg_migrator limitations that appeared late in the 8.4 development cycle and were impossible to fix at that point. I would like to fix them for Postgres 8.5: o a user-defined composite data type o a user-defined array data type o a user-defined enum data type I have discussed this with Alvaro. I think pg_migrator needs the ability to set the pg_type.oid and pg_enum.oid for user-defined composites, arrays, and enums to match the values in the old server, and hence match references to those rows in user data tables. The general solution will involve creating place-hold rows in pg_type and pg_enum with the desired oids, and deleting those placeholder rows at the time pg_dump creates the new type or enum, and passing the desired oid to the creation command. We do something similar for toast tables now, but it is easier there because the oids are actually file system files. There is no ability to specify an OID column value on insert. However, pg_migrator has the ability to call backend C functions via shared library functions so it could potentially insert the needed system catalog dummy rows. As far as creating rows with the proper oids, we could modify the SQL grammar to allow it, or modify DefineType() so it accepts oids and passes them to TypeCreate(), or a simpler approach would be to set the oid counter before calling CREATE TYPE, but that would be error-prone because other oids might be assigned in indeterminate order --- we removed that code from pg_migrator for toast tables before 8.4 shipped, so I am not excited to re-add it. The same approach is necessary for enums. Another approach could be to create the dummy rows, load all of the pg_dump schema, then renumber the rows to the proper oids, but this assumes that I will be able to find all references to the current oids and renumber those too. Seems I need some help here. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: You're correct that we don't necessarily need a new type, we could just make it text and have a bunch of operations, but that seems to violate the principle of data type abstraction a bit. I think the relevant precedent is that we have an xml type. While I surely don't want to follow the SQL committee's precedent of inventing a ton of special syntax for xml support, it might be useful to look at that for suggestions of what functionality would be useful for a json type. [ I can already hear somebody insisting on a yaml type :-( ] Now that's a case where I think a couple of converter functions at most should meet the need. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Andrew Dunstan wrote: Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: You're correct that we don't necessarily need a new type, we could just make it text and have a bunch of operations, but that seems to violate the principle of data type abstraction a bit. I think the relevant precedent is that we have an xml type. While I surely don't want to follow the SQL committee's precedent of inventing a ton of special syntax for xml support, it might be useful to look at that for suggestions of what functionality would be useful for a json type. [ I can already hear somebody insisting on a yaml type :-( ] Now that's a case where I think a couple of converter functions at most should meet the need. I can see this feature getting web developers more excited about Postgres. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Andrew Dunstan and...@dunslane.net writes: Tom Lane wrote: [ I can already hear somebody insisting on a yaml type :-( ] Now that's a case where I think a couple of converter functions at most should meet the need. Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json - xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote: There are several pg_migrator limitations that appeared late in the 8.4 development cycle and were impossible to fix at that point. I would like to fix them for Postgres 8.5: o a user-defined composite data type o a user-defined array data type o a user-defined enum data type I have discussed this with Alvaro. I think pg_migrator needs the ability to set the pg_type.oid and pg_enum.oid for user-defined composites, arrays, and enums to match the values in the old server, and hence match references to those rows in user data tables. To be more precise, the pg_enum.oid needs to be set for ENUM types; there's no need for setting the pg_type.oid (for ENUM types). I don't know about composites but I think the problem with user defined arrays is the OID of the element type, not the array itself. The general solution will involve creating place-hold rows in pg_type and pg_enum with the desired oids, and deleting those placeholder rows at the time pg_dump creates the new type or enum, and passing the desired oid to the creation command. I don't think there's a need for pg_enum placeholders. Just create them with the correct OIDs as the first step. Nobody else is going to use pg_enum.oids anyway. Again, I don't know about arrays or composites. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Tom Lane escribió: Andrew Dunstan and...@dunslane.net writes: Tom Lane wrote: [ I can already hear somebody insisting on a yaml type :-( ] Now that's a case where I think a couple of converter functions at most should meet the need. Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json - xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.) But then there's the matter of overhead: how much would be wasted by transforming to XML, and then parsing the XML back to transform to JSON? -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane escribió: Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json - xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.) But then there's the matter of overhead: how much would be wasted by transforming to XML, and then parsing the XML back to transform to JSON? Well, that would presumably happen only when sending data to or from the client. It's not obvious that it would be much more expensive than the syntax checking you'd have to do anyway. If there's some reason to think that operating on json data would be much less expensive than operating on xml, there might be a case for having two distinct sets of operations internally, but I haven't heard anybody make that argument. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
Hi, Le 18 déc. 2009 à 19:21, Heikki Linnakangas a écrit : On Fri, Dec 18, 2009 at 12:22 PM, Florian Pflug fgp.phlo@gmail.com wrote: It'd prefer if the slave could automatically fetch a new base backup if it falls behind too far to catch up with the available logs. That way, old logs don't start piling up on the server if a slave goes offline for a long time. Well I did propose to consider a state machine with clear transition for such problems, a while ago, and I think my remarks still do apply: http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg131511.html Sorry for non archives.postgresql.org link, couldn't find the mail there. Yeah, for small databases, it's probably a better tradeoff. The problem with keeping WAL around in the master indefinitely is that you will eventually run out of disk space if the standby disappears for too long. I'd vote for having a setting on the master for how long you keep WALs. If slave loose sync then comes back, either you still have the required WALs and you're back to catchup or you don't and you're back either to base/init dance. Maybe you want to add a control on the slave to require explicit DBA action before getting back to taking a base backup from the master, though, as that could be provided from a nightly PITR backup rather than the live server. but it's almost certainly much harder to implement. In particular, there's no hard and fast rule for figuring out when you've dropped so far behind that resnapping the whole thing is faster than replaying the WAL bit by bit. I'd imagine that you take a new base backup only if you have to, ie. the old WAL files the slave needs have already been deleted from the master. Well consider a slave can be in one of those states: base, init, setup, catchup, sync. Now what you just said is reduced to saying what transitions you can do without resorting to base backup, and I don't see that many as soon as the last sync point is no more available on the master. I think (as I did/do with Hot Standby) that the most important thing here is to get to a point where we have a reasonably good feature that is of some use, and commit it. It will probably have some annoying limitations; we can remove those later. I have a feel that what we have right now is going to be non-robust in the face of network breaks, but that is a problem that can be fixed by a future patch. Agreed. About a year ago, I was vocal about not relying on the file based shipping, but I don't have a problem with relying on it as an intermediate step, until we add the other options. It's robust as it is, if you set up WAL archiving. I think I'd like to have the feature that a slave never pretends it's in-sync or soon-to-be when clearly it's not. For the asynchronous case, we can live with it. As soon as we're talking synchronous, you really want the master to skip any not-in-sync slave at COMMIT. To be even more clear, a slave that is not in sync is NOT a slave as far as synchronous replication is concerned. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Alvaro Herrera wrote: Bruce Momjian wrote: There are several pg_migrator limitations that appeared late in the 8.4 development cycle and were impossible to fix at that point. I would like to fix them for Postgres 8.5: o a user-defined composite data type o a user-defined array data type o a user-defined enum data type I have discussed this with Alvaro. I think pg_migrator needs the ability to set the pg_type.oid and pg_enum.oid for user-defined composites, arrays, and enums to match the values in the old server, and hence match references to those rows in user data tables. To be more precise, the pg_enum.oid needs to be set for ENUM types; there's no need for setting the pg_type.oid (for ENUM types). I don't know about composites but I think the problem with user defined arrays is the OID of the element type, not the array itself. Yes, good point. I can see where the oids are assigned in our C code: oids[i] = GetNewOid(pg_enum); array_oid = GetNewOid(pg_type); I need a way of controlling that. Now, ideally, I would just be able to add an optional oid field to DefineType() and call it from a server-side C function called by pg_migrator, but the problem is that that function assumes it is receiving a complex struct DefineStmt which can't easily be created by pg_migrator. The general solution will involve creating place-hold rows in pg_type and pg_enum with the desired oids, and deleting those placeholder rows at the time pg_dump creates the new type or enum, and passing the desired oid to the creation command. I don't think there's a need for pg_enum placeholders. Just create them with the correct OIDs as the first step. Nobody else is going to use pg_enum.oids anyway. Again, I don't know about arrays or composites. That will make things easier because of the large number of oids consumed by enumerated types. I am now thinking that setting the oid counter before calling CREATE TYPE/ENUM might be the cleanest, and of course with pg_dump setting this all up when in --binary-upgrade mode. It does make pg_migrator dependent on the order of oid allocation in those routines. It also might make some migrations impossible if concurrent enum creation caused gaps in the assignment of oids in a single enumerated type. A crazier idea would be for pg_migrator to set server-side global variables that contain the oids to be used. pg_dump would call those functions to set and clear the global variables when in --binary-upgrade mode, and the backend code would consult those variables before calling GetNewOid(), or GetNewOid() would consult those global variables. You can now see why this was not fixed in 8.4. :-( -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on true serializable techniques in MVCC
I wrote: [for a description of traditional techniques for providing various isolation levels, including serializable], Dr. Cahill seemed to like (Hellerstein et al., 2007) If anyone else is interested in this paper, here is additional information: Architecture of a Database System. (Joseph M. Hellerstein, Michael Stonebraker and James Hamilton). Foundations and Trends in Databases 1(2). http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf It covers a lot of ground, not just locking and latching issues. While the discussion seems very good and very clear, it doesn't get down to low level locking details -- instead referring people to: J. Gray, R. A. Lorie, G. R. Putzolu, and I. L. Traiger, *Granularity of locks and degrees of consistency in a shared data base,* in IFIP Working Conference on Modelling in Data Base Management Systems, pp. 365*394, 1976. http://www.seas.upenn.edu/~zives/05s/cis650/papers/granularity-locks.pdf This 1976 paper is the one which gets down to the nitty gritty details of how to effectively implement predicate locking with reasonable performance using index range locks, etc. This is the paper which should cover most of the questions people raise on this list where Dr. Cahill has just assumed that the traditional techniques he seeks to improve upon are well known to his audience. These techniques, or some variation on them, have been implemented in almost every database I've used or investigated. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] snapshot tarball generation broken for -HEAD
Hi all! Infrastructure monitoring started to complain a few days ago that we failed generating new snapshot-tarball for HEAD. Manual investigation shows that the script dies while building the docs with: openjade:installation.sgml:202:51:X: reference to non-existent ID PLPYTHON-PYTHON23 openjade:/usr/local/share/sgml/docbook/dsssl/modular/html/dblink.dsl:203:1:E: XRef LinkEnd to missing ID 'PLPYTHON-PYTHON23' gmake[1]: *** [INSTALL.html] Error 1 gmake[1]: *** Deleting file `INSTALL.html' gmake: *** [distdir] Error 2 any ideas? Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On Fri, Dec 18, 2009 at 3:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane escribió: Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json - xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.) But then there's the matter of overhead: how much would be wasted by transforming to XML, and then parsing the XML back to transform to JSON? Well, that would presumably happen only when sending data to or from the client. It's not obvious that it would be much more expensive than the syntax checking you'd have to do anyway. If there's some reason to think that operating on json data would be much less expensive than operating on xml, there might be a case for having two distinct sets of operations internally, but I haven't heard anybody make that argument. One problem is that there is not a single well-defined mapping between these types. I would say generally that XML and YAML both have more types of constructs than JSON. The obvious ways of translating an arbitrary XML document to JSON are likely not to be what people want in particular cases. I think the performance argument is compelling, too, but we can't even try benchmarking it unless we can define what we're even talking about. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
+1 for such a feature, simply to avoid the need of writing a hstore-parser (which wasn't too bad to write, but it felt unnecessary). Doesn't matter to me if it's hstore-to-json or hstore-to-xml or hstore-to-yaml. Just something that parsers are readily available for. Heck, I wouldn't mind if hstore moved to using any one of those for it's external representations by default. Tom Lane wrote: a ton of special syntax for xml support, ...a json type... [ I can already hear somebody insisting on a yaml type :-( ] If these were CPAN-like installable modules, I'd hope there would be eventually. Don't most languages and platforms have both YAML and JSON libraries? Yaml's user-defined types are an example of where this might be useful eventually. Tom Lane wrote: Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json - xml converters and let the latter type do all the heavy lifting? I imagine eventually a JSON type could validate fields using JSON Schema. But that's drifting away from hstore. (If so, this patch ought to be hstore_to_xml instead.) Doesn't matter to me so long as it's any format with readily available parsers. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] snapshot tarball generation broken for -HEAD
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes: openjade:installation.sgml:202:51:X: reference to non-existent ID PLPYTHON-PYTHON23 openjade:/usr/local/share/sgml/docbook/dsssl/modular/html/dblink.dsl:203:1:E: XRef LinkEnd to missing ID 'PLPYTHON-PYTHON23' Looks like Peter forgot about the restrictions on links in doc sections that also get made into standalone text files. Will fix. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Installing PL/pgSQL by default
Bruce Momjian wrote: Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I installed PL/pgSQL by default via initdb with the attached patch. The only problem is that pg_dump still dumps out the language creation: CREATE PROCEDURAL LANGUAGE plpgsql; ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO postgres; What is odd is that I used the same process that initdb uses to create other objects. Does anyone know why this is happening? I think pg_dump pays attention to what schema the objects are in, and that's most likely creating them in PUBLIC. Try adding set search_path = pg_catalog. It's not impossible that we'll have to tweak pg_dump a bit; it's never had to deal with languages that shouldn't be dumped ... I found that pg_dump tests for pg_language.lanispl == true, which is true for all the stored procedure languages. I can easily special case plpgsql, or check for FirstNormalObjectId, though I don't see that used in pg_dump currently. A more difficult issue is whether we should preserve the fact that plpgsql was _removed_ in the pg_dump output, i.e, if someone removes plpgsql from a database, do we issue a DROP LANGUAGE in pg_dump? I don't remember us having to deal with anything like this before. OK, the attached patch installs plpgsql by default from initdb, and supresses the dumping of CREATE LANGUAGE in 8.5 and in 8.3/8.4 if binary upgrade is used (because you know you are upgrading to a release that has plpgsql installed by default). The 8.3/8.4 is necessary so the schema load doesn't generate any errors and cause pg_migrator to exit. Applied. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On fre, 2009-12-18 at 11:51 -0500, Robert Haas wrote: On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler da...@kineticode.com wrote: On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote: Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text? json_data_type++ What would that do for us? At the moment it would be more of a placeholder, because if we later decide to add full-blown JSON-constructing and -destructing functionality, it would be difficult to change the signatures of all the existing functionality. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote: Alvaro Herrera wrote: Bruce Momjian wrote: There are several pg_migrator limitations that appeared late in the 8.4 development cycle and were impossible to fix at that point. I would like to fix them for Postgres 8.5: o a user-defined composite data type o a user-defined array data type o a user-defined enum data type I have discussed this with Alvaro. I think pg_migrator needs the ability to set the pg_type.oid and pg_enum.oid for user-defined composites, arrays, and enums to match the values in the old server, and hence match references to those rows in user data tables. To be more precise, the pg_enum.oid needs to be set for ENUM types; there's no need for setting the pg_type.oid (for ENUM types). I don't know about composites but I think the problem with user defined arrays is the OID of the element type, not the array itself. Yes, good point. I can see where the oids are assigned in our C code: oids[i] = GetNewOid(pg_enum); array_oid = GetNewOid(pg_type); I need a way of controlling that. You're (partly?) missing my point which is that the important OID to control is the one that actually gets stored on table files. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Alvaro Herrera wrote: Bruce Momjian wrote: Alvaro Herrera wrote: Bruce Momjian wrote: There are several pg_migrator limitations that appeared late in the 8.4 development cycle and were impossible to fix at that point. I would like to fix them for Postgres 8.5: o a user-defined composite data type o a user-defined array data type o a user-defined enum data type I have discussed this with Alvaro. I think pg_migrator needs the ability to set the pg_type.oid and pg_enum.oid for user-defined composites, arrays, and enums to match the values in the old server, and hence match references to those rows in user data tables. To be more precise, the pg_enum.oid needs to be set for ENUM types; there's no need for setting the pg_type.oid (for ENUM types). I don't know about composites but I think the problem with user defined arrays is the OID of the element type, not the array itself. Yes, good point. I can see where the oids are assigned in our C code: oids[i] = GetNewOid(pg_enum); array_oid = GetNewOid(pg_type); I need a way of controlling that. You're (partly?) missing my point which is that the important OID to control is the one that actually gets stored on table files. Well, I thought the idea was to set the system table oid to match the oids already in the user tables. I realize that is not all system oids. What am I missing exactly? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote: Alvaro Herrera wrote: Bruce Momjian wrote: Alvaro Herrera wrote: To be more precise, the pg_enum.oid needs to be set for ENUM types; there's no need for setting the pg_type.oid (for ENUM types). I don't know about composites but I think the problem with user defined arrays is the OID of the element type, not the array itself. Yes, good point. I can see where the oids are assigned in our C code: oids[i] = GetNewOid(pg_enum); array_oid = GetNewOid(pg_type); I need a way of controlling that. You're (partly?) missing my point which is that the important OID to control is the one that actually gets stored on table files. Well, I thought the idea was to set the system table oid to match the oids already in the user tables. I realize that is not all system oids. What am I missing exactly? I think the OIDs for user-defined arrays stored in table data are element types, not the array type which is what you're pointing at with the line you quote: array_oid = GetNewOid(pg_type); IMBFOS. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Alvaro Herrera wrote: Bruce Momjian wrote: Alvaro Herrera wrote: Bruce Momjian wrote: Alvaro Herrera wrote: To be more precise, the pg_enum.oid needs to be set for ENUM types; there's no need for setting the pg_type.oid (for ENUM types). I don't know about composites but I think the problem with user defined arrays is the OID of the element type, not the array itself. Yes, good point. I can see where the oids are assigned in our C code: oids[i] = GetNewOid(pg_enum); array_oid = GetNewOid(pg_type); I need a way of controlling that. You're (partly?) missing my point which is that the important OID to control is the one that actually gets stored on table files. Well, I thought the idea was to set the system table oid to match the oids already in the user tables. I realize that is not all system oids. What am I missing exactly? I think the OIDs for user-defined arrays stored in table data are element types, not the array type which is what you're pointing at with the line you quote: array_oid = GetNewOid(pg_type); IMBFOS. Oh, yea, sorry, I was just showing examples of where we get the oids --- I have not researched the exact calls yet, but I am doing that now and will apply a patch that adds C comments to the C structures to identify them. I figure it would be good to document this no matter what we do. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote: There are several pg_migrator limitations that appeared late in the 8.4 development cycle and were impossible to fix at that point. I would like to fix them for Postgres 8.5: o a user-defined composite data type o a user-defined array data type o a user-defined enum data type I have discussed this with Alvaro. I think pg_migrator needs the ability to set the pg_type.oid and pg_enum.oid for user-defined composites, arrays, and enums to match the values in the old server, and hence match references to those rows in user data tables. The general solution will involve creating place-hold rows in pg_type and pg_enum with the desired oids, and deleting those placeholder rows at the time pg_dump creates the new type or enum, and passing the desired oid to the creation command. We do something similar for toast tables now, but it is easier there because the oids are actually file system files. There is no ability to specify an OID column value on insert. However, pg_migrator has the ability to call backend C functions via shared library functions so it could potentially insert the needed system catalog dummy rows. As far as creating rows with the proper oids, we could modify the SQL grammar to allow it, or modify DefineType() so it accepts oids and passes them to TypeCreate(), or a simpler approach would be to set the oid counter before calling CREATE TYPE, but that would be error-prone because other oids might be assigned in indeterminate order --- we removed that code from pg_migrator for toast tables before 8.4 shipped, so I am not excited to re-add it. The same approach is necessary for enums. Another approach could be to create the dummy rows, load all of the pg_dump schema, then renumber the rows to the proper oids, but this assumes that I will be able to find all references to the current oids and renumber those too. Seems I need some help here. I thought there was a suggestion that we would be able to specify the oids in the SQL that creates the types, along the lines of: create type foo as enum ( ...) with oids ( ... ); Is that a non-starter? I imagine it would need to require some special setting to be enabled to allow it. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On Fri, Dec 18, 2009 at 4:39 PM, Peter Eisentraut pete...@gmx.net wrote: On fre, 2009-12-18 at 11:51 -0500, Robert Haas wrote: On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler da...@kineticode.com wrote: On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote: Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text? json_data_type++ What would that do for us? At the moment it would be more of a placeholder, because if we later decide to add full-blown JSON-constructing and -destructing functionality, it would be difficult to change the signatures of all the existing functionality. Good thought. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] snapshot tarball generation broken for -HEAD
On Fri, Dec 18, 2009 at 4:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: Looks like Peter forgot about the restrictions on links in doc sections that also get made into standalone text files. Will fix. I seem to have... forgotten... those as well. Can you enlighten me? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
On Fri, Dec 18, 2009 at 6:44 PM, Andrew Dunstan and...@dunslane.net wrote: I thought there was a suggestion that we would be able to specify the oids in the SQL that creates the types, along the lines of: create type foo as enum ( ...) with oids ( ... ); Is that a non-starter? I imagine it would need to require some special setting to be enabled to allow it. This gets at a question that I've been wondering about. There seems to be something about OIDs that makes us want to not ever allow users to specify them, or only when our back is absolutely against the wall. I have only the vaguest notions of what might be dangerous about that, though. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Robert Haas wrote: On Fri, Dec 18, 2009 at 3:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane escribió: Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json - xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.) But then there's the matter of overhead: how much would be wasted by transforming to XML, and then parsing the XML back to transform to JSON? Well, that would presumably happen only when sending data to or from the client. It's not obvious that it would be much more expensive than the syntax checking you'd have to do anyway. If there's some reason to think that operating on json data would be much less expensive than operating on xml, there might be a case for having two distinct sets of operations internally, but I haven't heard anybody make that argument. One problem is that there is not a single well-defined mapping between these types. I would say generally that XML and YAML both have more types of constructs than JSON. The obvious ways of translating an arbitrary XML document to JSON are likely not to be what people want in particular cases. Right. XML semantics are richer, as I pointed out when we were discussing the various EXPLAIN formats. I think the performance argument is compelling, too, but we can't even try benchmarking it unless we can define what we're even talking about. Yes, there is indeed reason to think that JSON processing, especially parsing, will be more efficient, and I suspect we can provide ways of accessing the data that are lots faster than XPath. JSON is designed to be lightweight, XML is not. Mind you, the XML processing is not too bad - I have been working much of the last few months on a large custom billing system which produces XML output to create paper/online invoices from, and the XML construction is one of the fastest parts of the whole system. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote: Seems I need some help here. I'm willing to work on this --- it doesn't look particularly fun but we really need it. Andrew Dunstan and...@dunslane.net writes: I thought there was a suggestion that we would be able to specify the oids in the SQL that creates the types, along the lines of: create type foo as enum ( ...) with oids ( ... ); Is that a non-starter? I imagine it would need to require some special setting to be enabled to allow it. The more I think about it the less I want such warts placed in the regular SQL syntax for creation commands. As soon as we add a wart like that we'll be stuck with supporting it forever. Whatever we do here should be off in a little corner that only pg_migrator can get at. And we already have a way to manage that: there's already something in pg_migrator to let it install special functions that are present only while migrating. So I suggest that we make whatever hacks are needed available only at the C-code level, and let pg_migrator get at them via its special functions. In practice, this would mean teaching pg_dump to call these functions when it is making a --binary_upgrade dump. The reason I think this is less of a support hazard than changing SQL statements is that there is no promise or intention that a --binary_upgrade dump will load into anything but the specific PG version that it's intended for. (We could, and probably should, add some version labeling to the dump to help enforce that.) At the moment it appears that we need the following hacks: * ability to control the OIDs assigned to user tables and types. Because a table also has a rowtype, this means at least two separate state variables. And we already knew we had to control the OIDs assigned to toast tables. I'm imagining dump output like select pg_migrator_set_next_table_oid(123456); select pg_migrator_set_next_type_oid(12347); select pg_migrator_set_next_toast_table_oid(123458); CREATE TABLE ... where the functions cause static variables to become set, and the core code gets changed to look like if (next_table_oid) { newoid = next_table_oid; next_table_oid = 0; } else newoid = GetNewOid(...); in selected places where currently there's just a GetNewOid(...) call. * ability to control the OIDs assigned to enum values. To keep this sane I think the easiest way is to have pg_migrator have a function that adds one value with a predetermined OID to an existing enum. So instead of CREATE TYPE foo AS ENUM ('bar', 'baz', ...) I envision the --binary_upgrade dump output looking like -- force the OID of the enum type itself select pg_migrator_set_next_type_oid(12347); CREATE TYPE foo AS ENUM (); select pg_migrator_add_enum_value(12347, 'bar', 12348); select pg_migrator_add_enum_value(12347, 'baz', 12349); ... I don't see any value in the placeholder-row approach Bruce suggests; AFAICS it would require significantly uglier backend hacks than the above because dealing with an already-present row would be a bigger code change. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] snapshot tarball generation broken for -HEAD
Robert Haas robertmh...@gmail.com writes: On Fri, Dec 18, 2009 at 4:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: Looks like Peter forgot about the restrictions on links in doc sections that also get made into standalone text files. Will fix. I seem to have... forgotten... those as well. Can you enlighten me? Try make HISTORY and make INSTALL in doc/src/sgml. If they don't work, you need to hack up the SGML with standalone-include/standalone-ignore. See existing examples in the release notes and installation.sgml. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On Fri, Dec 18, 2009 at 7:05 PM, Andrew Dunstan and...@dunslane.net wrote: One problem is that there is not a single well-defined mapping between these types. I would say generally that XML and YAML both have more types of constructs than JSON. The obvious ways of translating an arbitrary XML document to JSON are likely not to be what people want in particular cases. Right. XML semantics are richer, as I pointed out when we were discussing the various EXPLAIN formats. You say richer; I say harder to map onto data structures. But we can agree to disagree on this one... I'm sure there are good tools out there. :-) I think the performance argument is compelling, too, but we can't even try benchmarking it unless we can define what we're even talking about. Yes, there is indeed reason to think that JSON processing, especially parsing, will be more efficient, and I suspect we can provide ways of accessing the data that are lots faster than XPath. JSON is designed to be lightweight, XML is not. Mind you, the XML processing is not too bad - I have been working much of the last few months on a large custom billing system which produces XML output to create paper/online invoices from, and the XML construction is one of the fastest parts of the whole system. That doesn't surprise me very much. If there's a problem with operations on XML, I think it tends to be more on the parsing side than the generation side. But even there I agree it's not terrible. The main reason I like JSON is for the simpler semantics - there's exactly one way to serialize and deserialize a data structure, and everyone agrees on what it is so the error cases are all handled by the parser itself, rather than left to the application programmer. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Tom Lane wrote: * ability to control the OIDs assigned to user tables and types. Because a table also has a rowtype, this means at least two separate state variables. And we already knew we had to control the OIDs assigned to toast tables. I'm imagining dump output like select pg_migrator_set_next_table_oid(123456); select pg_migrator_set_next_type_oid(12347); select pg_migrator_set_next_toast_table_oid(123458); CREATE TABLE ... Do we also need a knob for the table type's array type? * ability to control the OIDs assigned to enum values. To keep this sane I think the easiest way is to have pg_migrator have a function that adds one value with a predetermined OID to an existing enum. So instead of CREATE TYPE foo AS ENUM ('bar', 'baz', ...) I envision the --binary_upgrade dump output looking like -- force the OID of the enum type itself select pg_migrator_set_next_type_oid(12347); This part isn't necessary AFAIK, except to be used as reference here: CREATE TYPE foo AS ENUM (); select pg_migrator_add_enum_value(12347, 'bar', 12348); select pg_migrator_add_enum_value(12347, 'baz', 12349); on which we could perhaps use foo as a reference instead of the OID value. However, I think array and composite types need a specific type OID, so the set_next_type_oid function would still be necessary. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Tom Lane wrote: At the moment it appears that we need the following hacks: * ability to control the OIDs assigned to user tables and types. Because a table also has a rowtype, this means at least two separate state variables. And we already knew we had to control the OIDs assigned to toast tables. I'm imagining dump output like select pg_migrator_set_next_table_oid(123456); select pg_migrator_set_next_type_oid(12347); select pg_migrator_set_next_toast_table_oid(123458); CREATE TABLE ... where the functions cause static variables to become set, and the core code gets changed to look like if (next_table_oid) { newoid = next_table_oid; next_table_oid = 0; } else newoid = GetNewOid(...); in selected places where currently there's just a GetNewOid(...) call. * ability to control the OIDs assigned to enum values. To keep this sane I think the easiest way is to have pg_migrator have a function that adds one value with a predetermined OID to an existing enum. So instead of CREATE TYPE foo AS ENUM ('bar', 'baz', ...) I envision the --binary_upgrade dump output looking like -- force the OID of the enum type itself select pg_migrator_set_next_type_oid(12347); CREATE TYPE foo AS ENUM (); select pg_migrator_add_enum_value(12347, 'bar', 12348); select pg_migrator_add_enum_value(12347, 'baz', 12349); ... I don't see any value in the placeholder-row approach Bruce suggests; AFAICS it would require significantly uglier backend hacks than the above because dealing with an already-present row would be a bigger code change. Comments? That looks fairly workable. The placeholder idea seems like a bit of a potential footgun, so I like the idea that we can in some limited circumstances set the oids fairly directly. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: select pg_migrator_set_next_table_oid(123456); select pg_migrator_set_next_type_oid(12347); select pg_migrator_set_next_toast_table_oid(123458); CREATE TABLE ... Do we also need a knob for the table type's array type? Well, we wouldn't care about the oid of the array type, except that if the backend is allowed to assign it on its own, it might eat an oid that we're going to need later for another type. So yeah, array oids too. (The above is just a sketch, I don't promise it's complete ;-)) -- force the OID of the enum type itself select pg_migrator_set_next_type_oid(12347); This part isn't necessary AFAIK, except to be used as reference here: CREATE TYPE foo AS ENUM (); Exactly. We have to assign the oid of the enum type just as much as any other type. Basically, to avoid collisions we'll need to ensure we nail down the oids of every pg_class and pg_type row to be the same as they were before. We might have to nail down relfilenodes similarly, not sure yet. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
Dimitri Fontaine escribió: Well I did propose to consider a state machine with clear transition for such problems, a while ago, and I think my remarks still do apply: http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg131511.html Sorry for non archives.postgresql.org link, couldn't find the mail there. http://archives.postgresql.org/message-id/87fxcxnjwt.fsf%40hi-media-techno.com -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote: I think the OIDs for user-defined arrays stored in table data are element types, not the array type which is what you're pointing at with the line you quote: array_oid = GetNewOid(pg_type); IMBFOS. Oh, yea, sorry, I was just showing examples of where we get the oids --- I have not researched the exact calls yet, but I am doing that now and will apply a patch that adds C comments to the C structures to identify them. I figure it would be good to document this no matter what we do. I have applied the attached patch which documents the locations where system oids have to be preserved for binary upgrades. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/catalog/pg_enum.c === RCS file: /cvsroot/pgsql/src/backend/catalog/pg_enum.c,v retrieving revision 1.9 diff -c -c -r1.9 pg_enum.c *** src/backend/catalog/pg_enum.c 1 Jan 2009 17:23:37 - 1.9 --- src/backend/catalog/pg_enum.c 19 Dec 2009 00:46:10 - *** *** 67,72 --- 67,76 oids = (Oid *) palloc(n * sizeof(Oid)); for (i = 0; i n; i++) { + /* + * The pg_enum.oid is stored in user tables. This oid must be + * preserved by binary upgrades. + */ oids[i] = GetNewOid(pg_enum); } Index: src/backend/commands/typecmds.c === RCS file: /cvsroot/pgsql/src/backend/commands/typecmds.c,v retrieving revision 1.139 diff -c -c -r1.139 typecmds.c *** src/backend/commands/typecmds.c 7 Dec 2009 05:22:21 - 1.139 --- src/backend/commands/typecmds.c 19 Dec 2009 00:46:10 - *** *** 531,536 --- 531,542 * now have TypeCreate do all the real work. */ typoid = + /* + * The pg_type.oid is stored in user tables as array elements + * (base types) in ArrayType and in composite types in + * DatumTupleFields. This oid must be preserved by binary + * upgrades. + */ TypeCreate(InvalidOid, /* no predetermined type OID */ typeName, /* type name */ typeNamespace, /* namespace */ Index: src/backend/utils/adt/arrayfuncs.c === RCS file: /cvsroot/pgsql/src/backend/utils/adt/arrayfuncs.c,v retrieving revision 1.161 diff -c -c -r1.161 arrayfuncs.c *** src/backend/utils/adt/arrayfuncs.c 4 Sep 2009 11:20:22 - 1.161 --- src/backend/utils/adt/arrayfuncs.c 19 Dec 2009 00:46:13 - *** *** 328,333 --- 328,338 SET_VARSIZE(retval, nbytes); retval-ndim = ndim; retval-dataoffset = dataoffset; + /* + * This comes from the array's pg_type.typelem (which points to the + * base data type's pg_type.oid) and stores system oids in user tables. + * This oid must be preserved by binary upgrades. + */ retval-elemtype = element_type; memcpy(ARR_DIMS(retval), dim, ndim * sizeof(int)); memcpy(ARR_LBOUND(retval), lBound, ndim * sizeof(int)); Index: src/backend/utils/adt/enum.c === RCS file: /cvsroot/pgsql/src/backend/utils/adt/enum.c,v retrieving revision 1.7 diff -c -c -r1.7 enum.c *** src/backend/utils/adt/enum.c 1 Jan 2009 17:23:49 - 1.7 --- src/backend/utils/adt/enum.c 19 Dec 2009 00:46:13 - *** *** 56,61 --- 56,65 format_type_be(enumtypoid), name))); + /* + * This comes from pg_enum.oid and stores system oids in user tables. + * This oid must be preserved by binary upgrades. + */ enumoid = HeapTupleGetOid(tup); ReleaseSysCache(tup); Index: src/backend/utils/adt/rowtypes.c === RCS file: /cvsroot/pgsql/src/backend/utils/adt/rowtypes.c,v retrieving revision 1.25 diff -c -c -r1.25 rowtypes.c *** src/backend/utils/adt/rowtypes.c 11 Jun 2009 14:49:04 - 1.25 --- src/backend/utils/adt/rowtypes.c 19 Dec 2009 00:46:14 - *** *** 97,102 --- 97,107 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg(input of anonymous composite types is not implemented))); tupTypmod = -1;/* for all non-anonymous types */ + /* + * This comes from the composite type's pg_type.oid and + * stores system oids in user tables, specifically DatumTupleFields. + * This oid must be preserved by binary upgrades. + */ tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod); ncolumns = tupdesc-natts; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Tom Lane wrote: The more I think about it the less I want such warts placed in the regular SQL syntax for creation commands. As soon as we add a wart like that we'll be stuck with supporting it forever. Whatever we do here should be off in a little corner that only pg_migrator can get at. Yea, and we might need more some day so a system that can be easily enhanced would help. Adding to SQL syntax and maintaining it seems like overkill. And we already have a way to manage that: there's already something in pg_migrator to let it install special functions that are present only while migrating. So I suggest that we make whatever hacks are needed available only at the C-code level, and let pg_migrator get at them via its special functions. Right. In practice, this would mean teaching pg_dump to call these functions when it is making a --binary_upgrade dump. The reason I think this is less of a support hazard than changing SQL statements is that there is no promise or intention that a --binary_upgrade dump will load into anything but the specific PG version that it's intended for. (We could, and probably should, add some version labeling to the dump to help enforce that.) Yea, that is easy. At the moment it appears that we need the following hacks: * ability to control the OIDs assigned to user tables and types. Because a table also has a rowtype, this means at least two separate state variables. And we already knew we had to control the OIDs assigned to toast tables. I'm imagining dump output like select pg_migrator_set_next_table_oid(123456); select pg_migrator_set_next_type_oid(12347); select pg_migrator_set_next_toast_table_oid(123458); I was thinking of something even more general: select pg_migrator_set_oid('pg_type', 100); select pg_migrator_set_oid('pg_type_array', 101); and you just check for the strings in pg_migrator_set_oid and set the proper variable. The idea I had was to create a global structure: struct pg_migrator_oids { Oid pg_type; Oid pg_type_array; ... } This would initialize to zero as a global structure, and only pg_migrator server-side functions set it. CREATE TABLE ... where the functions cause static variables to become set, and the core code gets changed to look like if (next_table_oid) { newoid = next_table_oid; next_table_oid = 0; } else newoid = GetNewOid(...); Yes, that is what I was thinking too: if (pg_migrator_oid.pg_type) { newoid = pg_migrator_oid.pg_type; pg_migrator_oid.pg_type = 0; } else newoid = GetNewOid(...); in selected places where currently there's just a GetNewOid(...) call. * ability to control the OIDs assigned to enum values. To keep this sane I think the easiest way is to have pg_migrator have a function that adds one value with a predetermined OID to an existing enum. So instead of CREATE TYPE foo AS ENUM ('bar', 'baz', ...) I envision the --binary_upgrade dump output looking like -- force the OID of the enum type itself select pg_migrator_set_next_type_oid(12347); CREATE TYPE foo AS ENUM (); select pg_migrator_add_enum_value(12347, 'bar', 12348); select pg_migrator_add_enum_value(12347, 'baz', 12349); ... Good idea --- I was trying to figure out how to assign an array of oids and couldn't think of a simple way. I don't see any value in the placeholder-row approach Bruce suggests; AFAICS it would require significantly uglier backend hacks than the above because dealing with an already-present row would be a bigger code change. True. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: select pg_migrator_set_next_table_oid(123456); select pg_migrator_set_next_type_oid(12347); select pg_migrator_set_next_toast_table_oid(123458); CREATE TABLE ... Do we also need a knob for the table type's array type? Well, we wouldn't care about the oid of the array type, except that if the backend is allowed to assign it on its own, it might eat an oid that we're going to need later for another type. So yeah, array oids too. (The above is just a sketch, I don't promise it's complete ;-)) -- force the OID of the enum type itself select pg_migrator_set_next_type_oid(12347); This part isn't necessary AFAIK, except to be used as reference here: CREATE TYPE foo AS ENUM (); Exactly. We have to assign the oid of the enum type just as much as any other type. Basically, to avoid collisions we'll need to ensure we nail down the oids of every pg_class and pg_type row to be the same as they I assume you meant pg_type and pg_class above, or I hope you were. were before. We might have to nail down relfilenodes similarly, not sure yet. Yea, piggybacking on Alvaro's idea for pg_enum, if we set all the pg_type oids we can clearly do this with no placeholders necessary. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
On 12/18/2009 04:09 PM, Tom Lane wrote: At the moment it appears that we need the following hacks: * ability to control the OIDs assigned to user tables and types. Because a table also has a rowtype, this means at least two separate state variables. And we already knew we had to control the OIDs assigned to toast tables. I'm imagining dump output like select pg_migrator_set_next_table_oid(123456); select pg_migrator_set_next_type_oid(12347); select pg_migrator_set_next_toast_table_oid(123458); CREATE TABLE ... I like this approach overall, but wonder if it would be better to do: select pg_migrator_set_next_oid('table', 123456); select pg_migrator_set_next_oid('type', 12347); select pg_migrator_set_next_oid('toast_table', 123458); etc. Later we could easily add other supported objects... Joe signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian br...@momjian.us writes: ... The idea I had was to create a global structure: struct pg_migrator_oids { Oid pg_type; Oid pg_type_array; ... } This would initialize to zero as a global structure, and only pg_migrator server-side functions set it. I would prefer *not* to do that, as that makes the list of settable oids far more public than I would like; also you are totally dependent on pg_migrator and the backend to be in sync about the definition of that struct, which is going to be problematic in alpha releases in particular, since PG_VERSION isn't going to distinguish them. What I had in mind was more like static Oid next_pg_class_oid = InvalidOid; void set_next_pg_class_oid(Oid oid) { next_pg_class_oid = oid; } in each module that needs to be able to accept a next-oid setting, and then the pg_migrator loadable module would expose SQL-callable wrappers for these functions. That way, any inconsistency shows up as a link error: function needed not present. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Joe Conway m...@joeconway.com writes: I like this approach overall, but wonder if it would be better to do: select pg_migrator_set_next_oid('table', 123456); select pg_migrator_set_next_oid('type', 12347); select pg_migrator_set_next_oid('toast_table', 123458); etc. Later we could easily add other supported objects... Yeah, Bruce was just suggesting the same. I do like that part of what he mentioned, just because it'll be fewer special functions to add and drop in pg_migrator. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
On Fri, Dec 18, 2009 at 9:00 AM, Robert Haas robertmh...@gmail.com wrote: 2009/12/18 KaiGai Kohei kai...@ak.jp.nec.com: (2009/12/18 15:48), Takahiro Itagaki wrote: Robert Haasrobertmh...@gmail.com wrote: In both cases, I'm lost. Help? They might be contrasted with the comments for myLargeObjectExists. Since we use MVCC visibility in loread(), metadata for large object also should be visible in MVCC rule. If I understand them, they say: * pg_largeobject_aclmask_snapshot requires a snapshot which will be used in loread(). * Don't use LargeObjectExists if you need MVCC visibility. Yes, correct. In acldefault(), there is this comment: /* Grant SELECT,UPDATE by default, for now */ This doesn't seem to match what the code is doing, so I think we should remove it. Ah, ACL_NO_RIGHTS is the default. Oops, it reflects very early phase design, but fixed later. I also notice that dumpBlobComments() is now misnamed, but it seems we've chosen to add a comment mentioning that fact rather than fixing it. Hmmm, now it dumps not only comments but also ownership of large objects. Should we rename it dumpBlobMetadata() or so? It seems to me quite natural. The attached patch fixes them. I think we might want to go with dumpBlobProperties(), because dumpBlobMetadata() might lead you to think that all of the properties being dumped are stored in pg_largeobject_metadata, which is not the case. Oh. This is more complicated than it appeared on the surface. It seems that the string BLOB COMMENTS actually gets inserted into custom dumps somewhere, so I'm not sure whether we can just change it. Was this issue discussed at some point before this was committed? Changing it would seem to require inserting some backward compatibility code here. Another option would be to add a separate section for BLOB METADATA, and leave BLOB COMMENTS alone. Can anyone comment on what the Right Thing To Do is here? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
On Fri, Dec 18, 2009 at 1:48 AM, Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp wrote: In both cases, I'm lost. Help? They might be contrasted with the comments for myLargeObjectExists. Since we use MVCC visibility in loread(), metadata for large object also should be visible in MVCC rule. If I understand them, they say: * pg_largeobject_aclmask_snapshot requires a snapshot which will be used in loread(). * Don't use LargeObjectExists if you need MVCC visibility. Part of what I'm confused about (and what I think should be documented in a comment somewhere) is why we're using MVCC visibility in some places but not others. In particular, there seem to be some bits of the comment that imply that we do this for read but not for write, which seems really strange. It may or may not actually be strange, but I don't understand it. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Allow read only connections during recovery, known as Hot
On Fri, Dec 18, 2009 at 8:32 PM, Simon Riggs sri...@postgresql.org wrote: Log Message: --- Allow read only connections during recovery, known as Hot Standby. Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record. New test mode make standbycheck allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far. This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required. Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit. Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members. Modified Files: -- pgsql/doc/src/sgml: backup.sgml (r2.130 - r2.131) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/backup.sgml?r1=2.130r2=2.131) config.sgml (r1.238 - r1.239) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/config.sgml?r1=1.238r2=1.239) func.sgml (r1.493 - r1.494) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/func.sgml?r1=1.493r2=1.494) pgsql/doc/src/sgml/ref: checkpoint.sgml (r1.16 - r1.17) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/ref/checkpoint.sgml?r1=1.16r2=1.17) pgsql/src/backend/access/gin: ginxlog.c (r1.19 - r1.20) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/gin/ginxlog.c?r1=1.19r2=1.20) pgsql/src/backend/access/gist: gistxlog.c (r1.32 - r1.33) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/gist/gistxlog.c?r1=1.32r2=1.33) pgsql/src/backend/access/heap: heapam.c (r1.278 - r1.279) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/heap/heapam.c?r1=1.278r2=1.279) pruneheap.c (r1.18 - r1.19) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/heap/pruneheap.c?r1=1.18r2=1.19) pgsql/src/backend/access/index: genam.c (r1.77 - r1.78) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/index/genam.c?r1=1.77r2=1.78) indexam.c (r1.115 - r1.116) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/index/indexam.c?r1=1.115r2=1.116) pgsql/src/backend/access/nbtree: README (r1.20 - r1.21) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/nbtree/README?r1=1.20r2=1.21) nbtinsert.c (r1.174 - r1.175) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/nbtree/nbtinsert.c?r1=1.174r2=1.175) nbtpage.c (r1.113 - r1.114) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/nbtree/nbtpage.c?r1=1.113r2=1.114) nbtree.c (r1.172 - r1.173) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/nbtree/nbtree.c?r1=1.172r2=1.173) nbtxlog.c (r1.55 - r1.56) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/nbtree/nbtxlog.c?r1=1.55r2=1.56) pgsql/src/backend/access/transam: README (r1.12 - r1.13) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/README?r1=1.12r2=1.13) clog.c (r1.53 - r1.54) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/clog.c?r1=1.53r2=1.54) multixact.c (r1.32 - r1.33) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/multixact.c?r1=1.32r2=1.33) recovery.conf.sample (r1.4 - r1.5) (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/recovery.conf.sample?r1=1.4r2=1.5) rmgr.c (r1.27 - r1.28)
Re: [HACKERS] Largeobject Access Controls (r2460)
Robert Haas robertmh...@gmail.com writes: Oh. This is more complicated than it appeared on the surface. It seems that the string BLOB COMMENTS actually gets inserted into custom dumps somewhere, so I'm not sure whether we can just change it. Was this issue discussed at some point before this was committed? Changing it would seem to require inserting some backward compatibility code here. Another option would be to add a separate section for BLOB METADATA, and leave BLOB COMMENTS alone. Can anyone comment on what the Right Thing To Do is here? The BLOB COMMENTS label is, or was, correct for what it contained. If this patch has usurped it to contain other things I would argue that that is seriously wrong. pg_dump already has a clear notion of how to handle ACLs for objects. ACLs for blobs ought to be made to fit into that structure, not dumped in some random place because that saved a few lines of code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
Robert Haas robertmh...@gmail.com writes: Part of what I'm confused about (and what I think should be documented in a comment somewhere) is why we're using MVCC visibility in some places but not others. In particular, there seem to be some bits of the comment that imply that we do this for read but not for write, which seems really strange. It may or may not actually be strange, but I don't understand it. It is supposed to depend on whether you opened the blob for read only or for read write. Please do not tell me that this patch broke that; because if it did it broke pg_dump. This behavior is documented at least here: http://www.postgresql.org/docs/8.4/static/lo-interfaces.html#AEN36338 regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] About Allow VIEW/RULE recompilation when the underlying tables change
I'm doing the Todo List's Allow VIEW/RULE recompilation when the underlying tables change . I've a very simple idea that I save the create view query_string in systable. When I found that I select from a view, I drop the view and recreate the view by execute the source query_string. Then go on execute this select from a view and will get the right answer. My problem is that how could I get the orginal query which user typed to postgres? Er, maybe drop and recreate the view is an ugly idea...Is there any better methods to do recompilation? Thank you! 2009-12-19 suzhiyang
Re: [HACKERS] Largeobject Access Controls (r2460)
On Fri, Dec 18, 2009 at 9:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Oh. This is more complicated than it appeared on the surface. It seems that the string BLOB COMMENTS actually gets inserted into custom dumps somewhere, so I'm not sure whether we can just change it. Was this issue discussed at some point before this was committed? Changing it would seem to require inserting some backward compatibility code here. Another option would be to add a separate section for BLOB METADATA, and leave BLOB COMMENTS alone. Can anyone comment on what the Right Thing To Do is here? The BLOB COMMENTS label is, or was, correct for what it contained. If this patch has usurped it to contain other things It has. I would argue that that is seriously wrong. pg_dump already has a clear notion of how to handle ACLs for objects. ACLs for blobs ought to be made to fit into that structure, not dumped in some random place because that saved a few lines of code. OK. Hopefully KaiGai or Takahiro can suggest a fix. Thanks, ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
On Fri, Dec 18, 2009 at 9:51 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Part of what I'm confused about (and what I think should be documented in a comment somewhere) is why we're using MVCC visibility in some places but not others. In particular, there seem to be some bits of the comment that imply that we do this for read but not for write, which seems really strange. It may or may not actually be strange, but I don't understand it. It is supposed to depend on whether you opened the blob for read only or for read write. Please do not tell me that this patch broke that; because if it did it broke pg_dump. This behavior is documented at least here: http://www.postgresql.org/docs/8.4/static/lo-interfaces.html#AEN36338 Oh, I see. Thanks for the pointer. Having read that through, I can now say that the comments in the patch seem to imply that it attempted to preserve those semantics, but I can't swear that it did. I will take another look at it, but it might bear closer examination by someone with more MVCC-fu than myself. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About Allow VIEW/RULE recompilation when the underlying tables change
On Fri, Dec 18, 2009 at 9:58 PM, suzhiyang suzhiy...@gmail.com wrote: I'm doing the Todo List's Allow VIEW/RULE recompilation when the underlying tables change . I've a very simple idea that I save the create view query_string in systable. When I found that I select from a view, I drop the view and recreate the view by execute the source query_string. Then go on execute this select from a view and will get the right answer. My problem is that how could I get the orginal query which user typed to postgres? Er, maybe drop and recreate the view is an ugly idea...Is there any better methods to do recompilation? I am not sure what this TODO item is supposed to refer to, but saving the query string and re-executing it is clearly not acceptable. What if a column or table or function referenced in the query has been renamed since the view/rule was created? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Re: [HACKERS] About Allow VIEW/RULE recompilation when theunderlying tables change
Yeah, when a column of the underlying table renamed, this re-executing must be failed... 2009-12-19 suzhiyang 发件人: Robert Haas 发送时间: 2009-12-19 11:17:44 收件人: suzhiyang 抄送: pgsql-hackers 主题: Re: [HACKERS] About Allow VIEW/RULE recompilation when theunderlying tables change On Fri, Dec 18, 2009 at 9:58 PM, suzhiyang suzhiy...@gmail.com wrote: I'm doing the Todo List's Allow VIEW/RULE recompilation when the underlying tables change . I've a very simple idea that I save the create view query_string in systable. When I found that I select from a view, I drop the view and recreate the view by execute the source query_string. Then go on execute this select from a view and will get the right answer. My problem is that how could I get the orginal query which user typed to postgres? Er, maybe drop and recreate the view is an ugly idea...Is there any better methods to do recompilation? I am not sure what this TODO item is supposed to refer to, but saving the query string and re-executing it is clearly not acceptable. What if a column or table or function referenced in the query has been renamed since the view/rule was created? ...Robert
Re: [HACKERS] About Allow VIEW/RULE recompilation when the underlying tables change
Robert Haas robertmh...@gmail.com writes: On Fri, Dec 18, 2009 at 9:58 PM, suzhiyang suzhiy...@gmail.com wrote: I'm doing the Todo List's Allow VIEW/RULE recompilation when the underlying tables change . I am not sure what this TODO item is supposed to refer to, but saving the query string and re-executing it is clearly not acceptable. What if a column or table or function referenced in the query has been renamed since the view/rule was created? The TODO item is terribly underdocumented, but I think what it's on about is that right now we refuse commands like ALTER COLUMN TYPE if the column is referenced in a view. It would be nice to propagate such a change into views if possible. The hard part is that the view might contain operations on the column --- such as func(x) or x+1 or ORDER BY x --- so you'd have to determine new semantics for those, which is why the term recompilation seems appropriate. The other point that is alluded to in the TODO item is that if the view was originally specified as SELECT * FROM foo then one might wonder whether ALTER TABLE foo ADD COLUMN bar adds the new column to the view too. Now the SQL standard is perfectly clear that the answer is no, but that hasn't dissuaded certain people from lobbying for us to do it anyway. As you say, the current implementation has a number of advantages that reread the original query string would not --- including compliance to the standard on the above point --- and I doubt we're going to want to give those up. So a patch that has ambitions of getting accepted is going to have to do something smarter. Don't know exactly what. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About Allow VIEW/RULE recompilation when the underlying tables change
On Fri, Dec 18, 2009 at 10:39 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Dec 18, 2009 at 9:58 PM, suzhiyang suzhiy...@gmail.com wrote: I'm doing the Todo List's Allow VIEW/RULE recompilation when the underlying tables change . I am not sure what this TODO item is supposed to refer to, but saving the query string and re-executing it is clearly not acceptable. What if a column or table or function referenced in the query has been renamed since the view/rule was created? The TODO item is terribly underdocumented, but I think what it's on about is that right now we refuse commands like ALTER COLUMN TYPE if the column is referenced in a view. It would be nice to propagate such a change into views if possible. The hard part is that the view might contain operations on the column --- such as func(x) or x+1 or ORDER BY x --- so you'd have to determine new semantics for those, which is why the term recompilation seems appropriate. I'm unconvinced that this is, as you would say, even mildly sane. I don't think changing the type of a column in one table constitutes a license to randomly reinterpret the semantics of views that reference it. Suppose that we have functions foo(int) and bar(varchar). Someone creates a view like this: SELECT foo(x.x1) FROM x; Next, they rename the function foo to bar. Now, they change the type of x1 from integer to varchar. Do you REALLY want to now go grab the other function? Because that sounds like an incredibly bad idea to me. I think we have a consistent principle that views, foreign key constraints, and all manner of objects generally within the database are tied to a particular object rather than a particular name. I think that's unquestionably a good decision, but even someone were to feel otherwise, I don't see how you can argue that we would SOMETIMES want to follow the object and OTHER TIMES the name. Maybe you could make an argument for treating foo(x.x1) call as foo(x.x1::integer) after the type of x.x1 gets changed, but even that seems suspiciously like black magic. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About Allow VIEW/RULE recompilation when the underlying tables change
Robert Haas robertmh...@gmail.com writes: On Fri, Dec 18, 2009 at 10:39 PM, Tom Lane t...@sss.pgh.pa.us wrote: The TODO item is terribly underdocumented, but I think what it's on about is that right now we refuse commands like ALTER COLUMN TYPE if the column is referenced in a view. It would be nice to propagate such a change into views if possible. I'm unconvinced that this is, as you would say, even mildly sane. I've updated the TODO item to link to this discussion, so at least the next three people who pop up with let's just store the view source! will have some idea of what they're up against. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About Allow VIEW/RULE recompilation when the underlying tables change
On Fri, Dec 18, 2009 at 11:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Dec 18, 2009 at 10:39 PM, Tom Lane t...@sss.pgh.pa.us wrote: The TODO item is terribly underdocumented, but I think what it's on about is that right now we refuse commands like ALTER COLUMN TYPE if the column is referenced in a view. It would be nice to propagate such a change into views if possible. I'm unconvinced that this is, as you would say, even mildly sane. I've updated the TODO item to link to this discussion, so at least the next three people who pop up with let's just store the view source! will have some idea of what they're up against. Excellent. :-) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new CommitFest states
Robert Haas wrote: On Mon, Dec 14, 2009 at 12:38 PM, Greg Smith g...@2ndquadrant.com wrote: Robert Haas wrote: I don't think there should be a transition from Returned with Feedback back to Waiting for review. Granted we might allow that occasionally as an exceptional case, but normally Returned with Feedback is a final state. The main reason I put that in there is that sometimes a reviewer or even the CF manager (I did this myself once this time) will mark something Returned with feedback, thinking there's no way the issues pointed out can be addressed right now. And then, a day or two later, in comes a patch that does just that; surprise! Hmm, I'm not aware of any actual cases of this. I'm usually pretty conservative about jumping to RWF unless there's been lag or we're near the end of the CommitFest, so it doesn't come up. I've concluded that the times this happened was just me being too aggressive here to close some patches out after getting behind, and I removed the path you objected to out of the page as not to encourage that behavior. I think that http://wiki.postgresql.org/wiki/Running_a_CommitFest makes for a pretty reasonable and quite detailed set of guidelines now for the whole process, which means we've successfully gotten what Robert did to make things work well documented fully. All it's missing is for the Discussing review state to be an official one. I could undo things back to where it's not listed, but I do think it matches what we really do better and might as well be recognized as such. -- Greg Smith2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] no lo_import(text, oid) document
Hi, I couldn't find lo_import(text, oid) document anywhere in the PostgreSQL core documents. Unless I'm missing something, I would like to add description for the function in 31.4. Server-Side Functions part. BTW, why don't lo_creat, lo_create, lo_unlink, lo_import and lo_export server side function's document appear in the Chapter 9. Functions and Operators section? Shall I add them? -- Tatsuo Ishii SRA OSS, Inc. Japan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Closing out CommitFest 2009-11
CommitFest 2009-11 is now closed, having committed 27 patches in 33 days. For comparison sake, 2009-09 committed 20 patches in 29 days, 2009-07 37 patches in 34 days, and 2008-09 29 patches in 30 days. The much bigger 2008-11 involved 58 patches going on for months, the bulk of it committed 28 patches in 36 days. Seems pretty consistent at this point: at the average patch contribution size seen over the last year, about one of those gets committed per day once we enter a CommitFest. I didn't bother accounting for things that were committed outside of the official dates, so it's actually a bit worse than that, but that gives a rough idea that's easy to remember. Also, just based on the last three CFs, 42% of patches are either returned with feedback or rejected (with quite a bit more CF to CF variation). The working estimation figure I'd suggest is that once a CF reaches 50 incoming patches it's unlikely that will finish in a month. CommitFest 2010-01, the last one for 8.5, begins on January 15th, 2010. I'll be out of commission with projects by then, so unless Robert wants to reprise his role as CF manager we may need to get someone else involved to do it. Between the CF application and how proactive everyone involved is at this point (almost all authors, reviewers, and committers do the bulk of the state changes and link to messages in the archives for you), the job of running things does keep getting easier. And the guidlines for how to be the CF manager are pretty nailed down now--you could just execute on a pretty mechanical plan and expect to make useful progress. It's still a lot of time though. I've never had an appreciation for exactly how many messages flow through this list like I do now, after a month of needing to read and pay attention to every single one of them. For those of you still furiously working on a patch with that deadline, if you have a large patch and it's not already been reviewed in a previous CommitFest, I wouldn't give you good odds of it being even looked at during that one. There doesn't seem to be any official warning of this where people will likely notice it, but this topic has been discussed on the list here. Large patches submitted just before the deadline for a release have not fared very well historically. Recognizing that, there's really no tolerance for chasing after them (at the expense of postponing the beta) left for this release. Just figured I'd pass along that warning before somebody discovers it the hard way, by working madly to finish their submission up only to see it get kicked to the next version anyway. -- Greg Smith2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers