Re: [HACKERS] Query Plan Columns
David E. Wheeler <da...@kineticode.com> writes:
> I'm wondering if there's any reason why we couldn't have EXPLAIN do
> something like this itself in core:
>
>   EXPLAIN (format table) SELECT * FROM bar;

+1 from me here; as it happens, parsing a table-like result set is
already implemented everywhere :)

I'd add another +1 if it were easily usable as a normal relation (or
SRF) from inside a query, e.g. in a WITH table expression:

  WITH plan AS (
    EXPLAIN (format table) SELECT * FROM bar
  )
  SELECT some_plan_analysis_query_here;

Or even:

  WITH plan AS (
    EXPLAIN (format table) SELECT * FROM bar
  )
  INSERT INTO plan_audit
  SELECT * FROM plan
   WHERE actual_total_time > 12 * interval '100 ms';

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
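The proposed WITH form doesn't exist, but EXPLAIN's text output can
already be captured row by row from plpgsql with FOR ... IN EXECUTE,
which gets partway there. A sketch; the helper name explain_lines()
is hypothetical, not part of any patch:

  CREATE OR REPLACE FUNCTION explain_lines(query text)
  RETURNS SETOF text
  LANGUAGE plpgsql AS
  $$
  DECLARE
      line text;
  BEGIN
      -- EXPLAIN returns one text row per plan line, so it can be
      -- looped over like any other query result.
      FOR line IN EXECUTE 'EXPLAIN ' || query LOOP
          RETURN NEXT line;
      END LOOP;
  END;
  $$;

  -- Usage:
  --   SELECT * FROM explain_lines('SELECT * FROM bar');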
Re: [HACKERS] Query Plan Columns
On Nov 9, 2010, at 12:12 AM, Dimitri Fontaine wrote:

>   WITH plan AS (
>     EXPLAIN (format table) SELECT * FROM bar
>   )
>   INSERT INTO plan_audit
>   SELECT * FROM plan
>    WHERE actual_total_time > 12 * interval '100 ms';

Yeah, that would be nice, but my current implementation has a row for
each node, and a single explain can have many nodes. With this, you'd
only get the top-level node (and not even that, as you didn't do
EXPLAIN ANALYZE, so actual_total_time would be blank!).

But I do like the idea…

Best,

David
Re: [HACKERS] CLUSTER can change t_len
On Tue, Nov 9, 2010 at 12:44 PM, Jeff Davis <pg...@j-davis.com> wrote:
> See case below. After the item length gets changed, then when reading
> the tuple later you get a t_len that includes padding.

We can easily find it with pageinspect:

  \i pageinspect.sql
  create table foo (i int4);
  insert into foo values (1);

  SELECT lp, lp_len FROM heap_page_items(get_raw_page('foo', 0));
   lp | lp_len
  ----+--------
    1 |     28

  VACUUM FULL foo;

  SELECT lp, lp_len FROM heap_page_items(get_raw_page('foo', 0));
   lp | lp_len
  ----+--------
    1 |     32

We should document in a comment that t_len can mean multiple things.
Or, we should fix raw_heap_insert() to be consistent with the rest of
the code, which doesn't MAXALIGN the t_len. We have a comment
/* be conservative */ in the function, but I'm not sure we actually
need the MAXALIGN. On the other hand, there would be almost no benefit
to keeping t_len small, because we usually treat memory in MAXALIGN
units anyway.

diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 0bd1865..0ed94ef 100644
*** a/src/backend/access/heap/rewriteheap.c
--- b/src/backend/access/heap/rewriteheap.c
*************** raw_heap_insert(RewriteState state, HeapTuple tup)
*** 586,592 ****
  	else
  		heaptup = tup;
  
! 	len = MAXALIGN(heaptup->t_len);		/* be conservative */
  
  	/*
  	 * If we're gonna fail for oversize tuple, do it right away
--- 586,592 ----
  	else
  		heaptup = tup;
  
! 	len = heaptup->t_len;
  
  	/*
  	 * If we're gonna fail for oversize tuple, do it right away

-- 
Itagaki Takahiro
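The 28 -> 32 jump above is exactly MAXALIGN rounding. On a platform
with 8-byte maximal alignment, the rounded value can be computed next
to lp_len directly in SQL (a sketch; assumes pageinspect is installed
and MAXIMUM_ALIGNOF is 8):

  SELECT lp,
         lp_len,
         ((lp_len + 7) / 8) * 8 AS maxaligned_len  -- MAXALIGN() for 8-byte alignment
    FROM heap_page_items(get_raw_page('foo', 0));

  -- Before VACUUM FULL this shows lp_len = 28, maxaligned_len = 32,
  -- i.e. the value stored after the VACUUM FULL rewrite.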
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
Hey Pavel,

2010/11/9 Pavel Stehule <pavel.steh...@gmail.com>:
> a) transformation to a common type
>
>   + simple - it is a one-day job - functions record_to_array,
>     array_to_record, and fieldnames_to_array
>   - loss of type info, hidden problems with I/O casts -
>     int a := 10.0/2.0 is a problem; using plperl or pltcl it is the
>     same as @a
>
> I'm thinking about some simple API that can be based on
> transformation to text. It can be enough for almost everything:
>
>  * text[] = record_to_array(record)
>  * table(id, key, datatype, value) = record_to_table(record)
>  * text = record_get_field(record, text)
>  * record = record_set_field(record, text, anyelement) ??

I personally like it. But I propose to add as well:

  integer := record_nkeys();
  text    := record_get_field(record, integer);

> Pavel

-- 
// Dmitriy.
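None of these functions exist yet; to make the proposal concrete, here
is a purely hypothetical sketch of how the record_to_table() form
might be used from a trigger, under the signatures proposed above (the
trigger function name is illustrative):

  -- Hypothetical: record_to_table() as proposed, not an existing API.
  CREATE FUNCTION log_fields() RETURNS trigger
  LANGUAGE plpgsql AS
  $$
  DECLARE
      f record;
  BEGIN
      -- One row per field of the NEW row, with its name, declared
      -- type, and text value.
      FOR f IN SELECT * FROM record_to_table(NEW) LOOP
          RAISE NOTICE 'field % (type %) = %', f.key, f.datatype, f.value;
      END LOOP;
      RETURN NEW;
  END;
  $$;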
Re: [HACKERS] CLUSTER can change t_len
On 09.11.2010 11:11, Itagaki Takahiro wrote:
> On Tue, Nov 9, 2010 at 12:44 PM, Jeff Davis <pg...@j-davis.com> wrote:
>> See case below. After the item length gets changed, then when
>> reading the tuple later you get a t_len that includes padding.
>
> We can easily find it with pageinspect:
>
>   create table foo (i int4);
>   insert into foo values (1);
>   SELECT lp, lp_len FROM heap_page_items(get_raw_page('foo', 0));
>    lp | lp_len
>   ----+--------
>     1 |     28
>   VACUUM FULL foo;
>   SELECT lp, lp_len FROM heap_page_items(get_raw_page('foo', 0));
>    lp | lp_len
>   ----+--------
>     1 |     32
>
> We should document in a comment that t_len can mean multiple things.
> Or, we should fix raw_heap_insert() to be consistent with the rest of
> the code, which doesn't MAXALIGN the t_len. We have a comment
> /* be conservative */ in the function, but I'm not sure we actually
> need the MAXALIGN. On the other hand, there would be almost no
> benefit to keeping t_len small, because we usually treat memory in
> MAXALIGN units anyway.

Hmm, the conservatism at that point affects the free space
calculations. I'm not sure if it makes any difference in practice, but
I'm also not sure it doesn't. pd_upper is always MAXALIGNed, but
pd_lower is not.

This would be more in line with what the main heap_insert code does:

--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -641,7 +641,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
 	}
 
 	/* And now we can insert the tuple into the page */
-	newoff = PageAddItem(page, (Item) heaptup->t_data, len,
+	newoff = PageAddItem(page, (Item) heaptup->t_data, heaptup->t_len,
 						 InvalidOffsetNumber, false, true);
 	if (newoff == InvalidOffsetNumber)
 		elog(ERROR, "failed to add tuple");

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
[HACKERS] security hooks on object creation
The attached patch provides plugin modules a hook just after object
creation time. In typical use cases, it enables the external security
providers to assign default security labels on object creation.

As Robert suggested before, it provides a general-purpose main hook.
It takes an enum of ObjectAccessType, which tells plugins what kind of
access is being requested, plus the identifier of the object being
referenced. In this version, however, no additional information, such
as the new name in ALTER xxx RENAME TO, is supported yet.

The ObjectAccessType is defined as follows:

  typedef enum ObjectAccessType
  {
      OAT_POST_CREATE,	/* Post creation fixups; such as security labeling */
  } ObjectAccessType;

We will support a more complete set of access types in a future
version; right now we are focusing on default labeling rather than DDL
permissions, so only OAT_POST_CREATE is defined here. Later, perhaps,
we will add OAT_ALTER, OAT_DROP, OAT_COMMENT and so on.

In this patch, I put hooks in the places just after creation of the
database objects to which we can assign security labels (schema,
relation, attribute, procedure, language, type, large object).

However, I didn't touch or move CommandCounterIncrement() yet,
although we had a long discussion about MVCC visibility of new
objects, because I'm not sure whether it is really preferable to
inject CCIs at random points such as TypeCreate() or
ProcedureCreate(), as in the version of the patch I killed myself (in
other words, it was simply ugly...). At least we can see the new
entries with SnapshotSelf, although we will pay a performance penalty.
If so, it is an idea not to touch anything related to CCIs: the
purpose of the post-creation hooks is assignment of default security
labels, not DDL permissions, so it is not a bad idea to leave the
CCI-related routines untouched in this earlier version of the external
security provider.

In this patch, we put InvokeObjectAccessHook0 in the following
functions:

- heap_create_with_catalog() for relations/attributes
- ATExecAddColumn() for attributes
- NamespaceCreate() for schemas
- ProcedureCreate() for aggregates/functions
- TypeCreate() and TypeShellMake() for types
- create_proc_lang() for procedural languages
- inv_create() for large objects

Thanks,
-- 
KaiGai Kohei <kai...@ak.jp.nec.com>

*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
***************
*** 61,66 ****
--- 61,67 ----
  #include "storage/freespace.h"
  #include "storage/smgr.h"
  #include "utils/acl.h"
+ #include "utils/esp.h"
  #include "utils/builtins.h"
  #include "utils/fmgroids.h"
  #include "utils/inval.h"
*************** heap_create_with_catalog(const char *relname,
*** 1189,1194 ****
--- 1190,1202 ----
  	}
  
  	/*
+ 	 * If installed, ESP can assign initial properties (such as security
+ 	 * labels) of the relation.
+ 	 */
+ 	InvokeObjectAccessHook0(OAT_POST_CREATE,
+ 							RelationRelationId, relid, 0);
+ 
+ 	/*
  	 * Store any supplied constraints and defaults.
  	 *
  	 * NB: this may do a CommandCounterIncrement and rebuild the relcache

*** a/src/backend/catalog/pg_namespace.c
--- b/src/backend/catalog/pg_namespace.c
***************
*** 19,24 ****
--- 19,25 ----
  #include "catalog/indexing.h"
  #include "catalog/pg_namespace.h"
  #include "utils/builtins.h"
+ #include "utils/esp.h"
  #include "utils/rel.h"
  #include "utils/syscache.h"
*************** NamespaceCreate(const char *nspName, Oid ownerId)
*** 75,79 ****
--- 76,87 ----
  	/* Record dependency on owner */
  	recordDependencyOnOwner(NamespaceRelationId, nspoid, ownerId);
  
+ 	/*
+ 	 * If installed, ESP can assign initial properties (such as security
+ 	 * labels) of the new namespace.
+ 	 */
+ 	InvokeObjectAccessHook0(OAT_POST_CREATE,
+ 							NamespaceRelationId, nspoid, 0);
+ 
  	return nspoid;
  }

*** a/src/backend/catalog/pg_proc.c
--- b/src/backend/catalog/pg_proc.c
***************
*** 32,37 ****
--- 32,38 ----
  #include "tcop/pquery.h"
  #include "tcop/tcopprot.h"
  #include "utils/acl.h"
+ #include "utils/esp.h"
  #include "utils/builtins.h"
  #include "utils/lsyscache.h"
  #include "utils/syscache.h"
*************** ProcedureCreate(const char *procedureName,
*** 614,619 ****
--- 615,627 ----
  								  nnewmembers, newmembers);
  	}
  
+ 	/*
+ 	 * If installed, ESP can assign initial properties (such as security
+ 	 * labels) of the new function.
+ 	 */
+ 	InvokeObjectAccessHook0(OAT_POST_CREATE,
+ 							ProcedureRelationId, retval, 0);
+ 
  	heap_freetuple(tup);
  
  	heap_close(rel, RowExclusiveLock);

*** a/src/backend/catalog/pg_type.c
--- b/src/backend/catalog/pg_type.c
***************
*** 26,31 ****
--- 26,32 ----
  #include "miscadmin.h"
  #include "parser/scansup.h"
  #include "utils/acl.h"
+ #include "utils/esp.h"
  #include "utils/builtins.h"
  #include "utils/fmgroids.h"
  #include "utils/lsyscache.h"
*************** TypeShellMake(const char *typeName, Oid typeNamespace, Oid ownerId)
*** 156,161 ****
--- 157,169 ----
  						 false);
  
  	/*
+ 	 * If installed, ESP can assign initial properties (such as security
+ 
Re: [HACKERS] W3C Specs: Web SQL
On Mon, Nov 08, 2010 at 12:55:22PM -0300, Alvaro Herrera wrote:
> Excerpts from Charles Pritchard's message of sáb nov 06 23:20:13 -0300 2010:
>> Simple async sql sub-set (the spec in trouble):
>> http://dev.w3.org/html5/webdatabase/
>
> This is insane. This spec allows the server to run arbitrary SQL
> commands on the client, AFAICT. That seems like infinite joy for
> malicious people running webservers. The more powerful the dialect
> of SQL the client implements, the more dangerous it is.

How is this different from the server asking the client to run an
infinite loop in javascript?

-- 
  Sam  http://samason.me.uk/
Re: [HACKERS] Avoid memory leaks during ANALYZE's compute_index_stats() ?
Hi Tom,

thanks for the brilliant analysis - now we know how to avoid the
problem.

As a side note: from the user's point of view it would be really nice
to know that the error was caused by auto-ANALYZE - at least on 8.2
it's not at all obvious from the server log. It was the first message
with the given backend PID, so it looked to me like a problem during
backend startup - and we have log_connections set to on...

Thanks,

Kuba

On 9.11.2010 2:04, Tom Lane wrote:
> I looked into the out-of-memory problem reported by Jakub Ouhrabka
> here:
> http://archives.postgresql.org/pgsql-general/2010-11/msg00353.php
>
> It's pretty simple to reproduce, even in HEAD; what you need is an
> index expression that computes a bulky intermediate result. His
> example is
>
>     md5(array_to_string(f1, ''::text))
>
> where f1 is a bytea array occupying typically 15kB per row. Even
> though the final result of md5() is only 32 bytes, evaluation of
> this expression will eat about 15kB for the detoasted value of f1,
> roughly double that for the results of the per-element output
> function calls done inside array_to_string, and another 30kB for the
> final result string of array_to_string. And *none of that gets
> freed* until compute_index_stats() is all done. In my testing, with
> the default stats target of 100, this gets repeated for 30k sample
> rows, requiring something in excess of 2GB in transient space. Jakub
> was using stats target 500, so it'd be closer to 10GB for him.
>
> AFAICS the only practical fix for this is to have the inner loop of
> compute_index_stats() copy each index expression value out of the
> per-tuple memory context and into the per-index "Analyze Index"
> context. That would allow it to reset the per-tuple memory context
> after each FormIndexDatum call and thus clean up whatever
> intermediate result trash the evaluation left behind.
>
> The extra copying is a bit annoying, since it would add cycles while
> accomplishing nothing useful for index expressions with no
> intermediate results, but I'm thinking this is a must-fix.
>
> Comments?
>
> 			regards, tom lane
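A minimal reproduction along the lines Tom describes might look like
this (table and index names are made up; it assumes a release of that
era, where array_to_string() was still marked immutable and therefore
usable in an index expression):

  CREATE TABLE leaky (f1 bytea[]);
  CREATE INDEX leaky_md5_idx ON leaky (md5(array_to_string(f1, ''::text)));

  -- Populate with rows whose f1 arrays are ~15kB each, then:
  ANALYZE leaky;
  -- Each sampled row detoasts f1 and builds large intermediate
  -- strings that are not freed until compute_index_stats() finishes.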
Re: [HACKERS] W3C Specs: Web SQL
On 11/09/2010 09:59 AM, Alvaro Herrera wrote:
> And this makes me think that SQLite is indeed the right tool for the
> job here, and not PostgreSQL. If someone intrudes, it's going to be
> in the same process running the web browser, not in some server
> running under another user identity in the machine. That seems like
> a feature to me, not a bug.

Right. Then it has some chance of being run in a sandbox. This doesn't
strike me at all as a good fit for Postgres.

cheers

andrew
Re: [HACKERS] improved parallel make support
On Sat, Nov 6, 2010 at 4:35 AM, Peter Eisentraut <pete...@gmx.net> wrote:
> On ons, 2010-11-03 at 16:34 +0200, Peter Eisentraut wrote:
>> On tis, 2010-11-02 at 10:21 -0400, Tom Lane wrote:
>>> Do we have a handle on how many buildfarm members this will break?
>>
>> I suppose we don't. One way to find out would be to commit just this
>> bit
>>
>> +# We need the $(eval) function, which is available in GNU make 3.80.
>> +# That also happens to be the version where the .VARIABLES variable
>> +# was introduced, so this is a simple check.
>> +ifndef .VARIABLES
>> +$(error GNU make 3.80 or newer is required)
>> +endif
>>
>> with a $(warning) instead, and let it run for a bit.
>
> So far, two machines have reported an older make version:
>
>   dawn_bat
>   narwhal
>
> both of the mingw type. Andrew, Dave, could you see about upgrading
> the GNU make installation there?

Narwhal should be OK now.

/D
Re: [HACKERS] CLUSTER can change t_len
Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
> On 09.11.2010 11:11, Itagaki Takahiro wrote:
>> We have a comment /* be conservative */ in the function, but I'm not
>> sure we actually need the MAXALIGN. On the other hand, there would
>> be almost no benefit to keeping t_len small, because we usually
>> treat memory in MAXALIGN units anyway.
>
> Hmm, the conservatism at that point affects the free space
> calculations. I'm not sure if it makes any difference in practice,
> but I'm also not sure it doesn't. pd_upper is always MAXALIGNed, but
> pd_lower is not.

I tend to agree with Jeff's original point that the behavior should
match regular tuple insertion exactly. This isn't about saving space,
because it won't; it's about not confusing readers by doing the same
thing in randomly different ways.

I will also note that the regular path is FAR better tested than
raw_heap_insert. If there are any bugs here, it's 1000:1 they're in
raw_heap_insert not the regular path.

			regards, tom lane
Re: [HACKERS] Hot Standby b-tree delete records review
On Tue, 2010-11-09 at 13:34 +0200, Heikki Linnakangas wrote:
> (cleaning up my inbox, and bumped into this..)
>
> On 22.04.2010 12:31, Simon Riggs wrote:
>> On Thu, 2010-04-22 at 12:18 +0300, Heikki Linnakangas wrote:
>>> Simon Riggs wrote:
>>>> On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:
>>>>> If none of the removed heap tuples were present anymore, we
>>>>> currently return InvalidTransactionId, which kills/waits out all
>>>>> read-only queries. But if none of the tuples were present
>>>>> anymore, the read-only queries wouldn't have seen them anyway, so
>>>>> ISTM that we should treat an InvalidTransactionId return value as
>>>>> "we don't need to kill anyone".
>>>>
>>>> That's not the point. The tuples were not themselves the sole
>>>> focus,
>>>
>>> Yes, they were. We're replaying a b-tree deletion record, which
>>> removes pointers to some heap tuples, making them unreachable to
>>> any read-only queries. If any of them still need to be visible to
>>> read-only queries, we have a conflict. But if all of the heap
>>> tuples are gone already, removing the index pointers to them can't
>>> change the situation for any query. If any of them should've been
>>> visible to a query, the damage was done already by whoever pruned
>>> the heap tuples, leaving just the tombstone LP_DEAD item pointers
>>> (in the heap) behind.
>>
>> You're missing my point. Those tuples are indicators of what may lie
>> elsewhere in the database, completely unreferenced by this WAL
>> record. Just because these referenced tuples are gone doesn't imply
>> that all tuple versions written by the as-yet-unknown xids are also
>> gone. We can't infer anything about the whole database just from one
>> small group of records.
>>
>>> Have you got an example of that?
>>
>> I don't need one, I have suggested the safe route. In order to infer
>> anything, and thereby further optimise things, we would need proof
>> that no cases can exist, which I don't have. Perhaps we can add
>> "yet", not sure about that either.
>
> It's good to be safe rather than sorry, but I'd still like to know,
> because I'm quite surprised by that, and it got me worried that I
> don't understand how hot standby works as well as I thought I did. I
> thought the point of stopping replay/killing queries at a b-tree
> deletion record is precisely that it makes some heap tuples invisible
> to running read-only queries. If it doesn't make any tuples
> invisible, why do any queries need to be killed? And why was it OK
> for them to be running just before replaying the b-tree deletion
> record?
>
>> I'm sorry but I'm too busy to talk further on this today. Since we
>> are discussing a further optimisation rather than a bug, I hope it
>> is OK to come back to this again later.
>
> Would now be a good time to revisit this? I still don't see why a
> b-tree deletion record should conflict with anything, if all the
> removed index tuples point to just LP_DEAD tombstones in the heap.

I want what you say to be true. The question is: is it? We just need
to explain why that will never be a problem.

-- 
 Simon Riggs           http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Mon, Nov 8, 2010 at 5:59 PM, Aidan Van Dyk <ai...@highrise.ca> wrote:
> The problem that putting checksums in a different place solves is the
> page layout (binary upgrade) problem. You're still going to need to
> buffer the page as you calculate the checksum and write it out.
> Buffering that page is absolutely necessary no matter where you put
> the checksum, unless you've got an exclusive lock that blocks even
> hint updates on the page.

But buffering the page only means you've got some consistent view of
the page. It doesn't mean the checksum will actually match the data
in the page that gets written out. So when you read it back in, the
checksum may be invalid.

I wonder if we could get by by having some global counter on the page
which you increment when you set a hint bit. That way when you read
the page back in, you could compare the counter on the page with the
counter for the checksum, and if the checksum's counter is behind,
ignore the checksum. It would be nice to do better, but I'm not sure
we can.

> But if we can start using forks to put other data, that means that
> keeping the page layouts is easier, and thus binary upgrades are
> much more feasible.

The difficulty with the page layout didn't come from the checksum
itself. We can add 4 or 8 bytes to the page header easily enough. The
difficulty came from trying to move the hint bits for all the tuples
to a dedicated area. That means three resizable areas, so either one
of them would have to be relocatable or some other solution found
(like not checksumming the line pointers and putting the hint bits in
the line pointers). If we're willing to have invalid checksums
whenever the hint bits get set, then this wouldn't be necessary.

-- 
greg
Re: [HACKERS] W3C Specs: Web SQL
Alvaro Herrera wrote:
> Excerpts from Charles Pritchard's message:
>> I don't believe the webmaster is granted free rein: Disk quotas are
>> enforced, data is separated per origin, hanging processes are up to
>> the implementer, and postgres has plenty of settings for that.
>
> The day a privilege escalation is found and some webserver runs
> pg_read_file() on your browser will be a sad one indeed.

Personally, I feel somewhat safer trusting PostgreSQL on this than
JavaScript, Java applets, a Flash plug-in, and cookies -- all of which
are enabled in my browser. Sure, I occasionally hit an ill-behaved
page and need to xkill my browser. I don't visit that site again. And
it really doesn't happen to me very often.

Can you make a case that this proposal is more dangerous than having
all of the above enabled?

-Kevin
Re: [HACKERS] security hooks on object creation
(2010/11/09 20:34), Itagaki Takahiro wrote:
> 2010/11/9 KaiGai Kohei <kai...@ak.jp.nec.com>:
>> The attached patch provides plugin modules a hook just after object
>> creation time. In typical use cases, it enables the external
>> security providers to assign default security labels on object
>> creation.
>
> It looks like the DDL triggers of other database products. Do we need
> to consider both security hooks and DDL triggers now? Or is it enough
> to design DDL triggers after the hooks are merged? Low-level hooks
> might be better for security providers, because SQL-level triggers
> could be uninstalled by superusers.

An interesting viewpoint.

Does the DDL trigger allow us to do something on CREATE/ALTER/DROP
commands? One thing we need to pay attention to is that the CREATE
command is an exception among DDL commands, because the database
object to be modified does not exist before the actual work is done.
So I am saying that we will need both pre- and post-creation hooks in
a world with the complete feature set.

Meanwhile, I don't think we need security hooks after ALTER/DROP
commands. Thus, we would put those security hooks next to the existing
permission checks, not after the actual work of these commands. Is
that reasonable for DDL triggers too (if they have something like
BEFORE/AFTER)?

Thanks,
-- 
KaiGai Kohei <kai...@kaigai.gr.jp>
Re: [HACKERS] timestamp of the last replayed transaction
On Tue, Nov 9, 2010 at 1:05 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Nov 8, 2010 at 6:00 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Sat, Nov 6, 2010 at 9:58 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>> This looks good, but how about adding:
>>>
>>>   if (!RecoveryInProgress())
>>>       PG_RETURN_NULL();
>>>
>>> Otherwise, if we're in Hot Standby mode for a while and then enter
>>> normal running, wouldn't this still return a (stale) value?
>>
>> Yes, but isn't that (stale) value useful to check how far WAL
>> records have been replayed, *after failover*?
>
> Oh, OK. I guess that makes sense. One other question - should we say
> pg_last_xact_replay_timestamp() rather than
> pg_xact_last_replay_timestamp(), for consistency with
> pg_last_xlog_replay_location()?

Yeah, pg_last_xact_replay_timestamp was my first idea. But, to me
(with my poor English), that sounded like it returns the timestamp of
when the last transaction was replayed on the standby, rather than the
timestamp of the last replayed transaction (i.e., when the last
replayed transaction was committed on the master). So I didn't choose
that name. But if pg_last_xact_replay_timestamp is more intuitive for
most people, let's change the name.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
[HACKERS] pg_ctl init doc bug
Hi,

While executing the following command I got:

  $ pg_ctl init -D /tmp/foo -o "-N 11"
  /home/euler/pg/bin/initdb: invalid option -- N
  Try "initdb --help" for more information.
  pg_ctl: database system initialization failed

I tried "-N 11" (a postgres option) after looking at the manual, but
the -o option only accepts initdb options, so I think there is a
documentation bug. Patch is attached.

While at it, is it worth adding a pg_ctl init example?

-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
index 29bea24..0cc82cd 100644
*** a/doc/src/sgml/ref/pg_ctl-ref.sgml
--- b/doc/src/sgml/ref/pg_ctl-ref.sgml
*************** PostgreSQL documentation
*** 25,31 ****
     <arg choice="plain">init[db]</arg>
     <arg>-s</arg>
     <arg>-D <replaceable>datadir</replaceable></arg>
!    <arg>-o <replaceable>options</replaceable></arg>
    </cmdsynopsis>
  
    <cmdsynopsis>
--- 25,31 ----
     <arg choice="plain">init[db]</arg>
     <arg>-s</arg>
     <arg>-D <replaceable>datadir</replaceable></arg>
!    <arg>-o <replaceable>init-options</replaceable></arg>
    </cmdsynopsis>
  
    <cmdsynopsis>
*************** PostgreSQL documentation
*** 263,268 ****
--- 263,282 ----
     </varlistentry>
  
     <varlistentry>
+     <term><option>-o <replaceable class="parameter">init-options</replaceable></option></term>
+     <listitem>
+      <para>
+       Specifies options to be passed directly to the
+       <command>initdb</command> command.
+      </para>
+      <para>
+       The options are usually surrounded by single or double
+       quotes to ensure that they are passed through as a group.
+      </para>
+     </listitem>
+    </varlistentry>
+ 
+    <varlistentry>
      <term><option>-m <replaceable class="parameter">mode</replaceable></option></term>
      <listitem>
       <para>
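On the question of an example, a hedged suggestion of what it could
look like (the initdb options shown are just illustrative):

  $ pg_ctl init -D /tmp/foo -o "--encoding=UTF8 --no-locale"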
Re: [HACKERS] How can we tell how far behind the standby is?
On Tue, Nov 9, 2010 at 7:57 AM, Josh Berkus <j...@agliodbs.com> wrote:
>> Is the patch which I'm proposing helpful for you?
>> http://archives.postgresql.org/pgsql-hackers/2010-11/msg00167.php
>
> Depends. Is that the timestamp on the master (when it was synced), or
> the timestamp on the standby (when it was replayed)? It's only
> helpful if it's the former.

It's the former. The function which I'm proposing returns the
timestamp of the last replayed commit/abort log record. That timestamp
is assigned when the commit/abort log record is generated on the
master.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
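Assuming the function proposed upthread, the "how far behind" question
could then be answered on the standby with a one-liner (a sketch; note
that it trusts the master's commit timestamps, so clock skew between
the two machines skews the result):

  -- Run on the standby.
  SELECT now() - pg_last_xact_replay_timestamp() AS replication_delay;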
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 8:45 AM, Greg Stark <gsst...@mit.edu> wrote:
> But buffering the page only means you've got some consistent view of
> the page. It doesn't mean the checksum will actually match the data
> in the page that gets written out. So when you read it back in, the
> checksum may be invalid.

I was assuming that if the code went through the trouble of buffering
the shared page to get a stable, non-changing copy to use for
checksumming/writing, it would write() the buffered copy it just made,
not the original in shared memory... I'm not sure how that write could
be inconsistent.

a.

-- 
Aidan Van Dyk                       Create like a god,
ai...@highrise.ca                   command like a king,
http://www.highrise.ca/             work like a slave.
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 2:28 PM, Aidan Van Dyk <ai...@highrise.ca> wrote:
> On Tue, Nov 9, 2010 at 8:45 AM, Greg Stark <gsst...@mit.edu> wrote:
>> But buffering the page only means you've got some consistent view of
>> the page. It doesn't mean the checksum will actually match the data
>> in the page that gets written out. So when you read it back in, the
>> checksum may be invalid.
>
> I was assuming that if the code went through the trouble of buffering
> the shared page to get a stable, non-changing copy to use for
> checksumming/writing, it would write() the buffered copy it just
> made, not the original in shared memory... I'm not sure how that
> write could be inconsistent.

Oh, I'm mistaken. The problem was that buffering the writes was
insufficient to deal with torn pages. Even if you buffer the writes,
if the machine crashes while having written only half the buffer out,
then the checksum won't match. If the only changes on the page were
hint bit updates, then there will be no full-page write in the WAL log
to repair the block.

It's possible that *that* situation is rare enough to let the checksum
raise a warning but not an error.

But personally I'm pretty loath to buffer every page write. The state
of the art is zero-copy processing, and we should be looking to reduce
copies rather than increase them. Though I suppose if we did a
zero-copy CRC, that might actually get us this buffered write for
free.
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 3:25 PM, Greg Stark <gsst...@mit.edu> wrote:
> Oh, I'm mistaken. The problem was that buffering the writes was
> insufficient to deal with torn pages. Even if you buffer the writes,
> if the machine crashes while having written only half the buffer out,
> then the checksum won't match. If the only changes on the page were
> hint bit updates, then there will be no full-page write in the WAL
> log to repair the block.

Huh, this implies that if we did go through all the work of
segregating the hint bits, and could arrange that they all appear on
the same 512-byte sector, and buffered them so that we were writing
the same bits we checksummed, then we actually *could* include them in
the CRC after all, since even a torn page will almost certainly not
tear an individual sector.

-- 
greg
Re: [HACKERS] CLUSTER can change t_len
On 09.11.2010 15:57, Greg Stark wrote:
> On Tue, Nov 9, 2010 at 10:20 AM, Heikki Linnakangas
> <heikki.linnakan...@enterprisedb.com> wrote:
>>> We have a comment /* be conservative */ in the function, but I'm
>>> not sure we actually need the MAXALIGN. On the other hand, there
>>> would be almost no benefit to keeping t_len small, because we
>>> usually treat memory in MAXALIGN units anyway.
>>
>> Hmm, the conservatism at that point affects the free space
>> calculations. I'm not sure if it makes any difference in practice,
>> but I'm also not sure it doesn't. pd_upper is always MAXALIGNed, but
>> pd_lower is not.
>>
>> This would be more in line with what the main heap_insert code does:
>
> Doesn't this cause assertion failures in heap_fill_tuple when the
> data size isn't what's expected? I guess we never actually use the
> t_len for later tuple reconstructions, we just recompute the needed
> size?

Right, the length from t_len or the item pointer is never passed to
heap_fill_tuple.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] CLUSTER can change t_len
On Tue, Nov 9, 2010 at 10:20 AM, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com> wrote:
>> We have a comment /* be conservative */ in the function, but I'm not
>> sure we actually need the MAXALIGN. On the other hand, there would
>> be almost no benefit to keeping t_len small, because we usually
>> treat memory in MAXALIGN units anyway.
>
> Hmm, the conservatism at that point affects the free space
> calculations. I'm not sure if it makes any difference in practice,
> but I'm also not sure it doesn't. pd_upper is always MAXALIGNed, but
> pd_lower is not.
>
> This would be more in line with what the main heap_insert code does:

Doesn't this cause assertion failures in heap_fill_tuple when the data
size isn't what's expected? I guess we never actually use the t_len
for later tuple reconstructions, we just recompute the needed size?

-- 
greg
[HACKERS] TODO Alter Table Rename Constraint
Hi Everybody!

I looked up this TODO item and worked out a plan for how the
implementation could be written. The main challenge is to keep
constraints and indexes (for unique and PK constraints) consistent.
Fortunately, this is already handled in the index case: for ALTER
INDEX ... RENAME, consistency is ensured by an extra check in
tablecmds.c (line 2246), which ends up calling RenameConstraintById()
(pg_constraint.c, line 604).

My idea is to do the analogous thing for renaming a constraint. The
rename of the constraint itself would be done in tablecmds.c, as for
indexes, tables, sequences, and views. Then, after checking whether
the rename type is a constraint, an extra rename has to be done for
the index. The index can be obtained with the function
get_constraint_index() (pg_depend.c, line 475). From there it should
be possible to do the same as in RenameConstraintById().

Is that approach legitimate? Is anything else to be considered?

I appreciate any suggestions, and maybe some help too, as I'm not so
familiar with the source.

Thanks in advance,
Viktor
(student at the Technical University of Vienna)
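For concreteness, a sketch of what the user-visible behavior being
proposed might look like (this syntax is not implemented; the names
are illustrative):

  CREATE TABLE t (id integer CONSTRAINT t_id_key UNIQUE);

  -- Proposed: rename the constraint, and rename the backing index
  -- t_id_key as well, so pg_constraint and pg_class stay consistent.
  ALTER TABLE t RENAME CONSTRAINT t_id_key TO t_id_key_new;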
Re: [HACKERS] CLUSTER can change t_len
On 09.11.2010 17:14, Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
>> On 09.11.2010 11:11, Itagaki Takahiro wrote:
>>> We have a comment /* be conservative */ in the function, but I'm
>>> not sure we actually need the MAXALIGN. On the other hand, there
>>> would be almost no benefit to keeping t_len small, because we
>>> usually treat memory in MAXALIGN units anyway.
>>
>> Hmm, the conservatism at that point affects the free space
>> calculations. I'm not sure if it makes any difference in practice,
>> but I'm also not sure it doesn't. pd_upper is always MAXALIGNed, but
>> pd_lower is not.
>
> I tend to agree with Jeff's original point that the behavior should
> match regular tuple insertion exactly. This isn't about saving space,
> because it won't; it's about not confusing readers by doing the same
> thing in randomly different ways.
>
> I will also note that the regular path is FAR better tested than
> raw_heap_insert. If there are any bugs here, it's 1000:1 they're in
> raw_heap_insert not the regular path.

Agreed. I've committed my patch to make it behave like heap_insert.
Thank you, Itagaki, for the easy test case using pageinspect.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Nov 9, 2010, at 9:27 AM, Greg Stark wrote:
> On Tue, Nov 9, 2010 at 3:25 PM, Greg Stark <gsst...@mit.edu> wrote:
>> Oh, I'm mistaken. The problem was that buffering the writes was
>> insufficient to deal with torn pages. Even if you buffer the writes,
>> if the machine crashes while having written only half the buffer
>> out, then the checksum won't match. If the only changes on the page
>> were hint bit updates, then there will be no full-page write in the
>> WAL log to repair the block.
>
> Huh, this implies that if we did go through all the work of
> segregating the hint bits, and could arrange that they all appear on
> the same 512-byte sector, and buffered them so that we were writing
> the same bits we checksummed, then we actually *could* include them
> in the CRC after all, since even a torn page will almost certainly
> not tear an individual sector.

If there's a torn page then we've crashed, which means we go through
crash recovery, which puts a valid page (with a valid CRC) back in
place from the WAL. What am I missing?

BTW, I agree that at minimum we need to leave the option of only
raising a warning when we hit a checksum failure. Some people might
want Postgres to treat it as an error by default, but most folks will
at least want the option to look at their (corrupt) data.

-- 
Jim C. Nasby, Database Architect                j...@nasby.net
512.569.9461 (cell)                             http://jim.nasby.net
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 12:32 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> There are also crosschecks that you can apply: if it's a heap page,
> are there any index pages with pointers to it? If it's an index page,
> are there downlink or sibling links to it from elsewhere in the
> index? A page that Postgres left as zeroes would not have any
> references to it. IMO there are a lot of methods that can separate
> filesystem misfeasance from Postgres errors, probably with greater
> reliability than this hack. I would also suggest that you don't
> really need to prove conclusively that any particular instance is one
> or the other --- a pattern across multiple instances will tell you
> what you want to know.

Doing this postmortem on a regular deployment and fixing the problem
would not be too difficult. But this platform, which Postgres is a
part of, would be mostly left unattended once deployed (pardon me for
not sharing the details, as I am not sure if I can). An external HA
component is supposed to detect any problems (by querying Postgres or
by external means) and take evasive action. It is this automation of
problem detection that we are seeking.

As Greg pointed out, even with this hack in place we might still get
zero pages from the FS (say, when ext3 does metadata journaling but
not block journaling). In that case we'd rely on recovery's WAL replay
of relation extension to reintroduce the magic number in pages.

> What's more, if I did believe that this was a safe and reliable
> technique, I'd be unhappy about the opportunity cost of reserving it
> for zero-page testing rather than other purposes.

This is one of those times when you are a bit too terse for me. What
does zero-page testing imply that this hack wouldn't?

Regards,
-- 
gurjeet.singh @ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com
singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet
[HACKERS] DROP TABLESPACE needs crash-resistance
We are facing a problem in dropping a tablespace after crash recovery.
The recovery starts from the last checkpoint, but the tables that were
created in a tablespace before the checkpoint, by a transaction that
had not finished by the time of the crash, are still lying around.
After recovery, when the app tries to drop the tablespace, the command
fails because the tablespace directory is not empty.

Solving this problem has become quite critical, since the platform
where Postgres is being used is supposed to run unattended. The
problem is currently being worked around by an application-specific
kluge, which is highly undesirable, as the kluge might not work as the
application evolves.

Has this problem been reported/discussed earlier? Any suggestions for
avoiding this situation?

I have a hackish idea: list the files created by yet-to-be-committed
transactions after every checkpoint, so that the recovery code can
remember to remove such files if the creating transaction's commit
record is not encountered by the end of recovery. But this would
require every smgrcreate() to be communicated to the BGWriter, and we
would somehow have to make the BGWriter forget this list when the
transaction commits.

Regards,
-- 
gurjeet.singh @ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com
singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet
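As a diagnostic sketch (superuser only; the OIDs and the
version-specific path component below are hypothetical and must be
adjusted per installation), files left behind in a tablespace
directory that no longer have a pg_class entry in the current database
could be listed like this:

  -- Layout on 9.0:
  --   pg_tblspc/<tablespace oid>/<version dir>/<database oid>/<relfilenode>
  SELECT fname
    FROM pg_ls_dir('pg_tblspc/16385/PG_9.0_201008051/12345') AS fname
   WHERE fname ~ '^[0-9]+$'   -- skip fork/segment files like 12345_fsm
     AND fname::oid NOT IN (SELECT relfilenode FROM pg_class);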
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 4:26 PM, Jim Nasby <j...@nasby.net> wrote:
>> On Tue, Nov 9, 2010 at 3:25 PM, Greg Stark <gsst...@mit.edu> wrote:
>>> Oh, I'm mistaken. The problem was that buffering the writes was
>>> insufficient to deal with torn pages. Even if you buffer the
>>> writes, if the machine crashes while having written only half the
>>> buffer out, then the checksum won't match. If the only changes on
>>> the page were hint bit updates, then there will be no full-page
>>> write in the WAL log to repair the block.
>
> If there's a torn page then we've crashed, which means we go through
> crash recovery, which puts a valid page (with a valid CRC) back in
> place from the WAL. What am I missing?

If the only changes on the page were hint bit updates, then there will
be no full-page write in the WAL to repair the block.

-- 
greg
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 11:26 AM, Jim Nasby <j...@nasby.net> wrote:
>> Huh, this implies that if we did go through all the work of
>> segregating the hint bits, and could arrange that they all appear on
>> the same 512-byte sector, and buffered them so that we were writing
>> the same bits we checksummed, then we actually *could* include them
>> in the CRC after all, since even a torn page will almost certainly
>> not tear an individual sector.
>
> If there's a torn page then we've crashed, which means we go through
> crash recovery, which puts a valid page (with a valid CRC) back in
> place from the WAL. What am I missing?

The problem case is where hint bits have been set. Hint bits have
always been "we don't really care", but we write them anyway. A torn
page on a hint-bit-only write is OK, because with a torn page
(assuming you don't get zeroed pages) you get old or new chunks of the
complete 8K buffer, but they are identical except for the hint bits,
and either the old or the new state is sufficient.

But with a checksum, a torn page with only hint-bit updates now gets
noticed. Before, it might have happened, but we wouldn't have noticed
or cared.

So, to get checksums, we have to give up a few things:

1) Zero-copy writes: we need to buffer the write to get a consistent
   checksum (or lock the buffer tight).

2) Saving hint bits on an otherwise unchanged page: we either need to
   just not write such a page, and lose the work the hint bits did, or
   do a full-page WAL write of it, so that a torn page's checksum can
   be fixed.

Both of these are performance tradeoffs. How badly do we want to
verify on read that a page is *exactly* what we thought we wrote?

a.

-- 
Aidan Van Dyk                       Create like a god,
ai...@highrise.ca                   command like a king,
http://www.highrise.ca/             work like a slave.
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
On Nov 9, 2010, at 1:38 AM, Dmitriy Igrishin wrote:

>> * text[] = record_to_array(record)
>> * table(id, key, datatype, value) = record_to_table(record)
>> * text = record_get_field(record, text)
>> * record = record_set_field(record, text, anyelement) ??
>
> I personally like it. But I propose to add as well:
>
>   integer := record_nkeys();
>   text    := record_get_field(record, integer);

You realize you can pretty much do all this with hstore, right?

  hstore hash := hstore(record);
  ary text[]  := hstore_to_matrix(hash);
  select * from each(hash); -- TABLE
  text foo    := hash -> somekey;
  hash        := hash || 'key,value'::hstore;
  record      := populate_record(record, hash);

Best,

David
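A self-contained illustration, assuming contrib/hstore is installed:
each(hstore(row)) flattens any row into key/value text pairs, which is
essentially the record_to_table() behavior requested upthread (table
and column names here are made up):

  CREATE TABLE demo (id int, name text);
  INSERT INTO demo VALUES (1, 'x');

  -- One (key, value) pair per column: (id,1), (name,x)
  SELECT (kv).key, (kv).value
    FROM (SELECT each(hstore(d)) AS kv FROM demo d) s;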
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
Gurjeet Singh <singh.gurj...@gmail.com> writes:
> On Tue, Nov 9, 2010 at 12:32 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> IMO there are a lot of methods that can separate filesystem
>> misfeasance from Postgres errors, probably with greater reliability
>> than this hack.
>
> Doing this postmortem on a regular deployment and fixing the problem
> would not be too difficult. But this platform, which Postgres is a
> part of, would be mostly left unattended once deployed (pardon me for
> not sharing the details, as I am not sure if I can). An external HA
> component is supposed to detect any problems (by querying Postgres or
> by external means) and take evasive action. It is this automation of
> problem detection that we are seeking.

To be blunt, this argument is utter nonsense. The changes you propose
would still require manual analysis of any detected issues in order to
do anything useful about them. Once you know that there is, or isn't,
a filesystem-level error involved, what are you going to do next?
You're going to go try to debug the component you know is at fault,
that's what. And that problem is still AI-complete.

			regards, tom lane
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
Hey David,

2010/11/9 David E. Wheeler <da...@kineticode.com>:
> You realize you can pretty much do all this with hstore, right?
>
>   hstore hash := hstore(record);
>   ary text[]  := hstore_to_matrix(hash);
>   select * from each(hash); -- TABLE
>   text foo    := hash -> somekey;
>   hash        := hash || 'key,value'::hstore;
>   record      := populate_record(record, hash);

Yep, but hstore is an additional module. Although that's not a
problem.

-- 
// Dmitriy.
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
On Nov 9, 2010, at 9:18 AM, Dmitriy Igrishin wrote:
> Yep, but hstore is an additional module. Although that's not a
> problem.

Yeah, but JSON will be in core, and with luck, before long it will
have the same (or similar) capabilities.

Best,

David
Re: [HACKERS] DROP TABLESPACE needs crash-resistance
Gurjeet Singh <singh.gurj...@gmail.com> writes:
> We are facing a problem in dropping a tablespace after crash
> recovery. The recovery starts from the last checkpoint, but the
> tables that were created in a tablespace before the checkpoint, by a
> transaction that had not finished by the time of the crash, are
> still lying around. After recovery, when the app tries to drop the
> tablespace, the command fails because the tablespace directory is
> not empty.

Hmm. The reason DROP TABLESPACE fails in that case, rather than just
arbitrarily rm -rf'ing the files, is fear of deleting valuable data by
accident. I suppose we could have a mode that deletes the files
without any manual intervention, but personally I'd regard that as a
foot-gun.

> Solving this problem has become quite critical, since the platform
> where Postgres is being used is supposed to run unattended.

I'm not entirely clear as to the use-case for unattended DROP
TABLESPACE? That doesn't really seem like an operation you should need
on a routine basis.

			regards, tom lane
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk <ai...@highrise.ca> wrote:
> So, to get checksums, we have to give up a few things:
>
> 1) Zero-copy writes: we need to buffer the write to get a consistent
>    checksum (or lock the buffer tight).
>
> 2) Saving hint bits on an otherwise unchanged page: we either need to
>    just not write such a page, and lose the work the hint bits did,
>    or do a full-page WAL write of it, so that a torn page's checksum
>    can be fixed.

Actually, the consensus the last go-around on this topic was to
segregate the hint bits into a single area of the page and skip them
in the checksum. That way we don't have to do any of the above. It's
just that that's a lot of work.

-- 
greg
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
"David E. Wheeler" <da...@kineticode.com> writes:
> You realize you can pretty much do all this with hstore, right?

Yeah. Anything that involves smashing all the fields to text is not
really an advance over (a) hstore or (b) using plperl or one of the
other weakly-typed PLs.

I think there's a fairly fundamental contradiction involved here. One
of the basic design attributes of plpgsql is that it's strongly typed.
Sometimes that's a blessing, and sometimes it's not, but it's a fact.
There really isn't a good way to deal with run-time field selection
while still maintaining strong typing. I do not believe that the
answer to that problem is "so let's break strong typing". Rather, the
answer is that if that's what you need, you need to use a different
tool. There's a reason we support multiple PLs.

			regards, tom lane
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
2010/11/9 David E. Wheeler <da...@kineticode.com>:
> On Nov 9, 2010, at 1:38 AM, Dmitriy Igrishin wrote:
>> I personally like it. But I propose to add as well:
>>
>>   integer := record_nkeys();
>>   text    := record_get_field(record, integer);
>
> You realize you can pretty much do all this with hstore, right?

hstore has similar functionality, but it misses some details and adds
a lot of other functionality - it doesn't identify the type of a
field. Personally, it's not something I like - but it can be better
than nothing.

Pavel

> hstore hash := hstore(record);
> ary text[]  := hstore_to_matrix(hash);
> select * from each(hash); -- TABLE
> text foo    := hash -> somekey;
> hash        := hash || 'key,value'::hstore;
> record      := populate_record(record, hash);
>
> Best,
>
> David
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
On Nov 9, 2010, at 9:35 AM, Pavel Stehule wrote:
>> You realize you can pretty much do all this with hstore, right?
>
> hstore has similar functionality, but it misses some details and adds
> a lot of other functionality - it doesn't identify the type of a
> field. Personally, it's not something I like - but it can be better
> than nothing.

The JSON data type will give you some basic types (text, number,
boolean).

Best,

David
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
2010/11/9 Tom Lane <t...@sss.pgh.pa.us>:
> "David E. Wheeler" <da...@kineticode.com> writes:
>> You realize you can pretty much do all this with hstore, right?
>
> Yeah. Anything that involves smashing all the fields to text is not
> really an advance over (a) hstore or (b) using plperl or one of the
> other weakly-typed PLs.
>
> I think there's a fairly fundamental contradiction involved here. One
> of the basic design attributes of plpgsql is that it's strongly
> typed. Sometimes that's a blessing, and sometimes it's not, but it's
> a fact. There really isn't a good way to deal with run-time field
> selection while still maintaining strong typing. I do not believe
> that the answer to that problem is "so let's break strong typing".
> Rather, the answer is that if that's what you need, you need to use a
> different tool. There's a reason we support multiple PLs.

yes - I know these arguments well. But you have to consider that any
combination of PLs increases a project's complexity and increases the
cost of maintenance and installation. Right now it's relatively safe
to tell somebody "you need plpgsql". It's much harder to say the same
about plperl, pltcl, or plpython - I like plperl very much, but I
would use it for untrusted operations, not for some very simple and
general task.

Pavel
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
On Nov 9, 2010, at 9:34 AM, Tom Lane wrote:
> I think there's a fairly fundamental contradiction involved here. One
> of the basic design attributes of plpgsql is that it's strongly
> typed. Sometimes that's a blessing, and sometimes it's not, but it's
> a fact. There really isn't a good way to deal with run-time field
> selection while still maintaining strong typing. I do not believe
> that the answer to that problem is "so let's break strong typing".
> Rather, the answer is that if that's what you need, you need to use a
> different tool. There's a reason we support multiple PLs.

Yeah, I think that Pavel wants to get at the record values with their
types intact. Not sure how that'd work, though, really. If you know
the types of the record columns already, you can just get them. But
I'm not sure how you could introspect the column names and their
types, and then get those types out without casting, except perhaps
via EXECUTE…

Best,

David
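The EXECUTE route does work for variables of a known composite type,
at the cost Tom describes: the value comes back as text. A sketch
(assumes 8.4+ for EXECUTE ... USING; the table, function, and column
names are illustrative):

  CREATE TABLE demo (id int, name text);

  CREATE OR REPLACE FUNCTION get_field(r demo, fname text)
  RETURNS text
  LANGUAGE plpgsql AS
  $$
  DECLARE
      result text;
  BEGIN
      -- quote_ident() guards against injection; the cast to text is
      -- exactly where the strong typing is lost.
      EXECUTE 'SELECT ($1).' || quote_ident(fname) || '::text'
          INTO result
          USING r;
      RETURN result;
  END;
  $$;

  -- SELECT get_field(d, 'name') FROM demo d;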
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
On Nov 9, 2010, at 9:35 AM, Pavel Stehule wrote:
> hstore has similar functionality, but it misses some details and adds
> a lot of other functionality - it doesn't identify the type of a
> field. Personally, it's not something I like - but it can be better
> than nothing.

What are you going to do with the type once you have it?

David
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
2010/11/9 David E. Wheeler da...@kineticode.com: On Nov 9, 2010, at 9:35 AM, Pavel Stehule wrote: You realize you can pretty much do all this with hstore, right? hstore has similar functionality, but it is missing some details and adds a lot of other functionality - it doesn't identify the type of a field. Personally, it is not something I like - but it can be better than nothing. The JSON data type will give you some basic types (text, number, boolean). yes, but it's the same - a transformation via text, and it's still more or less a hack - when you want to iterate over a record, you have to transform the record (a generic and basic type) to JSON and then to text. It's not nice - and I don't see a difference between using plperl and using JSON. Pavel Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
2010/11/9 David E. Wheeler da...@kineticode.com: On Nov 9, 2010, at 9:34 AM, Tom Lane wrote: I think there's a fairly fundamental contradiction involved here. One of the basic design attributes of plpgsql is that it's strongly typed. Sometimes that's a blessing, and sometimes it's not, but it's a fact. There really isn't a good way to deal with run-time field selection while still maintaining strong typing. I do not believe that the answer to that problem is so let's break strong typing. Rather, the answer is that if that's what you need, you need to use a different tool. There's a reason we support multiple PLs. Yeah, I think that Pavel wants to get at the record values with their types intact. Not sure how that'd work though, really. If you know the type of the record columns already, you can just get them. But I'm not sure how you could introspect the column names and their types, and then get those types out without casting, except perhaps via EXECUTE… every variable in plpgsql has a known type descriptor, so that's not a problem. The problem is access to the fields in a loop - and there you need a simple trick like polymorphic parameters. Pavel Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
2010/11/9 David E. Wheeler da...@kineticode.com: On Nov 9, 2010, at 9:35 AM, Pavel Stehule wrote: hstore has similar functionality, but it is missing some details and adds a lot of other functionality - it doesn't identify the type of a field. Personally, it is not something I like - but it can be better than nothing. What are you going to do with the type once you have it? for example, you can use it for formatting, for an explicit cast, or for a different serialization format - like JSON - without knowledge of the type you can't build a correct JSON value. So you can write an application with knowledge of the type, and you don't need to detect the type from the value, which isn't robust. Pavel David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
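[Editor's note: Pavel's JSON point can be made concrete with a deliberately naive sketch - nothing here is from a proposed patch, and the escaping handles only embedded double quotes:

CREATE OR REPLACE FUNCTION json_quote(val text, typename text)
RETURNS text AS $$
BEGIN
  IF typename IN ('smallint', 'integer', 'bigint', 'numeric',
                  'real', 'double precision') THEN
    RETURN val;                                     -- numeric types are emitted bare in JSON
  ELSE
    RETURN '"' || replace(val, '"', E'\\"') || '"'; -- anything else becomes a JSON string
  END IF;
END;
$$ LANGUAGE plpgsql;

Without the type name, 42 and '42' look identical once flattened to text, and the serializer can only guess.]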
Re: [HACKERS] Hot Standby b-tree delete records review
(cleaning up my inbox, and bumped into this..) On 22.04.2010 12:31, Simon Riggs wrote: On Thu, 2010-04-22 at 12:18 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote: If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as we don't need to kill anyone. That's not the point. The tuples were not themselves the sole focus, Yes, they were. We're replaying a b-tree deletion record, which removes pointers to some heap tuples, making them unreachable to any read-only queries. If any of them still need to be visible to read-only queries, we have a conflict. But if all of the heap tuples are gone already, removing the index pointers to them can't change the situation for any query. If any of them should've been visible to a query, the damage was done already by whoever pruned the heap tuples leaving just the tombstone LP_DEAD item pointers (in the heap) behind. You're missing my point. Those tuples are indicators of what may lie elsewhere in the database, completely unreferenced by this WAL record. Just because these referenced tuples are gone doesn't imply that all tuple versions written by the as yet-unknown-xids are also gone. We can't infer anything about the whole database just from one small group of records. Have you got an example of that? I don't need one, I have suggested the safe route. In order to infer anything, and thereby further optimise things, we would need proof that no cases can exist, which I don't have. Perhaps we can add "yet", not sure about that either. It's good to be safe rather than sorry, but I'd still like to know because I'm quite surprised by that, and it got me worried that I don't understand how hot standby works as well as I thought I did. I thought the point of stopping replay/killing queries at a b-tree deletion record is precisely that it makes some heap tuples invisible to running read-only queries. If it doesn't make any tuples invisible, why do any queries need to be killed? And why was it OK for them to be running just before replaying the b-tree deletion record? I'm sorry but I'm too busy to talk further on this today. Since we are discussing a further optimisation rather than a bug, I hope it is OK to come back to this again later. Would now be a good time to revisit this? I still don't see why a b-tree deletion record should conflict with anything, if all the removed index tuples point to just LP_DEAD tombstones in the heap. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security hooks on object creation
2010/11/9 KaiGai Kohei kai...@ak.jp.nec.com: The attached patch provides plugin modules a hook just after object creation time. In typical use cases, it enables external security providers to assign default security labels at object creation. It looks like the DDL triggers of other database products. Do we need to consider both security hooks and DDL triggers now? Or is it enough to design DDL triggers after the hooks are merged? Low-level hooks might be better for security providers because SQL-level triggers could be uninstalled by superusers. -- Itagaki Takahiro -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
What are you going to do with the type once you have it? for example, you can use it for formatting, for an explicit cast, or for a different serialization format - like JSON - without knowledge of the type you can't build a correct JSON value. So you can write an application with knowledge of the type, and you don't need to detect the type from the value, which isn't robust. there is another disadvantage of accessing fields via a PL function (or native function) based on transformation to text: the typmod is lost. Pavel Pavel David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] W3C Specs: Web SQL
Excerpts from Sam Mason's message of mar nov 09 08:06:12 -0300 2010: On Mon, Nov 08, 2010 at 12:55:22PM -0300, Alvaro Herrera wrote: Excerpts from Charles Pritchard's message of sáb nov 06 23:20:13 -0300 2010: Simple async sql sub-set (the spec in trouble): http://dev.w3.org/html5/webdatabase/ This is insane. This spec allows the server to run arbitrary SQL commands on the client, AFAICT. That seems like infinite joy for malicious people running webservers. The more powerful the dialect of SQL the client implements, the more dangerous it is. How is this different from the server asking the client to run an infinite loop in javascript? So we already failed :-) It seems that being able to kill processes is seen as good enough ... well, I guess I just don't visit many malicious sites. And this makes me think that SQLite is indeed the right tool for the job here, and not PostgreSQL. If someone intrudes, it's going to be in the same process running the web browser, not in some server running under another user identity in the machine. That seems like a feature to me, not a bug. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl init doc bug
Euler Taveira de Oliveira eu...@timbira.com writes: I tried -N 11 (postgres option) after looking at the manual but the -o option only accept initdb options so I think there is a documentation bug. Patch is attached. Yeah, the init option was squeezed in later with only minimal attention to documentation. Agree with your idea of treating its -o option as taking a different kind of argument. Will apply. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How can we tell how far behind the standby is?
It's the former. The function which I'm proposing returns the timestamp of the last replayed commit/abort log record. That timestamp is given when the commit/abort log record is generated in the master. That would be *extremely* helpful for all kinds of monitoring tools. Please complete/submit this. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
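[Editor's note: for a monitoring tool the call could be as simple as this - assuming the function ends up with the name floated in the naming subthread below:

-- run on the standby; approximate replication lag
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;

The usual caveat applies: the value only advances when commit/abort records are actually replayed, so it appears to grow on an idle master.]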
Re: [HACKERS] ALTER TYPE recursion to typed tables
Here is the patch that adds [RESTRICT|CASCADE] to ALTER TYPE ... ADD/ALTER/DROP/RENAME ATTRIBUTE, so that recurses to typed tables. diff --git a/doc/src/sgml/ref/alter_type.sgml b/doc/src/sgml/ref/alter_type.sgml index 90de2e8..04395c9 100644 --- a/doc/src/sgml/ref/alter_type.sgml +++ b/doc/src/sgml/ref/alter_type.sgml @@ -26,15 +26,15 @@ PostgreSQL documentation ALTER TYPE replaceable class=PARAMETERname/replaceable replaceable class=PARAMETERaction/replaceable [, ... ] ALTER TYPE replaceable class=PARAMETERname/replaceable OWNER TO replaceable class=PARAMETERnew_owner/replaceable ALTER TYPE replaceable class=PARAMETERname/replaceable RENAME ATTRIBUTE replaceable class=PARAMETERattribute_name/replaceable TO replaceable class=PARAMETERnew_attribute_name/replaceable -ALTER TYPE replaceable class=PARAMETERname/replaceable RENAME TO replaceable class=PARAMETERnew_name/replaceable +ALTER TYPE replaceable class=PARAMETERname/replaceable RENAME TO replaceable class=PARAMETERnew_name/replaceable [ CASCADE | RESTRICT ] ALTER TYPE replaceable class=PARAMETERname/replaceable SET SCHEMA replaceable class=PARAMETERnew_schema/replaceable ALTER TYPE replaceable class=PARAMETERname/replaceable ADD replaceable class=PARAMETERnew_enum_value/replaceable [ { BEFORE | AFTER } replaceable class=PARAMETERexisting_enum_value/replaceable ] phrasewhere replaceable class=PARAMETERaction/replaceable is one of:/phrase -ADD ATTRIBUTE replaceable class=PARAMETERattribute_name/replaceable replaceable class=PARAMETERdata_type/replaceable -DROP ATTRIBUTE [ IF EXISTS ] replaceable class=PARAMETERattribute_name/replaceable -ALTER ATTRIBUTE replaceable class=PARAMETERattribute_name/replaceable [ SET DATA ] TYPE replaceable class=PARAMETERdata_type/replaceable +ADD ATTRIBUTE replaceable class=PARAMETERattribute_name/replaceable replaceable class=PARAMETERdata_type/replaceable [ CASCADE | RESTRICT ] +DROP ATTRIBUTE [ IF EXISTS ] replaceable class=PARAMETERattribute_name/replaceable [ CASCADE | RESTRICT ] +ALTER ATTRIBUTE replaceable class=PARAMETERattribute_name/replaceable [ SET DATA ] TYPE replaceable class=PARAMETERdata_type/replaceable [ CASCADE | RESTRICT ] /synopsis /refsynopsisdiv @@ -116,6 +116,26 @@ ALTER TYPE replaceable class=PARAMETERname/replaceable ADD replaceable cl /para /listitem /varlistentry + + varlistentry +termliteralCASCADE/literal/term +listitem + para + Automatically propagate the operation to typed tables of the + type being altered. + /para +/listitem + /varlistentry + + varlistentry +termliteralRESTRICT/literal/term +listitem + para + Refuse the operation if the type being altered is the type of a + typed table. This is the default. + /para +/listitem + /varlistentry /variablelist /para diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c index 794d355..0d0227d 100644 --- a/src/backend/commands/alter.c +++ b/src/backend/commands/alter.c @@ -125,11 +125,7 @@ ExecRenameStmt(RenameStmt *stmt) } case OBJECT_COLUMN: case OBJECT_ATTRIBUTE: - renameatt(relid, - stmt-subname, /* old att name */ - stmt-newname, /* new att name */ - interpretInhOption(stmt-relation-inhOpt), /* recursive? 
*/ - 0); /* expected inhcount */ + renameatt(relid, stmt); break; case OBJECT_TRIGGER: renametrig(relid, diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 6ec8a85..2b35943 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -269,8 +269,11 @@ static void ATSimpleRecursion(List **wqueue, Relation rel, AlterTableCmd *cmd, bool recurse, LOCKMODE lockmode); static void ATOneLevelRecursion(List **wqueue, Relation rel, AlterTableCmd *cmd, LOCKMODE lockmode); -static void find_typed_table_dependencies(Oid typeOid, const char *typeName); -static void ATPrepAddColumn(List **wqueue, Relation rel, bool recurse, +static void ATTypedTableRecursion(List **wqueue, Relation rel, AlterTableCmd *cmd, + LOCKMODE lockmode); +static List *find_typed_table_dependencies(Oid typeOid, const char *typeName, + DropBehavior behavior); +static void ATPrepAddColumn(List **wqueue, Relation rel, bool recurse, bool recursing, AlterTableCmd *cmd, LOCKMODE lockmode); static void ATExecAddColumn(AlteredTableInfo *tab, Relation rel, ColumnDef *colDef, bool isOid, LOCKMODE lockmode); @@ -290,7 +293,8 @@ static void ATExecSetOptions(Relation rel, const char *colName, Node *options, bool isReset, LOCKMODE lockmode); static void ATExecSetStorage(Relation rel, const char *colName, Node *newValue, LOCKMODE lockmode); -static void ATPrepDropColumn(Relation rel, bool recurse, AlterTableCmd *cmd); +static void ATPrepDropColumn(List **wqueue, Relation rel, bool recurse, bool recursing, + AlterTableCmd *cmd, LOCKMODE
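[Editor's note: for reference, the behavior the patch is after, sketched with hypothetical objects; typed tables are the 9.0 CREATE TABLE ... OF feature:

CREATE TYPE employee AS (name text, salary numeric);
CREATE TABLE employees OF employee;

ALTER TYPE employee ADD ATTRIBUTE hire_date date;
-- fails: employees is a typed table of employee, and RESTRICT is the default

ALTER TYPE employee ADD ATTRIBUTE hire_date date CASCADE;
-- propagates the new attribute to employees as a new column
]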
Re: [HACKERS] timestamp of the last replayed transaction
On 11/9/10 5:44 AM, Fujii Masao wrote: But, pg_last_xact_replay_timestamp is more intuitive for many people? If so, let's change the name. *None* of these names are intuitive. So let's just go for consistency. If you want an intuitive name, it would be: pg_replication_log_timestamp() -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 12:31 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk ai...@highrise.ca wrote: So, for getting checksums, we have to offer up a few things: 1) zero-copy writes, we need to buffer the write to get a consistent checksum (or lock the buffer tight) 2) saving hint-bits on an otherwise unchanged page. We either need to just not write that page, and lose the work the hint-bits did, or do a full-page WAL of it, so the torn-page checksum is fixed Actually the consensus the last go-around on this topic was to segregate the hint bits into a single area of the page and skip them in the checksum. That way we don't have to do any of the above. It's just that that's a lot of work. And it still allows silent data corruption, because bogusly clearing a hint bit is, at the moment, harmless, but bogusly setting one is not. I really have to wonder how other products handle this. PostgreSQL isn't the only database product that uses MVCC - not by a long shot - and the problem of detecting whether an XID is visible to the current snapshot can't be ours alone. So what do other people do about this? They either don't cache the information about whether the XID is committed in-page (in which case, are they just slower or do they have some other means of avoiding the performance hit?) or they cache it in the page (in which case, they either WAL log it or they don't checksum it). I mean, there aren't any other options, are there? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 09, 2010 at 02:05:57PM -0500, Robert Haas wrote: On Tue, Nov 9, 2010 at 12:31 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk ai...@highrise.ca wrote: So, for getting checksums, we have to offer up a few things: 1) zero-copy writes, we need to buffer the write to get a consistent checksum (or lock the buffer tight) 2) saving hint-bits on an otherwise unchanged page. We either need to just not write that page, and lose the work the hint-bits did, or do a full-page WAL of it, so the torn-page checksum is fixed Actually the consensus the last go-around on this topic was to segregate the hint bits into a single area of the page and skip them in the checksum. That way we don't have to do any of the above. It's just that that's a lot of work. And it still allows silent data corruption, because bogusly clearing a hint bit is, at the moment, harmless, but bogusly setting one is not. I really have to wonder how other products handle this. PostgreSQL isn't the only database product that uses MVCC - not by a long shot - and the problem of detecting whether an XID is visible to the current snapshot can't be ours alone. So what do other people do about this? They either don't cache the information about whether the XID is committed in-page (in which case, are they just slower or do they have some other means of avoiding the performance hit?) or they cache it in the page (in which case, they either WAL log it or they don't checksum it). I mean, there aren't any other options, are there? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company That would imply that we need to have a CRC for just the hint bit section or some type of ECC calculation that can detect bad hint bits independent of the CRC for the rest of the page. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
Excerpts from Robert Haas's message of mar nov 09 16:05:57 -0300 2010: And it still allows silent data corruption, because bogusly clearing a hint bit is, at the moment, harmless, but bogusly setting one is not. I really have to wonder how other products handle this. PostgreSQL isn't the only database product that uses MVCC - not by a long shot - and the problem of detecting whether an XID is visible to the current snapshot can't be ours alone. So what do other people do about this? They either don't cache the information about whether the XID is committed in-page (in which case, are they just slower or do they have some other means of avoiding the performance hit?) or they cache it in the page (in which case, they either WAL log it or they don't checksum it). I mean, there aren't any other options, are there? Maybe allocate enough shared memory for pg_clog buffers back to the freeze horizon, and just don't use hint bits? Maybe some intermediate solution, i.e. allocate a large bunch of pg_clog buffers, and do WAL-logged setting of hint bits only for tuples that go further back. I remember someone had a patch to set all the bits in a page that passed a threshold of some kind. Ah, no, that was for freezing tuples. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO Alter Table Rename Constraint
On Tue, Nov 9, 2010 at 10:50 AM, Viktor Valy vili0...@gmail.com wrote: Hi Everybody! I looked up this todo and figured out a plan for how the implementation could be written. The main challenge is to keep constraints and indexes (for unique and PK constraints) consistent. Fortunately this is already handled in the case of an index: at ALTER INDEX RENAME the consistency is ensured by an extra check in tablecmds.c (Line 2246), which ends up calling RenameConstraintById (pg_constraint.c Line 604). My idea is to do the analogous thing for ALTER CONSTRAINT RENAME. So renaming the constraint would be done in tablecmds.c, as for indexes, tables, sequences, and views. And after checking whether the renametype is constraint, an extra rename has to be done for the index. Getting the index can be done with the function get_constraint_index (pg_depend.c Line 475). From there it should be possible to do the same as in RenameConstraintById. Is that approach legitimate? Is there anything else to consider? I think the biggest problem is handling inherited tables properly, especially in complex inheritance hierarchies where there are multiple, separate paths from the top of the hierarchy to the bottom. See here for a couple of relevant test cases: http://archives.postgresql.org/pgsql-hackers/2010-07/msg01570.php I believe that the rename needs to fail if any table in the inheritance hierarchy rooted at the target table also inherits the constraint from someplace outside that hierarchy; or if any table in that hierarchy has a local copy of the constraint that got merged with the inherited one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
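[Editor's note: the TODO item itself, sketched with hypothetical objects - this is the proposed syntax, not something that works today:

CREATE TABLE orders (id int CONSTRAINT orders_pk PRIMARY KEY);
ALTER TABLE orders RENAME CONSTRAINT orders_pk TO orders_pkey;
-- the backing unique index would be renamed to match, the same way
-- ALTER INDEX ... RENAME TO already keeps the paired pg_constraint entry in sync
]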
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
PostgreSQL isn't the only database product that uses MVCC - not by a long shot - and the problem of detecting whether an XID is visible to the current snapshot can't be ours alone. So what do other people do about this? They either don't cache the information about whether the XID is committed in-page (in which case, are they just slower or do they have some other means of avoiding the performance hit?) or they cache it in the page (in which case, they either WAL log it or they don't checksum it). Well, most of the other MVCC-in-table DBMSes simply don't deal with large, on-disk databases. In fact, I can't think of one which does, currently; while MVCC has been popular for the New Databases, they're all focused on in-memory databases. Oracle and InnoDB use rollback segments. Might be worth asking the BDB folks. Personally, I think we're headed inevitably towards having a set of metadata bitmaps for each table, like we do currently for the FSM. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
On Tue, Nov 9, 2010 at 12:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: David E. Wheeler da...@kineticode.com writes: You realize you can pretty much do all this with hstore, right? Yeah. Anything that involves smashing all the fields to text is not really an advance over (a) hstore or (b) using plperl or one of the other weakly-typed PLs. I think there's a fairly fundamental contradiction involved here. One of the basic design attributes of plpgsql is that it's strongly typed. Sometimes that's a blessing, and sometimes it's not, but it's a fact. There really isn't a good way to deal with run-time field selection while still maintaining strong typing. I do not believe that the answer to that problem is so let's break strong typing. Rather, the answer is that if that's what you need, you need to use a different tool. There's a reason we support multiple PLs. In cases where both the field layout and the field of interest are known at plan time this is not violating the static principles of plpgsql. Suppose we decided to access fields by name via recordvar{name} or recordvar{field pos}: DECLARE r record; f foo; t text default 'id'; BEGIN some code r{'id'} = 5; -- no good, r is dynamic record f{t} = 5; -- no good, t is not immutable f{'id'} = 5; -- ok; Iterating over fields of type foo is not interesting because fields are already known to whoever is writing the function, and flatten to text cases are already covered. IOW, the above syntax is not really useful because you can just do: f.id = 5; The only exception I see is in trigger functions. If the trigger function plan is specific to the firing trigger, new and old are defined at plan time, so something like: new{TG_FIELDNAMES[1]} = 5; -- is ok (at least IMO), since TG_FIELDNAMES is immutable (at least to the plan). I don't honestly know if it's worth it -- the main case is performance (plus Pavel's secondary argument of loss of type information). Something like this would address an awful lot of gripes about trigger functions though. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE CONSTRAINT TRIGGER
Excerpts from Richard Broersma's message of vie nov 05 18:54:54 -0300 2010: On Fri, Nov 5, 2010 at 2:37 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote: Recent developments have turned it back into non-deprecated mode; it's not going anywhere, and it needs to be fully documented. From what I recall, there isn't anything in the trigger documentation or CREATE CONSTRAINT TRIGGER documentation that says the trigger function must explicitly raise an exception to create the notification that the custom constraint was violated. Would this be a good place for it? I added a sentence about this, and pushed it. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
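[Editor's note: the pattern the added sentence covers, as a hedged sketch with a hypothetical table and check; the point is that the trigger function itself must raise the error:

CREATE OR REPLACE FUNCTION check_balance() RETURNS trigger AS $$
BEGIN
  IF NEW.balance < 0 THEN
    -- a constraint trigger only reports a violation if we raise it ourselves
    RAISE EXCEPTION 'balance may not go negative (account %)', NEW.id;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER balance_check
  AFTER INSERT OR UPDATE ON accounts
  DEFERRABLE INITIALLY DEFERRED
  FOR EACH ROW EXECUTE PROCEDURE check_balance();
]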
Re: [HACKERS] proposal: plpgsql - iteration over fields of rec or row variable
Excerpts from Merlin Moncure's message of mar nov 09 16:41:32 -0300 2010: The only exception I see is in trigger functions. If the trigger function plan is specific to the firing trigger, new and old are defined at plan time, so something like: new{TG_FIELDNAMES[1]} = 5; -- is ok (at least IMO), since TG_FIELDNAMES is immutable (at least to the plan). I don't honestly know if it's worth it -- the main case is performance (plus Pavel's secondary argument of loss of type information). Something like this would address an awful lot of gripes about trigger functions though. I think the interesting bit (wrt the examples I've seen, that is) would be to be able to use the TG_ARGS array as the element specifier. Not sure if this is any different from your example. It's been some time since I've been near this though, so maybe what I wanted is now possible with USING tricks. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
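[Editor's note: one such trick available today, using hstore rather than USING - a hedged sketch that assumes the 9.0 contrib/hstore and its populate_record(); the trigger argument names the column to stamp, and the table is hypothetical:

CREATE OR REPLACE FUNCTION stamp_field() RETURNS trigger AS $$
BEGIN
  -- TG_ARGV[0] carries the column name, e.g. 'updated_at'
  NEW := populate_record(NEW, hstore(TG_ARGV[0], now()::text));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER stamp BEFORE INSERT OR UPDATE ON some_table
  FOR EACH ROW EXECUTE PROCEDURE stamp_field('updated_at');

It works, but everything passes through text on the way in, which is exactly the complaint upthread.]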
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 7:37 PM, Josh Berkus j...@agliodbs.com wrote: Well, most of the other MVCC-in-table DBMSes simply don't deal with large, on-disk databases. In fact, I can't think of one which does, currently; while MVCC has been popular for the New Databases, they're all focused on in-memory databases. Oracle and InnoDB use rollback segments. Well rollback segments are still MVCC. However Oracle's MVCC is block-based. So they only have to do the visibility check once per block, not once per row. Once they find the right block version they can process all the rows on it. Also Oracle's snapshots are just the log position. Instead of having to check whether every transaction committed or not, they just find the block version which was last modified before the log position for when their transaction started. Might be worth asking the BDB folks. Personally, I think we're headed inevitably towards having a set of metadata bitmaps for each table, like we do currently for the FSM. Well we already have a metadata bitmap for transaction visibility. It's called the clog. There's no point in having one structured differently around the table. The whole point of the hint bits is that it's in the same place as the data. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
The whole point of the hint bits is that it's in the same place as the data. Yes, but the hint bits are currently causing us trouble on several features or potential features: * page-level CRC checks * eliminating vacuum freeze for cold data * index-only access * replication * this patch * etc. At a certain point, it's worth the trouble to handle them differently because of the other features that enables or makes much easier. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] multi-platform, multi-locale regression tests
I'm looking for some ideas on how to deal with the regression tests for the per-column collation feature. These are the issues: * The feature only works on some platforms (tentatively: Linux, Windows). - Possible solution: like xml test * The locale names are platform dependent, so there would need to be different test files per locale. * The test files need to use some non-ASCII characters. So far, I have encoded the test file in UTF-8 and run the tests with make check MULTIBYTE=UTF8. * Also, the allowed collations depend on the server encoding, so any solution for the previous point that results in the server encoding of the test database being variable will make the setup of the regression test SQL file more interesting. * Of course the actual sort orders could also be different on different platforms, but that problem can likely be contained. One possible way out is not to include these tests in the main test set and instead require manual invocation. Better ideas? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
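[Editor's note: to make the issues concrete, a test file for this feature has to contain statements of roughly this shape - a hedged sketch assuming the patch's COLLATE syntax; the locale name is exactly the platform-dependent part:

CREATE TABLE collate_test (a text COLLATE "de_DE.utf8"); -- spelled de_DE.utf8 on glibc, differently on Windows
INSERT INTO collate_test VALUES ('Arm'), ('Ärmel'), ('Baum');
SELECT a FROM collate_test ORDER BY a; -- the expected order can itself vary by platform

Non-ASCII literals like these are also why the file is pinned to a UTF-8 server encoding.]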
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 8:12 PM, Josh Berkus j...@agliodbs.com wrote: The whole point of the hint bits is that it's in the same place as the data. Yes, but the hint bits are currently causing us trouble on several features or potential features: Then we might have to get rid of hint bits. But they're hint bits for a metadata file that already exists, creating another metadata file doesn't solve anything. Though incidentally all of the other items you mentioned are generic problems caused by MVCC, not hint bits. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] W3C Specs: Web SQL
On 11/8/2010 4:47 PM, Alvaro Herrera wrote: Excerpts from Charles Pritchard's message of lun nov 08 20:25:21 -0300 2010: On 11/8/2010 3:03 PM, Alvaro Herrera wrote: Excerpts from Kevin Grittner's message of lun nov 08 19:30:54 -0300 2010: David Fetterda...@fetter.org wrote: That's not proof against a DoS What client API is? This spec gives free rein into every web user's system to webmasters. If this isn't terminally dangerous, I don't know what is. DoS is more-or-less the responsibility of the host to send up alerts like: This page is hanging, do you want to continue... or otherwise automatically close hanging queries. I classify that kind of approach to security as terminally dangerous, yes. I don't believe the webmaster is granted free rein: Disk quotas are enforced, data is separated per origin, hanging processes are up to the implementer, and postgres has plenty of settings for that. The day a privilege escalation is found and some webserver runs pg_read_file() on your browser, will be a sad one indeed. The default disk quota per origin is generally 5megs; beyond that, additional user interaction is requested. So 5 megs to a.example.com, 5 megs to b.example.com, and so on? Sounds, eh, great. I don't think it's fair to assume a privilege escalation will be found: using that argument, no software should ever run on a client/server. That said, NaCl and PNaCl are under active development and I've no doubt that Postgres could be compiled by the tool set in the future. http://code.google.com/p/nativeclient/ Still, that's a diversion from the topic: Postgres can run on workstations, with an audience of browser-oriented implementations. Postgres is more stable than Sqlite for enterprise-level activity, hardened/enterprise browser distributions would choose Postgres over Sqlite for Web SQL implementations. And as for the quota issues: that's really up to the browser vendor. It's completely out of spec here. And it's how the web currently works for hundreds of millions of users: it's not introducing a security issue, as it reflects the current state of security. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] W3C Specs: Web SQL
On Tue, Nov 9, 2010 at 9:14 PM, Charles Pritchard wrote: Postgres is more stable than Sqlite for enterprise-level activity, hardened/enterprise browser distributions would choose Postgres over Sqlite for Web SQL implementations. I find that very unlikely. Web SQL is to be an upgrade from cookies as the client storage mechanism; it is not meant to store a few TB of GIS data. Implementors will choose based on much more practical concerns such as portability (SQLite is not just available for Android and iOS, it is included), filesystem layout (put the databases in one folder for each domain just like Flash lays out its offline storage), embeddability (5 processes just to start a DB), recovery speed (when a mobile browser gets pushed from RAM and later paged back in, it has to replay a 16 MB WAL file) and even just convenience (how many browsers already use SQLite for bookmark storage?). These will weigh far heavier than some perceived enterprise readiness. Jochem -- Jochem van Dieten http://jochem.vandieten.net/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 3:25 PM, Greg Stark gsst...@mit.edu wrote: Then we might have to get rid of hint bits. But they're hint bits for a metadata file that already exists, creating another metadata file doesn't solve anything. Is there any way to instrument the writes of dirty buffers from the shared memory, and see how many of the pages normally being written are not backed by WAL (hint-only updates)? Just dumping those buffers without writes would allow at least *checksums* to go through without losing all the benefits of the hint bits. I've got a hunch (with no proof) that the penalty of not writing them will be borne largely by small database installs. Large OLTP databases probably won't have pages without a WAL'ed change and hint-bits set, and large data warehouse ones will probably vacuum freeze big tables on load to avoid the huge write penalty the 1st time they scan the tables... /waving hands -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
Though incidentally all of the other items you mentioned are generic problems caused by MVCC, not hint bits. Yes, but the hint bits prevent us from implementing workarounds. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 2:05 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 9, 2010 at 12:31 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk ai...@highrise.ca wrote: So, for getting checksums, we have to offer up a few things: 1) zero-copy writes, we need to buffer the write to get a consistent checksum (or lock the buffer tight) 2) saving hint-bits on an otherwise unchanged page. We either need to just not write that page, and lose the work the hint-bits did, or do a full-page WAL of it, so the torn-page checksum is fixed Actually the consensus the last go-around on this topic was to segregate the hint bits into a single area of the page and skip them in the checksum. That way we don't have to do any of the above. It's just that that's a lot of work. And it still allows silent data corruption, because bogusly clearing a hint bit is, at the moment, harmless, but bogusly setting one is not. I really have to wonder how other products handle this. PostgreSQL isn't the only database product that uses MVCC - not by a long shot - and the problem of detecting whether an XID is visible to the current snapshot can't be ours alone. So what do other people do about this? They either don't cache the information about whether the XID is committed in-page (in which case, are they just slower or do they have some other means of avoiding the performance hit?) or they cache it in the page (in which case, they either WAL log it or they don't checksum it). I mean, there aren't any other options, are there? An examination of the MySQL source code reveals their answer. In row_vers_build_for_semi_consistent_read(), which I can't swear is the right place but seems to be, there is this comment: /* We assume that a rolled-back transaction stays in TRX_ACTIVE state until all the changes have been rolled back and the transaction is removed from the global list of transactions. */ Which makes sense. If you never leave rows from aborted transactions in the heap forever, then the list of aborted transactions that you need to remember for MVCC purposes will remain relatively small and you can just include those XIDs in your MVCC snapshot. Our problem is that we have no particular bound on the number of aborted transactions whose XIDs may still be floating around, so we can't do it that way. dons asbestos underpants To impose a similar bound in PostgreSQL, you'd need to maintain the set of aborted XIDs and the relations that need to be vacuumed for each one. As you vacuum, you prune any tuples with aborted xmins (which is WAL-logged already anyway) and additionally WAL-log clearing the xmax for each tuple with an aborted xmax. Thus, when you finish vacuuming the relation, the aborted XID is no longer present anywhere in it. When you vacuum the last relation for a particular XID, that XID no longer exists in the relation files anywhere and you can remove it from the list of aborted XIDs. I think that WAL logging the list of XIDs and list of unvacuumed relations for each at each checkpoint would be sufficient for crash safety. If you did this, you could then assume that any XID which precedes your snapshot's xmin is committed. 1. When a big abort happens, you may have to carry that XID around in every snapshot - and avoid advancing RecentGlobalXmin - for quite a long time. 2. You have to WAL log marking the XMAX of an aborted transaction invalid. 3. You have to WAL log the not-yet-cleaned-up XIDs and the relations each one needs vacuumed at each checkpoint. 4.
There would presumably be some finite limit on the size of the shared memory structure for aborted transactions. I don't think there'd be any reason to make it particularly small, but if you sat there and aborted transactions at top speed you might eventually run out of room, at which point any transactions you started wouldn't be able to abort until vacuum made enough progress to free up an entry. 5. It would be pretty much impossible to run with autovacuum turned off, and in fact you would likely need to make it a good deal more aggressive in the specific case of aborted transactions, to mitigate problems #1, #3, and #4. I'm not sure how bad those things would be, or if there are more that I'm missing (besides the obvious it would be a lot of work). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-platform, multi-locale regression tests
On Nov 9, 2010, at 12:18 PM, Peter Eisentraut wrote: One possible way out is not to include these tests in the main test set and instead require manual invocation. Better ideas? I've been talking to Nasby and Dunstan about adding a Test::More/pgTAP-based test suite to the core. It wouldn't run with the usual core suite used by developers, which would continue to use pg_regress. But they could run it if they wanted (and had the prerequisites), and the build farm animals would run them regularly. The nice thing about using a TAP-based framework is that you can skip tests that don't meet platform requirements, and compare values within the tests, right where you write them, rather than in a separate file. You can also dynamically change how you compare things depending on the environment, such as the locales that vary on different platforms. Thoughts? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
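[Editor's note: a hedged sketch of what such a test could look like - the function names are pgTAP's, while the checks themselves are throwaway stand-ins; a platform-dependent case would call pgTAP's skip() instead of failing:

BEGIN;
SELECT plan(2);
SELECT ok( EXISTS (SELECT 1 FROM pg_class), 'catalog is readable' );
SELECT is( (SELECT 2 + 2), 4, 'arithmetic works' );
SELECT * FROM finish();
ROLLBACK;
]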
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On 11/9/10 1:50 PM, Robert Haas wrote: 5. It would be pretty much impossible to run with autovacuum turned off, and in fact you would likely need to make it a good deal more aggressive in the specific case of aborted transactions, to mitigate problems #1, #3, and #4. 6. This would require us to be more aggressive about VACUUMing old, cold relations/pages, e.g. VACUUM FREEZE. It would make one of our worst issues for data warehousing even worse. What about having this map (and other hintbits) be per-relation? Hmmm. That wouldn't work for DDL, I suppose ... -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
Josh Berkus j...@agliodbs.com wrote: 6. This would require us to be more aggressive about VACUUMing old-cold relations/page, e.g. VACUUM FREEZE. This it would make one of our worst issues for data warehousing even worse. I continue to feel that it is insane that when a table is populated within the same database transaction which created it (e.g., a bulk load of a table or partition), that we don't write the tuples with hint bits set for commit and xmin frozen. By the time any but the creating transaction can see the tuples, *if* any other transaction is ever able to see the tuples, these will be the correct values; we really should be able to deal with it within the creating transaction somehow. If we ever handle that, would #6 be a moot point, or do you think it's still a significant issue? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
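[Editor's note: the pattern Kevin describes, with hypothetical names - until the transaction commits, no other transaction can see these tuples, so in principle they could be written with commit hints set and xmin frozen:

BEGIN;
CREATE TABLE sales_2010_11 (LIKE sales INCLUDING DEFAULTS);
COPY sales_2010_11 FROM '/load/sales_2010_11.csv' CSV;
COMMIT;
]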
[HACKERS] Build farm server database migration complete
It took a little longer than expected, due to a slightly clagged network between the old and new servers, but the database migration is complete and the server is back up and running. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 5:03 PM, Josh Berkus j...@agliodbs.com wrote: On 11/9/10 1:50 PM, Robert Haas wrote: 5. It would be pretty much impossible to run with autovacuum turned off, and in fact you would likely need to make it a good deal more aggressive in the specific case of aborted transactions, to mitigate problems #1, #3, and #4. 6. This would require us to be more aggressive about VACUUMing old-cold relations/page, e.g. VACUUM FREEZE. This it would make one of our worst issues for data warehousing even worse. Uh, no it doesn't. It only requires you to be more aggressive about vacuuming the transactions that are in the aborted-XIDs array. It doesn't affect transaction wraparound vacuuming at all, either positively or negatively. You still have to freeze xmins before they flip from being in the past to being in the future, but that's it. What about having this map (and other hintbits) be per-relation? Hmmm. That wouldn't work for DDL, I suppose ... This map? I suppose you could track aborted XIDs per relation instead of globally, but I don't see why that would affect DDL any differently than anything else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 5:15 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Josh Berkus j...@agliodbs.com wrote: 6. This would require us to be more aggressive about VACUUMing old-cold relations/page, e.g. VACUUM FREEZE. This it would make one of our worst issues for data warehousing even worse. I continue to feel that it is insane that when a table is populated within the same database transaction which created it (e.g., a bulk load of a table or partition), that we don't write the tuples with hint bits set for commit and xmin frozen. By the time any but the creating transaction can see the tuples, *if* any other transaction is ever able to see the tuples, these will be the correct values; we really should be able to deal with it within the creating transaction somehow. I agree. If we ever handle that, would #6 be a moot point, or do you think it's still a significant issue? I think it's a moot point anyway, per previous email. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 3:05 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Nov 9, 2010 at 7:37 PM, Josh Berkus j...@agliodbs.com wrote: Well, most of the other MVCC-in-table DBMSes simply don't deal with large, on-disk databases. In fact, I can't think of one which does, currently; while MVCC has been popular for the New Databases, they're all focused on in-memory databases. Oracle and InnoDB use rollback segments. Well rollback segments are still MVCC. However Oracle's MVCC is block-based. So they only have to do the visibility check once per block, not once per row. Once they find the right block version they can process all the rows on it. Also Oracle's snapshots are just the log position. Instead of having to check whether every transaction committed or not, they just find the block version which was last modified before the log position for when their transaction started. That is cool. One problem is that it might sometimes result in additional I/O. A transaction begins and writes a tuple. We must write a preimage of the page (or at least, sufficient information to reconstruct a preimage of the page) to the undo segment. If the transaction commits relatively quickly, and all transactions which took their snapshots before the commit end either by committing or by aborting, we can discard that information from the undo segment without ever writing it to disk. However, if that doesn't happen, the undo log page may get evicted, and we're now doing three writes (WAL, page, undo) rather than just two (WAL, page). That's no worse than an update where the old and new tuples land on different pages, but it IS worse than an update where the old and new tuples are on the same page, or at least I think it is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-platform, multi-locale regression tests
2010/11/9 David E. Wheeler da...@kineticode.com: On Nov 9, 2010, at 12:18 PM, Peter Eisentraut wrote: One possible way out is not to include these tests in the main test set and instead require manual invocation. Better ideas? I've been talking to Nasby and Dunstan about adding a Test::More/pgTAP-based test suite to the core. It wouldn't run with the usual core suite used by developers, which would continue to use pg_regress. But they could run it if they wanted (and had the prerequisites), and the build farm animals would run them regularly. The nice thing about using a TAP-based framework is that you can skip tests that don't meet platform requirements, and compare values within the tests, right where you write them, rather than in a separate file. You can also dynamically change how you compare things depending on the environment, such as the locales that vary on different platforms. Thoughts? Are you thinking of a contrib module 'pgtap' that we can use to accomplish the optional regression tests? -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-platform, multi-locale regression tests
On Nov 9, 2010, at 2:42 PM, Cédric Villemain wrote: Are you thinking of a contrib module 'pgtap' that we can use to accomplish the optionnal regression tests ? Oh, if the project wants it in contrib, sure. Otherwise I'd probably just have the test stuff include it somehow. David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
Robert, Uh, no it doesn't. It only requires you to be more aggressive about vacuuming the transactions that are in the aborted-XIDs array. It doesn't affect transaction wraparound vacuuming at all, either positively or negatively. You still have to freeze xmins before they flip from being in the past to being in the future, but that's it. Sorry, I was trying to say that it's similar to the freeze issue, not that it affects freeze. Sorry for the lack of clarity. What I was getting at is that this could cause us to vacuum relations/pages which would otherwise never be vacuumed (or at least, not until freeze). Imagine a very large DW table which is normally insert-only and seldom queried, but once a month or so the insert aborts and rolls back. I'm not saying that your proposal isn't worth testing. I'm just saying that it may prove to be a net loss to overall system efficiency. If we ever handle that, would #6 be a moot point, or do you think it's still a significant issue? Kevin, the case which your solution doesn't fix is the common one of log tables which keep adding records continuously, with 5% inserts or updates. That may seem like a corner case but such a table, partitioned or unpartitioned, exists in around 1/3 of the commercial applications I've worked on, so it's a common pattern. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
On Mon, Nov 8, 2010 at 9:52 PM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: Are there any activities in JSON data types for the next commitfest? I'm leaning toward the view that we shouldn't commit a JSON implementation to core (or contrib) for 9.1. We have at least three competing proposals on the table. I thought of picking it up and hacking on it myself, but then we'd have four competing proposals on the table. Even if we could come to some consensus on which of those proposals is technically superior, the rate at which new ideas are being proposed suggests to me that it would be premature to anoint any single implementation as our canonical one. I'd like to see some of these patches finished and put up on pgfoundry or github, and then consider moving one of them into core when we have a clear and stable consensus that one of them is technically awesome and the best thing we're going to get. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-platform, multi-locale regression tests
2010/11/9 David E. Wheeler da...@kineticode.com: On Nov 9, 2010, at 2:42 PM, Cédric Villemain wrote: Are you thinking of a contrib module 'pgtap' that we can use to accomplish the optionnal regression tests ? Oh, if the project wants it in contrib, sure. Otherwise I'd probably just have the test stuff include it somehow. Adding a unit test layer shipped with postgresql sounds good to me. And pgTAP can claim to be platform agnostic. -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
Robert, I think I agree. At a minimum, I would like to see whichever of the competing implementations is chosen live on as an outside module for use by previous versions. Even having proposed one, and soon to be two, of the competing implementations, it makes me nervous to commit to one at this juncture. I'm wrapping some items up this week but expect to have some time over the next two weeks to complete my implementation. Here's a quick status on where I'm at:
- Binary format has been implemented as specified here: https://github.com/tlaurenzo/pgjson/blob/master/pgjson/jsonlib/BINARY-README.txt
- Hand-coded a JSON-text lexer/parser and JSON-binary parser and transcoders
- Ran out of time to do exhaustive tests, but typical payloads yield significant space savings
- Based on an admittedly small number of test cases, I identified that the process of encoding a string literal is the most expensive operation in the general case, accounting for 1/2 to 2/3 of the time spent in a transcoding operation. This is fairly obvious but good to know. I did not spend any time looking into this further.
- Drastically simplified the code in preparation to build a stand-alone module
As soon as I get a bit of time I was going to do the following:
- Create a simple PGXS-based build, stripping out the rest of the bits I was doodling on
- Re-implement the PG module based on the new jsonlib binary format and parser
- Add JSONPath and some encoding bits back in from the original patch
- Do some holistic profiling between the JSON-as-Text approach and the JSON-as-Binary approach
This is still a bit of a fishing expedition, imo, and I would have a hard time getting this ready for commit on Monday. If getting something in right now is critical, Joey's original patch is the most complete at this point. Terry On Tue, Nov 9, 2010 at 3:48 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Nov 8, 2010 at 9:52 PM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: Are there any activities in JSON data types for the next commitfest? I'm leaning toward the view that we shouldn't commit a JSON implementation to core (or contrib) for 9.1. We have at least three competing proposals on the table. I thought of picking it up and hacking on it myself, but then we'd have four competing proposals on the table. Even if we could come to some consensus on which of those proposals is technically superior, the rate at which new ideas are being proposed suggests to me that it would be premature to anoint any single implementation as our canonical one. I'd like to see some of these patches finished and put up on pgfoundry or github, and then consider moving one of them into core when we have a clear and stable consensus that one of them is technically awesome and the best thing we're going to get. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
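For a sense of why string-literal encoding dominates transcoding time, consider a simplistic version of that inner loop. This is illustrative C only, not code from the pgjson patch: json_encode_string() is an invented name, and the caller is assumed to supply an output buffer sized for worst-case expansion (six bytes per input byte, plus the surrounding quotes).

#include <stddef.h>
#include <stdio.h>

/*
 * Simplistic JSON string-literal encoder, sketched to show why this step
 * is expensive: every input byte must be inspected and possibly expanded.
 * Illustrative only; not code from the patch.  \r, \b, and \f would be
 * handled analogously to \n and \t.
 */
static size_t
json_encode_string(const char *in, size_t len, char *out)
{
    char       *p = out;
    size_t      i;

    *p++ = '"';
    for (i = 0; i < len; i++)
    {
        unsigned char c = (unsigned char) in[i];

        switch (c)
        {
            case '"':
            case '\\':
                *p++ = '\\';
                *p++ = (char) c;
                break;
            case '\n':
                *p++ = '\\';
                *p++ = 'n';
                break;
            case '\t':
                *p++ = '\\';
                *p++ = 't';
                break;
            default:
                if (c < 0x20)
                    p += sprintf(p, "\\u%04x", (int) c); /* other controls */
                else
                    *p++ = (char) c;    /* includes multibyte UTF-8 bytes */
                break;
        }
    }
    *p++ = '"';
    return (size_t) (p - out);
}

Even this naive version shows the problem: the per-byte branch and the possibility of expansion defeat simple memcpy()-style fast paths.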
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
Josh Berkus j...@agliodbs.com writes: Though incidentally all of the other items you mentioned are generic problems caused by MVCC, not hint bits. Yes, but the hint bits prevent us from implementing workarounds. If we got rid of hint bits, we'd need workarounds for the ensuing massive performance loss. There is no reason whatsoever to imagine that we'd come out ahead in the end. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
Robert Haas robertmh...@gmail.com writes: dons asbestos underpants 4. There would presumably be some finite limit on the size of the shared memory structure for aborted transactions. I don't think there'd be any reason to make it particularly small, but if you sat there and aborted transactions at top speed you might eventually run out of room, at which point any transactions you started wouldn't be able to abort until vacuum made enough progress to free up an entry. Um, that bit is a *complete* nonstarter. The possibility of a failed transaction always has to be allowed. What if vacuum itself gets an error for example? Or, what if the system crashes? I thought for a bit about inverting the idea, such that there were a limit on the number of unvacuumed *successful* transactions rather than the number of failed ones. But that seems just as unforgiving: what if you really need to commit a transaction to effect some system state change? An example might be dropping some enormous table that you no longer need, but vacuum is going to insist on plowing through before it'll let you have any more transactions. I'm of the opinion that any design that presumes it can always fit all the required transaction-status data in memory is probably not even worth discussing. There always has to be a way for status data to spill to disk. What's interesting is how you can achieve enough locality of access so that most of what you need to look at is usually in memory. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 5:45 PM, Josh Berkus j...@agliodbs.com wrote: Robert, Uh, no it doesn't. It only requires you to be more aggressive about vacuuming the transactions that are in the aborted-XIDs array. It doesn't affect transaction wraparound vacuuming at all, either positively or negatively. You still have to freeze xmins before they flip from being in the past to being in the future, but that's it. Sorry, I was trying to say that it's similar to the freeze issue, not that it affects freeze. Sorry for the lack of clarity. What I was getting at is that this could cause us to vacuum relations/pages which would otherwise never be vacuumed (or at least, not until freeze). Imagine a very large DW table which is normally insert-only and seldom queried, but once a month or so the insert aborts and rolls back. Oh, I see. In that case, under the proposed scheme, you'd get an immediate vacuum of everything inserted into the table since the last failed insert. Everything prior to the last failed insert would be OK, since the visibility map bits would already be set for those pages. Yeah, that would be annoying. There's a related problem with index-only scans. A large DW table which is normally insert-only, but which IS queried regularly, won't be able to use index-only scans effectively, because without regular vacuuming the visibility map bits won't be set. We've previously discussed the possibility of having the background writer set hint bits before writing the pages, and maybe it could even set the all-visible bit and update the visibility map, too. But that won't help if the transaction inserts a large enough quantity of data that it starts spilling buffers to disk before it commits. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
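For reference, the visibility-map test that an index-only scan would hinge on looks roughly like the following. This is a sketch, not executor code: can_skip_heap_fetch() is an invented wrapper, while visibilitymap_test() is the existing API of this era (its exact signature should be checked against the tree).

#include "postgres.h"
#include "access/visibilitymap.h"
#include "storage/buf.h"
#include "storage/itemptr.h"
#include "utils/rel.h"

/*
 * Sketch: a heap fetch could be skipped only when the tuple's heap page
 * is marked all-visible in the visibility map.  On an insert-only table
 * that never gets vacuumed, those bits stay unset, so every index probe
 * must still visit the heap.  can_skip_heap_fetch() is an invented name.
 */
static bool
can_skip_heap_fetch(Relation heapRel, ItemPointer tid, Buffer *vmbuffer)
{
    BlockNumber blkno = ItemPointerGetBlockNumber(tid);

    return visibilitymap_test(heapRel, blkno, vmbuffer);
}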
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Wed, Nov 10, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: Once you know that there is, or isn't, a filesystem-level error involved, what are you going to do next? You're going to go try to debug the component you know is at fault, that's what. And that problem is still AI-complete. If we know for sure that Postgres was not at fault, then we have a standby node to fail over to, where a Postgres warm standby is maintained via streaming replication. Regards -- gurjeet.singh @ EnterpriseDB - The Enterprise Postgres Company http://www.EnterpriseDB.com singh.gurj...@{ gmail | yahoo }.com Twitter/Skype: singh_gurjeet Mail sent from my BlackLaptop device
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 6:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: dons asbestos underpants 4. There would presumably be some finite limit on the size of the shared memory structure for aborted transactions. I don't think there'd be any reason to make it particularly small, but if you sat there and aborted transactions at top speed you might eventually run out of room, at which point any transactions you started wouldn't be able to abort until vacuum made enough progress to free up an entry. Um, that bit is a *complete* nonstarter. The possibility of a failed transaction always has to be allowed. What if vacuum itself gets an error for example? Or, what if the system crashes? I wasn't proposing that it was impossible to abort, only that aborts might have to block. I admit I don't know what to do about VACUUM itself failing. A transient failure mightn't be so bad, but if you find yourself permanently unable to eradicate the XIDs left behind by an aborted transaction, you'll eventually have to shut down the database, lest the XID space wrap around. Actually, come to think of it, there's no reason you COULDN'T spill the list of aborted-but-not-yet-cleaned-up XIDs to disk. It's just that XidInMVCCSnapshot() would get really expensive after a while. I thought for a bit about inverting the idea, such that there were a limit on the number of unvacuumed *successful* transactions rather than the number of failed ones. But that seems just as unforgiving: what if you really need to commit a transaction to effect some system state change? An example might be dropping some enormous table that you no longer need, but vacuum is going to insist on plowing through before it'll let you have any more transactions. The number of relevant aborted XIDs tends naturally to decline to zero as vacuum does its thing, while the number of relevant committed XIDs tends to grow very, very large (it starts to decline only when we start freezing things), so remembering the not-yet-cleaned-up aborted XIDs seems likely to be cheaper. In fact, in many cases, the set of not-yet-cleaned-up aborted XIDs will be completely empty. I'm of the opinion that any design that presumes it can always fit all the required transaction-status data in memory is probably not even worth discussing. Well, InnoDB does it. There always has to be a way for status data to spill to disk. What's interesting is how you can achieve enough locality of access so that most of what you need to look at is usually in memory. We're not going to get any more locality of reference than we're already getting from hint bits, are we? The advantage of trying to do timely cleanup of aborted transactions is that you can assume that any XID before RecentGlobalXmin is committed, without checking CLOG and without having to update hint bits and write out the ensuing dirty pages. If we could make CLOG access cheap enough that we didn't need hint bits, that would also solve that problem, but nobody (including me) seems to think that's feasible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DROP TABLESPACE needs crash-resistance
On Wed, Nov 10, 2010 at 1:24 AM, Tom Lane t...@sss.pgh.pa.us wrote: Gurjeet Singh singh.gurj...@gmail.com writes: We are facing a problem in dropping a tablespace after crash recovery. The recovery starts from the last checkpoint, but the tables that were created by a transaction in a tablespace before the checkpoint are still lying around; the transaction had not finished by the time of the crash. After recovery, when the app tries to drop the tablespace, the command fails because the tablespace directory is not empty. Hmm. The reason DROP TABLESPACE fails in that case, rather than just arbitrarily rm -rf'ing the files, is fear of deleting valuable data by accident. I suppose we could have a mode that deletes the files without any manual intervention, but personally I'd regard that as a foot-gun. That'd be a lot of help, especially when the DBA/app knows that nothing is supposed to be left over in that tablespace; maybe querying pg_class in each of the databases would act as a cross-check. The problem is that if there are open transactions that have just created a relation in that tablespace, that record won't be visible in pg_class. Also, we don't take any kind of lock on a tablespace when opening a relation, so we cannot be sure there are no running transactions with an open relation from that tablespace (I guess this is moot if the DBA/app knows there's nothing supposed to be in the tablespace, but that doesn't hold for new connections trying to create tables there). A join between pg_locks and pg_class would have been helpful, but pg_locks doesn't hold the tablespace OID, and pg_class rows may not be visible yet. Solving this problem has become quite critical, since the platform where Postgres is being used is supposed to run unattended. I'm not entirely clear as to the use-case for unattended DROP TABLESPACE? That doesn't really seem like an operation you should need on a routine basis. For every new element to be managed, the application creates all the relevant objects in a new schema, and assigns all the objects in that schema to a new tablespace. So when that element is supposed to be removed, we need to drop the schema and the associated tablespace. Regardless, having Postgres leave its trash behind is not desirable in any scenario, so a solution that clears such files at the end of recovery would be much more desirable. Regards, -- gurjeet.singh @ EnterpriseDB - The Enterprise Postgres Company http://www.EnterpriseDB.com singh.gurj...@{ gmail | yahoo }.com Twitter/Skype: singh_gurjeet Mail sent from my BlackLaptop device
Re: [HACKERS] timestamp of the last replayed transaction
On Wed, Nov 10, 2010 at 3:28 AM, Josh Berkus j...@agliodbs.com wrote: On 11/9/10 5:44 AM, Fujii Masao wrote: But, pg_last_xact_replay_timestamp is more intuitive for many people? If so, let's change the name. *None* of these names are intuitive. So let's just go for consistency. OK. I changed the name to pg_last_xact_replay_timestamp. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center pg_last_xact_replay_timestamp_v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 9, 2010 at 7:04 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 9, 2010 at 5:45 PM, Josh Berkus j...@agliodbs.com wrote: Robert, Uh, no it doesn't. It only requires you to be more aggressive about vacuuming the transactions that are in the aborted-XIDs array. It doesn't affect transaction wraparound vacuuming at all, either positively or negatively. You still have to freeze xmins before they flip from being in the past to being in the future, but that's it. Sorry, I was trying to say that it's similar to the freeze issue, not that it affects freeze. Sorry for the lack of clarity. What I was getting at is that this could cause us to vacuum relations/pages which would otherwise never be vacuumed (or at least, not until freeze). Imagine a very large DW table which is normally insert-only and seldom queried, but once a month or so the insert aborts and rolls back. Oh, I see. In that case, under the proposed scheme, you'd get an immediate vacuum of everything inserted into the table since the last failed insert. Everything prior to the last failed insert would be OK, since the visibility map bits would already be set for those pages. Yeah, that would be annoying. Ah, but it might be fixable. You wouldn't really need to do a full-fledged vacuum. It would be sufficient to scan the heap pages that might contain the XID we're trying to clean up after, without touching the indexes. Instead of actually removing tuples with an aborted XMIN, you could just mark the line pointers LP_DEAD. Tuples with an aborted XMAX don't require touching the indexes anyway. So as long as you have some idea which segment of the relation was potentially dirtied by that transaction, you could just scan those blocks and update the item pointers and/or XMAX values for the offending tuples without doing anything else (although you'd probably want to opportunistically grab the buffer cleanup lock and defragment if possible). Unfortunately, I'm now realizing another problem. During recovery, you have to assume that any XIDs that didn't commit are aborted; under the scheme I proposed upthread, if a transaction that was in-flight at crash time had begun prior to the last checkpoint, you wouldn't know which relations it had potentially dirtied. Ouch. But I think this is fixable, too. Let's invent a new on-disk structure called the content-modified log. Transactions that want to insert, update, or delete tuples allocate pages from this structure. The header of each page stores the XID of the transaction that owns that page and the ID of the database to which that transaction is bound. Following the header, there is a series of records of the form: tablespace OID, table OID, starting page number, ending page number. Each such record indicates that the given transaction may have put its XID on disk within the given page range of the specified relation. Each checkpoint flushes the dirty pages of the content-modified log to disk along with everything else. Thus, on redo, we can reconstruct the additional entries that need to be added to the log from the contents of WAL subsequent to the redo pointer. If a transaction commits, we can remove all of its pages from the content-modified log; in fact, if a transaction begins and commits without an intervening checkpoint, the pages never need to hit the disk at all. If a transaction aborts, its content-modified log pages must stick around until we've eradicated any copies of its XID in the relation data files.
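For concreteness, one page of this hypothetical content-modified log might be laid out as follows. Every name here is invented for illustration; no such structure exists in the tree:

#include "postgres.h"
#include "storage/block.h"

/*
 * Hypothetical on-disk layout for the content-modified log described
 * above.  A page header names the owning transaction and its database;
 * the records that follow name the relation page ranges the transaction
 * may have stamped with its XID.  All names here are invented.
 */
typedef struct ModifiedContentPageHeader
{
    TransactionId xid;          /* transaction owning this page */
    Oid         dbOid;          /* database the transaction is bound to */
    uint16      nrecords;       /* number of records on this page */
} ModifiedContentPageHeader;

typedef struct ModifiedContentRecord
{
    Oid         spcOid;         /* tablespace OID */
    Oid         relOid;         /* table OID */
    BlockNumber startBlock;     /* first heap page possibly dirtied */
    BlockNumber endBlock;       /* last heap page possibly dirtied */
} ModifiedContentRecord;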
We maintain a global value for the oldest aborted XID which is not yet fully cleaned up (let's call this the OldestNotQuiteDeadYetXID). When we see an XID which precedes OldestNotQuiteDeadYetXID, we know it's committed. Otherwise, we check whether the XID precedes the xmin of our snapshot. If it does, we have to check whether the XID is committed or aborted (it must be one or the other). If it does not, we use our snapshot, as now. Checking XIDs between OldestNotQuiteDeadYetXID and our snapshot's xmin is potentially expensive, but (1) if there aren't many aborted transactions, this case shouldn't arise very often; (2) if the XID turns out to be aborted and we can get an exclusive buffer content lock, we can nuke that copy of the XID to save the next guy the trouble of examining it; and (3) we can maintain a size-limited per-backend cache of this information, which should help in the normal cases where there either aren't that many XIDs that fall into this category or our transaction doesn't see all that many of them. This also addresses Tom's concern about needing to store all the information in memory, and the need to WAL-log not-yet-cleaned-up XIDs at each checkpoint. You still need to aggressively clean up after aborted transactions, either using our current vacuum mechanism or the "just zap the XIDs" shortcut described above. (An additional interesting point about this design is that you could potentially also use it to
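The three-band XID check described in that message reduces to roughly the following sketch. OldestNotQuiteDeadYetXID and XidVisibleInSnapshot() are invented names (the latter standing in for the usual snapshot test); TransactionIdPrecedes() and TransactionIdDidCommit() are the real primitives:

#include "postgres.h"
#include "access/transam.h"
#include "utils/snapshot.h"

extern TransactionId OldestNotQuiteDeadYetXID;  /* hypothetical global */
extern bool XidVisibleInSnapshot(TransactionId xid, Snapshot snapshot);
                                                /* hypothetical helper */

/*
 * Sketch of the proposed tuple-XID test.  Anything older than
 * OldestNotQuiteDeadYetXID must have committed, because aborted XIDs
 * that old have already been scrubbed out of the heap: no CLOG lookup,
 * no hint bit.  The middle band pays for a CLOG check; recent XIDs use
 * the snapshot, as today.
 */
static bool
XidTreatedAsCommitted(TransactionId xid, Snapshot snapshot)
{
    if (TransactionIdPrecedes(xid, OldestNotQuiteDeadYetXID))
        return true;                    /* known committed, for free */

    if (TransactionIdPrecedes(xid, snapshot->xmin))
        return TransactionIdDidCommit(xid);     /* committed or aborted */

    return XidVisibleInSnapshot(xid, snapshot); /* normal snapshot rules */
}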
Re: [HACKERS] timestamp of the last replayed transaction
On Tue, Nov 9, 2010 at 8:55 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Nov 10, 2010 at 3:28 AM, Josh Berkus j...@agliodbs.com wrote: On 11/9/10 5:44 AM, Fujii Masao wrote: But, pg_last_xact_replay_timestamp is more intuitive for many people? If so, let's change the name. *None* of these names are intuitive. So let's just go for consistency. OK. I changed the name to pg_last_xact_replay_timestamp. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] We need index-only scans
We last researched index-only scans, also called covering indexes, in September of 2008, but have made little progress on it since. Many have been waiting for Heikki to implement this but I talked to him and he doesn't have time. I believe it is time for the community to move forward and I would like to assemble a team to work on this feature. We might not be able to implement it for Postgres 9.1, but hopefully we can make some progress on this. I have created a wiki page for this TODO item: http://wiki.postgresql.org/wiki/Index-only_scans I am interested in people improving this wiki page, and in people discussing and coding some of the items needed to implement this feature. I have personally talked to a few people already who seemed receptive. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] timestamp of the last replayed transaction
On Wed, Nov 10, 2010 at 12:57 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 9, 2010 at 8:55 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Nov 10, 2010 at 3:28 AM, Josh Berkus j...@agliodbs.com wrote: On 11/9/10 5:44 AM, Fujii Masao wrote: But, pg_last_xact_replay_timestamp is more intuitive for many people? If so, let's change the name. *None* of these names are intuitive. So let's just go for consistency. OK. I changed the name to pg_last_xact_replay_timestamp. Committed. Thanks! Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security hooks on object creation
2010/11/9 KaiGai Kohei kai...@ak.jp.nec.com: The attached patch provides plugin modules with a hook invoked just after object creation. In typical use cases, it enables external security providers to assign default security labels at object creation. As Robert suggested before, it provides a generic-purpose main hook. It takes an ObjectAccessType enum, which informs plugins what kind of access is being performed, and the identifier of the object being referenced. In this version, however, additional information, such as the new name in ALTER xxx RENAME TO, is not supported. The ObjectAccessType is defined as follows:

typedef enum ObjectAccessType
{
    OAT_POST_CREATE,    /* post-creation fixups, such as security labeling */
} ObjectAccessType;

We will support a more complete set of access types in a future version; however, we are focusing on default labeling rather than DDL permissions right now, so only OAT_POST_CREATE is defined here. Perhaps we will add OAT_ALTER, OAT_DROP, OAT_COMMENT and so on later. In this patch, I put hooks just after the creation of the database objects to which we can assign security labels (schema, relation, attribute, procedure, language, type, large object). However, I didn't touch or move CommandCounterIncrement() yet, although we had a long discussion about the MVCC visibility of new objects, because I'm not sure it is really preferable to inject CCIs at random points such as TypeCreate() or ProcedureCreate(), as the earlier version I abandoned did. (In other words, it was simply ugly...) At least, we can see the new entries with SnapshotSelf, although we will pay a performance penalty. If so, one idea is not to touch anything related to CCIs. The purpose of the post-creation hooks is the assignment of default security labels, not DDL permissions, so it is not a bad idea to leave the CCI-related routines untouched in this early version of the external security provider work. In this patch, we put InvokeObjectAccessHook0 calls in the following functions:
- heap_create_with_catalog() for relations/attributes
- ATExecAddColumn() for attributes
- NamespaceCreate() for schemas
- ProcedureCreate() for aggregates/functions
- TypeCreate() and TypeShellMake() for types
- create_proc_lang() for procedural languages
- inv_create() for large objects
I think you ought to try to arrange to avoid the overhead of a function call in the common case where nobody's using the hook. That's why I originally suggested making InvokeObjectAccessHook() a macro around the actual function call. I don't want to refer to this as a framework for enhanced security providers. Let's stick with the term object access hook. Calling it an enhanced security provider overspecifies; it could equally well be used for, say, logging. Is there any compelling reason not to apply this to every object type in the system (e.g. all the ones COMMENT can apply to)? I don't see any reason to restrict it to the set of objects to which it's sensible to apply security labels. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
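For illustration, the macro guard Robert suggests might look something like this. It is a sketch in the patch's style, not committed code; the names object_access_hook and object_access_hook_type are assumptions, and ObjectAccessType is repeated from KaiGai's mail above:

#include "postgres.h"

/* Repeated from the mail above, for self-containment. */
typedef enum ObjectAccessType
{
    OAT_POST_CREATE,    /* post-creation fixups, such as security labeling */
} ObjectAccessType;

/* Hook function type; a plugin installs one at load time (assumed names). */
typedef void (*object_access_hook_type) (ObjectAccessType access,
                                         Oid classId,
                                         Oid objectId,
                                         int subId);

extern PGDLLIMPORT object_access_hook_type object_access_hook;

/*
 * With the macro guard, a call site in (say) heap_create_with_catalog()
 * costs only a NULL-pointer test when no plugin is loaded; the function
 * call happens only if a hook has actually been installed.
 */
#define InvokeObjectAccessHook(access, classId, objectId, subId) \
    do { \
        if (object_access_hook) \
            (*object_access_hook) ((access), (classId), (objectId), (subId)); \
    } while (0)

A security provider would then presumably set object_access_hook from its _PG_init() function, the usual place shared libraries install hooks.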
Re: [HACKERS] Which file does the SELECT?
On Mon, Nov 8, 2010 at 9:55 PM, Vaibhav Kaushal vaibhavkaushal...@gmail.com wrote: I have started on the work and am using Eclipse, which helps quite a lot; I can find the declarations quite easily. Thanks to open source. BTW, I am encountering too many (just too many) data types as I try to understand the backend (specifically the executor). I do think that's normal, because the executor has to consider almost everything that the other parts of the DB can possibly command it to do. Is there some documentation available on the data types / structures in use in the backend? There's less than one might hope. I think you pretty much have to look through the README files and source code comments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-platform, multi-locale regression tests
On Tue, 2010-11-09 at 14:00 -0800, David E. Wheeler wrote: I've been talking to Nasby and Dunstan about adding a Test::More/pgTAP-based test suite to the core. It wouldn't run with the usual core suite used by developers, which would continue to use pg_regress. But they could run it if they wanted (and had the prerequisites), and the build farm animals would run it regularly. I'd welcome something like that, but I'm not sure that that's the best overall solution to this particular problem in the short run. But it would be great to have anyway. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers