Re: Plugging fd leaks (was Re: [HACKERS] Switching timeline over streaming replication)
On 26.11.2012 14:53, Amit Kapila wrote:
> On Friday, November 23, 2012 7:03 PM Heikki Linnakangas wrote:
>> This is what I came up with. It adds a new function, OpenFile, that returns a raw file descriptor like BasicOpenFile, but the file descriptor is associated with the current subtransaction and automatically closed at end-of-xact, in the same way that AllocateFile and AllocateDir work. In other words, OpenFile is to open() what AllocateFile is to fopen(). BasicOpenFile is unchanged: it returns a raw fd, and it's solely the caller's responsibility to close it, but many of the places that used to call BasicOpenFile now use the safer OpenFile function instead.
>>
>> This patch plugs three existing fd (or virtual fd) leaks:
>>
>> 1. copy_file() - fixed by using OpenFile instead of BasicOpenFile.
>> 2. XLogFileInit() - fixed by adding close() calls to the error cases. Can't use OpenFile here because the fd is supposed to persist over transaction boundaries.
>> 3. lo_import/lo_export - fixed by using OpenFile instead of PathNameOpenFile.
>
> I have gone through the patch and find it okay, except for one minor suggestion:
>
> 1. Can we put the below log in OpenFile as well?
>
> +	DO_DB(elog(LOG, "CloseFile: Allocated %d", numAllocatedDescs));

Thanks. Added that and committed. I didn't dare to backpatch this, even though it could be fairly easily backpatched. The leaks exist in older versions too, but since they're extremely rare (zero complaints from the field, and it's been like that forever), I didn't want to take the risk. Maybe later, after this has had more testing in master.

One thing I'm not too fond of is the naming. I'm calling the new functions OpenFile and CloseFile. There's some danger of confusion there, as the function to close a virtual file opened with PathNameOpenFile is called FileClose. OpenFile is really the same kind of operation as AllocateFile and AllocateDir, but returns an unbuffered fd, so it would be nice if it was called AllocateSomething, too. But AllocateFile is already taken.
And I don't much like the Allocate* naming for these anyway; you really would expect the name to contain "open".

> OpenFileInTrans
> OpenTransactionAwareFile
>
> In any case, OpenFile is also okay.

I ended up calling the functions OpenTransientFile and CloseTransientFile. Windows has a library function called OpenFile, so that was a pretty bad choice after all.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Sun, 2012-11-25 at 21:08 -0500, Robert Haas wrote:
> That, however, is a separate question from what's under discussion here, because the case at issue for the proposed patch is the one in which only one possible candidate exists, and the question is whether we ought to allow the use of assignment casts to allow the call to work rather than fail, NOT which of several overloaded functions we ought to pick. In any situation in which overloading is in use, the patch as proposed changes nothing.
>
> I'm not generally very good at interpreting the SQL standard text, but if it says that you ought to use assignment casts to match actual argument types to the chosen candidate function, then that seems like it's advocating for essentially the same position that you arrived at independently and that the patch also takes, which furthermore happens to be compatible with what other RDBMS systems do, at least in the no-overloading case.

Let's say you have only one function foo. All your queries are coded into your application, and everything works fine, using assignment casts where necessary. Then the user is foolish enough to CREATE FUNCTION foo... and now their queries start failing left and right. In other words, "only one possible candidate exists" should be followed by "right now" to be more precise.

That's a major violation of the principle of least astonishment: that CREATE FUNCTION could cause such a disaster. I know that it can now, but what you're proposing will come into play much more frequently, because most people start off with just one function by a particular name and define more as needed.

If we do something like this, I think we should explicitly opt out of the overloading feature at DDL time (somewhat like what Simon suggested in another reply). E.g.:

    CREATE {UNIQUE|OVERLOADED} FUNCTION ...

I'm not proposing that; in general I am very wary of changes to the type system.
I'm just saying that, if we do have special rules, we should have a way to make sure that users know when the rules are changing.

Regards,
	Jeff Davis
Re: [HACKERS] gset updated patch
Just another thought! When we are setting up variables using \gset, I feel there should be a provision to drop a particular variable. Internally, all the variables are stored in the VariableSpace linked list. We should provide a command to drop a particular variable, because in some cases the variable count grows unnecessarily, consuming space in VariableSpace. We might use two different variables for two different queries, but if we are not going to use the first variable again, then we are needlessly keeping it in the VariableSpace. In such cases, the user would want to drop that first variable.

This particular mechanism is useful for queries which return a single row, so a user will be using this technique for multiple queries and might need to create multiple variables. Hence I thought so. Let me know if such a mechanism already exists and I am missing it.

On Mon, Nov 19, 2012 at 9:42 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Karl O. Pinc k...@meme.com writes:
>> Yes. I'm wrong. For some reason I thought you could use DO to make an anonymous code block that would act as a SETOF function, allowing RETURN NEXT expr (et al) to be used in the plpgsql code, allowing DO to return table results. (Or, perhaps, instead, be used in place of a table in a SELECT statement.) Oh well.
>
> My key for remembering that point is that DO is a utility command, not a query.
> Now, the proposal I pushed last time we opened that very can of worms was to have inline functions rather than anonymous code blocks:
>
>     WITH FUNCTION foo(integer) returns bigint language SQL AS $$ SELECT $1 + 1; $$,
>
> Not sure how much that relates to $topic, but still something that rises in my mind with enough presence that I need to write about it so that it stops calling for attention :)
>
> Regards,
> --
> Dimitri Fontaine
> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

--
Piyush S Newe
Principal Engineer
EnterpriseDB
office: +91 20 3058 9500
www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
Re: [HACKERS] ilist.h fails cpluspluscheck
On 2012-11-27 01:14:27 -0500, Tom Lane wrote:
> Peter Eisentraut pete...@gmx.net writes:
>> In file included from ./src/include/utils/catcache.h:25:0,
>>                  from /tmp/cpluspluscheck.bt8VZr/test.cpp:3:
>> src/include/lib/ilist.h: In function ‘dlist_node* dlist_head_node(dlist_head*)’:
>> src/include/lib/ilist.h:470:39: error: invalid conversion from ‘void*’ to ‘dlist_node*’ [-fpermissive]
>>
>> Maybe some ifndef __cplusplus would help.
>
> Or maybe we need to recommend use of -fpermissive? If C++ thinks casting void * to something else is illegitimate, it's basically not going to cope with most things we might try to inline in Postgres. And I don't think that saying you don't get to call these fundamental support functions from C++ is likely to fly, so just hiding the functions won't help much.

It's rather easy to fix in the ilist code at least - the cases it points out are those where I took a slightly ugly shortcut to reduce some very minor code duplication... Some casts fix it; alternatively the

    Assert(!dlist_is_empty());
    return head->head.next;

could be moved into [ds]list_(head|tail)_node instead of relying on the *_element_off support functions.
Greetings,

Andres
--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

From 8089edcc6d448529dbd091ec95a69e138a55efa6 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 27 Nov 2012 10:55:37 +0100
Subject: [PATCH] Fix some warnings in ilist.h pointed out by cpluspluscheck via Peter

---
 src/include/lib/ilist.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/include/lib/ilist.h b/src/include/lib/ilist.h
index bc684b8..8f58486 100644
--- a/src/include/lib/ilist.h
+++ b/src/include/lib/ilist.h
@@ -467,7 +467,7 @@ dlist_head_element_off(dlist_head *head, size_t off)
 STATIC_IF_INLINE dlist_node *
 dlist_head_node(dlist_head *head)
 {
-	return dlist_head_element_off(head, 0);
+	return (dlist_node *) dlist_head_element_off(head, 0);
 }
 
 /* internal support function to get address of tail element's struct */
@@ -484,7 +484,7 @@ dlist_tail_element_off(dlist_head *head, size_t off)
 STATIC_IF_INLINE dlist_node *
 dlist_tail_node(dlist_head *head)
 {
-	return dlist_tail_element_off(head, 0);
+	return (dlist_node *) dlist_tail_element_off(head, 0);
 }
 
 #endif   /* PG_USE_INLINE || ILIST_INCLUDE_DEFINITIONS */
@@ -677,7 +677,7 @@ slist_head_element_off(slist_head *head, size_t off)
 STATIC_IF_INLINE slist_node *
 slist_head_node(slist_head *head)
 {
-	return slist_head_element_off(head, 0);
+	return (slist_node *) slist_head_element_off(head, 0);
 }
 
 #endif   /* PG_USE_INLINE || ILIST_INCLUDE_DEFINITIONS */
-- 
1.7.12.289.g0ce9864.dirty
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Wed, 2012-11-21 at 15:27 +0000, Simon Riggs wrote:
> It would be useful if we issued a NOTICE when an ambiguity is introduced, rather than when using it. Like Bison's reporting of reduce conflicts.

This brings up a very important point, which is that a lot of the code is frozen in applications yet invisible at DDL time. So we have to be careful that DDL changes have a reasonable impact on the ability to continue to compile and execute the previously-working SQL received from the applications. In other words, as I said in another reply, we want to avoid cases where something seemingly innocuous (like creating a function) causes previously-working SQL to fail due to ambiguity.

As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not suggesting that. And I know that creating a function can already cause previously-working SQL to fail. I'm just saying we should be careful of these situations and not make them more likely than necessary.

Regards,
	Jeff Davis
Re: [HACKERS] gset updated patch
Hello

2012/11/27 Piyush Newe piyush.n...@enterprisedb.com:
> Just another thought! When we are setting up variables using \gset, I feel there should be a provision to drop a particular variable. Internally, all the variables are stored in the VariableSpace linked list. We should provide a command to drop a particular variable, because in some cases the variable count grows unnecessarily, consuming space in VariableSpace. We might use two different variables for two different queries, but if we are not going to use the first variable again, then we are needlessly keeping it in the VariableSpace. In such cases, the user would want to drop that first variable.
>
> This particular mechanism is useful for queries which return a single row, so a user will be using this technique for multiple queries and might need to create multiple variables. Hence I thought so.

Sorry, I don't understand - when we set a variable to an empty value, do we then remove the variable?

regards

Pavel

> Let me know if such a mechanism already exists and I am missing it.
>
> On Mon, Nov 19, 2012 at 9:42 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
>> Karl O. Pinc k...@meme.com writes:
>>> Yes. I'm wrong. For some reason I thought you could use DO to make an anonymous code block that would act as a SETOF function, allowing RETURN NEXT expr (et al) to be used in the plpgsql code, allowing DO to return table results. (Or, perhaps, instead, be used in place of a table in a SELECT statement.) Oh well.
>>
>> My key for remembering that point is that DO is a utility command, not a query.
>> Now, the proposal I pushed last time we opened that very can of worms was to have inline functions rather than anonymous code blocks:
>>
>>     WITH FUNCTION foo(integer) returns bigint language SQL AS $$ SELECT $1 + 1; $$,
>>
>> Not sure how much that relates to $topic, but still something that rises in my mind with enough presence that I need to write about it so that it stops calling for attention :)
>>
>> Regards,
>> --
>> Dimitri Fontaine
>> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
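For what it's worth, psql already has a long-standing meta-command for removing a variable from VariableSpace: \unset. A sketch of how it would combine with the proposed \gset, assuming the patch is applied and a server is running (the query is just an example):

```
SELECT count(*) AS n FROM pg_class \gset
\echo :n
\unset n
```

After the \unset, a later \echo :n would print the literal ":n" again, since the variable no longer exists - so the "drop a variable" mechanism asked for above may already be covered.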
Re: [HACKERS] MySQL search query is not executing in Postgres DB
Hello all

2012/11/27 Jeff Davis pg...@j-davis.com:
> On Wed, 2012-11-21 at 15:27 +0000, Simon Riggs wrote:
>> It would be useful if we issued a NOTICE when an ambiguity is introduced, rather than when using it. Like Bison's reporting of reduce conflicts.
>
> This brings up a very important point, which is that a lot of the code is frozen in applications yet invisible at DDL time. So we have to be careful that DDL changes have a reasonable impact on the ability to continue to compile and execute the previously-working SQL received from the applications. In other words, as I said in another reply, we want to avoid cases where something seemingly innocuous (like creating a function) causes previously-working SQL to fail due to ambiguity.
>
> As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not suggesting that. And I know that creating a function can already cause previously-working SQL to fail. I'm just saying we should be careful of these situations and not make them more likely than necessary.

From my view, the current design works well, but for someone who sees pg for the first time there can be a lot of surprises:

a) PostgreSQL reports missing functions - but the issue is in the parameters
b) PostgreSQL requires explicit casting of string literals to text - and again it reports an uninformative message

So minimally we can enhance the error messages.

Regards

Pavel
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-26 16:39:39 -0500, Tom Lane wrote:
> Andres Freund and...@2ndquadrant.com writes:
>> On 2012-10-31 11:41:37 +0530, Amit Kapila wrote:
>>> There seems to be a problem in behavior of Create Index Concurrently and Hot Update in HEAD code.
>>
>> At pgcon.it I had a chance to discuss with Simon how to fix this bug. Please check the attached patches - and their commit messages - for the fix and some regression tests.
>
> I looked at this a bit. I am very unhappy with the proposed kluge to open indexes NoLock in some places. Even if that's safe today, which I don't believe in the least, any future change in this area could break it.

I am not happy with it either, but I failed to see a better way. Note that in most of the cases a lock actually is acquired shortly afterwards, just an ->indisvalid or an ->indisready check away. The only exception is RelationGetIndexAttrBitmap, which actually needs to look at !indisvalid, !indisready indexes because of HOT.

All the former cases could just do a syscache lookup beforehand and skip if it doesn't match their criterion. I wasn't sure about the (performance, notational) overhead of doing a second syscache lookup in those paths. Note that any ->indisvalid/->indisready change is protected by waiting for all concurrent backends to finish, so I don't think the separate syscache lookup/index_open would be a problem.

For RelationGetIndexAttrBitmap, I don't really see a point in the locks acquired - after all, we're not protecting the RelationGetIndexList either.

Greetings,

Andres
--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Materialized views WIP patch
Peter Eisentraut pete...@gmx.net writes:
> I'm not fond of overloading LOAD as the refresh command. Maybe you could go the Oracle route here as well and use a stored procedure. That would also allow things like SELECT pg_refresh_mv(oid) FROM ... more easily.

I would like us to have a way to refresh a materialized view by calling a stored procedure, but I don't think it should be the main UI. The wholesale refreshing of a matview appears to me to be comparable to TRUNCATE, in that it's both DDL and DML. The incremental refreshing modes we want to have later are clearly DML only, either on-commit refresh or incrementally on demand.

I would then propose that we use ALTER MATERIALIZED VIEW as the UI for the wholesale on-demand refreshing operation, and UPDATE MATERIALIZED VIEW as the incremental command (to come later). So my proposal for the current feature would be:

    ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
    UPDATE MATERIALIZED VIEW mv;

The choice of keywords and syntax here hopefully clearly hints the user about the locking behavior of the commands, too. And as we said, the bare minimum for this patch does *not* include the CONCURRENTLY option, which we still all want to have (someday). :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-26 22:31:50 -0500, Tom Lane wrote:
> I wrote:
>> I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an additional step that somehow marks the pg_index row in a new way that makes RelationGetIndexList ignore it, and then wait out existing transactions before taking the final step of dropping the index. The Right Way to do this would likely have been to add another bool column, but we don't have that option anymore in 9.2. Maybe we could abuse indcheckxmin somehow, ie use a state that can't otherwise occur (in combination with the values of indisvalid/indisready) to denote an index being dropped.
>
> I looked into this some more. There are currently three possible states of the indisvalid/indisready flags:
>
>     indisvalid = false, indisready = false
>         initial state during CREATE INDEX CONCURRENTLY; this should result in sessions honoring the index for HOT-safety decisions, but not using it for searches or insertions
>
>     indisvalid = false, indisready = true
>         sessions should now insert into the index, but not use it for searches
>
>     indisvalid = true, indisready = true
>         normal, fully valid index
>
> Either state of indcheckxmin is valid with all three of these combinations, so the specific kluge I was contemplating above doesn't work. But there is no valid reason for an index to be in this state:
>
>     indisvalid = true, indisready = false
>
> I suggest that to fix this for 9.2, we could abuse these flags by defining that combination as meaning "ignore this index completely", and having DROP INDEX CONCURRENTLY set this state during its final wait before removing the index. Obviously, an additional flag column would be a much cleaner fix, and I think we should add one in HEAD. But it's too late to do that in 9.2.
While I agree that more states would make some of this cleaner, as long as we're still locking entries in RelationGetIndexAttrBitmap that have indisvalid = false, indisready = false, we still have a broken DROP INDEX CONCURRENTLY, due to the exclusive lock acquired while actually dropping the index - which can take quite a while on a big index.

Greetings,

Andres Freund
--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Materialized views WIP patch
2012/11/27 Dimitri Fontaine dimi...@2ndquadrant.fr:
> Peter Eisentraut pete...@gmx.net writes:
>> I'm not fond of overloading LOAD as the refresh command. Maybe you could go the Oracle route here as well and use a stored procedure. That would also allow things like SELECT pg_refresh_mv(oid) FROM ... more easily.
>
> I would like us to have a way to refresh a materialized view by calling a stored procedure, but I don't think it should be the main UI. The wholesale refreshing of a matview appears to me to be comparable to TRUNCATE, in that it's both DDL and DML. The incremental refreshing modes we want to have later are clearly DML only, either on-commit refresh or incrementally on demand.
>
> I would then propose that we use ALTER MATERIALIZED VIEW as the UI for the wholesale on-demand refreshing operation, and UPDATE MATERIALIZED VIEW as the incremental command (to come later). So my proposal for the current feature would be:
>
>     ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
>     UPDATE MATERIALIZED VIEW mv;
>
> The choice of keywords and syntax here hopefully clearly hints the user about the locking behavior of the commands, too. And as we said, the bare minimum for this patch does *not* include the CONCURRENTLY option, which we still all want to have (someday). :)

+1

Pavel
Re: [HACKERS] [WIP] pg_ping utility
Peter Eisentraut pete...@gmx.net writes:
> Sure, PQping is useful for this very specific use case of seeing whether the server has finished starting up. If someone came with a plausible

Rename the utility to pg_isready?

--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-27 10:18:37 +0530, Pavan Deolasee wrote:
> On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> I wrote:
>>> Either state of indcheckxmin is valid with all three of these combinations, so the specific kluge I was contemplating above doesn't work. But there is no valid reason for an index to be in this state: indisvalid = true, indisready = false
>>
>> I suggest that to fix this for 9.2, we could abuse these flags by defining that combination as meaning "ignore this index completely", and having DROP INDEX CONCURRENTLY set this state during its final wait before removing the index.
>
> Yeah, this looks much better, given our inability to add a new catalog column in 9.2. Can we cheat a little, though, and use a value other than 0 and 1 for indisvalid or indisready to tell the server to interpret it differently? I would assume not, but can't see a reason, unless these values are converted to other types and back to boolean.
>
> Andres complained about the fact that many callers of RelationGetIndexList are probably not ready to process invalid or not-yet-ready indexes and suggested that those changes should be backpatched to even older releases. But IMO we should do that with a test case that demonstrates that there is indeed a bug.

I haven't yet looked deeply enough to judge whether there are actually bugs. But I can say that e.g. the missing indisvalid checks in transformFkeyCheckAttrs make me pretty uneasy. Vacuum not checking whether indexes are ready isn't nice either.

> Also, we should teach RelationGetIndexList to take a flags argument and filter out indexes that the caller is not interested in, instead of every caller doing the checks separately.

Given that RelationGetIndexList currently just returns a cached list, I don't see how that's going to work without significant overhead.
Greetings,

Andres Freund
--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
> On 2012-11-27 10:18:37 +0530, Pavan Deolasee wrote:
>> Andres complained about the fact that many callers of RelationGetIndexList are probably not ready to process invalid or not-yet-ready indexes and suggested that those changes should be backpatched to even older releases. But IMO we should do that with a test case that demonstrates that there is indeed a bug.
>
> I haven't yet looked deeply enough to judge whether there are actually bugs. But I can say that e.g. the missing indisvalid checks in transformFkeyCheckAttrs make me pretty uneasy. Vacuum not checking whether indexes are ready isn't nice either.
At least the former was easy enough to verify after thinking about it for a minute (<= 9.1):

    CREATE TABLE clusterbug(id serial primary key, data int);
    INSERT INTO clusterbug(data) VALUES(2),(2);
    CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
    CREATE TABLE clusterbug_ref(id serial primary key,
        clusterbug_id int references clusterbug(data));

Now a !indisready index is getting queried (strangely enough, that doesn't cause an error).

Greetings,

Andres Freund
--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-27 11:52:11 +0100, Andres Freund wrote:
> On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
>> I haven't yet looked deeply enough to judge whether there are actually bugs. But I can say that e.g. the missing indisvalid checks in transformFkeyCheckAttrs make me pretty uneasy. Vacuum not checking whether indexes are ready isn't nice either.
> At least the former was easy enough to verify after thinking about it for a minute (<= 9.1):
>
>     CREATE TABLE clusterbug(id serial primary key, data int);
>     INSERT INTO clusterbug(data) VALUES(2),(2);
>     CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
>     CREATE TABLE clusterbug_ref(id serial primary key,
>         clusterbug_id int references clusterbug(data));
>
> Now a !indisready index is getting queried (strangely enough, that doesn't cause an error).

That should work in 9.2 as well, although it's slightly more complicated: you just need an indisready, !indisvalid index and the foreign key will happily be created. Easy enough to achieve with two sessions.

Greetings,

Andres Freund
--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On Tue, Nov 27, 2012 at 4:22 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-11-27 11:48:08 +0100, Andres Freund wrote: I haven't yet looked deeply enough to judge whether there are actually bugs. But I can say that, e.g., the missing indisvalid checks in transformFkeyCheckAttrs make me pretty uneasy. Vacuum not checking whether indexes are ready isn't nice either. At least the former was easy enough to verify after thinking about it for a minute (>=9.1): CREATE TABLE clusterbug(id serial primary key, data int); INSERT INTO clusterbug(data) VALUES(2),(2); CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data); CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int references clusterbug(data)); Now a !indisready index is getting queried (strangely enough that doesn't cause an error). There might be a bug there as you are suggesting, but for me the CREATE INDEX itself fails (and rightly so) because of duplicate keys. Do I need to run these statements in different sessions? Thanks, Pavan
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-26 22:44:49 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-10-05 19:56:40 -0400, Tom Lane wrote: I think this could possibly be fixed by using nontransactional update-in-place when we're trying to change indisvalid and/or indisready, but I've not really thought through the details. I couldn't really think of any realistic method to fix this other than update-in-place. I thought about it for a while and I think it should work, but I have to say it makes me slightly uneasy. I looked through the code a bit, and I think we should be able to make this work, but note there are also places that update indcheckxmin using heap_update, and that's just as dangerous as changing the other two flags via heap_update if you don't have exclusive lock on the table. This is going to need some careful thought, because we currently expect that such an update will set the pg_index row's xmin to the current XID, which is something an in-place update can *not* do. I think this is a non-problem during construction of a new index, since the xmin ought to be the current XID already anyway, but it's less clear what to do in REINDEX. In the short term there may be no problem, since REINDEX takes exclusive lock on the parent table anyway (and hence could safely do a transactional update), but if we ever want REINDEX CONCURRENTLY we'll need a better answer. Isn't setting indcheckxmin already problematic in the case of CREATE INDEX CONCURRENTLY? index_build already runs in a separate transaction there. I am really surprised that we haven't seen any evidence of a problem with this. On a database with a busy, bigger catalog that ought to be a real problem. I wonder whether we couldn't fix it by introducing a variant/wrapper of heap_fetch et al. that follows t_ctid if the tuple became invisible recently. That ought to be able to fix most of these issues in a pretty general fashion. That would make a nice implementation of REINDEX CONCURRENTLY easier as well... 
Btw, even if we manage to find a sensible fix for this I would suggest postponing it until after the next back-branch release. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-27 16:39:58 +0530, Pavan Deolasee wrote: On Tue, Nov 27, 2012 at 4:22 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-11-27 11:48:08 +0100, Andres Freund wrote: I haven't yet looked deeply enough to judge whether there are actually bugs. But I can say that, e.g., the missing indisvalid checks in transformFkeyCheckAttrs make me pretty uneasy. Vacuum not checking whether indexes are ready isn't nice either. At least the former was easy enough to verify after thinking about it for a minute (>=9.1): CREATE TABLE clusterbug(id serial primary key, data int); INSERT INTO clusterbug(data) VALUES(2),(2); CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data); CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int references clusterbug(data)); Now a !indisready index is getting queried (strangely enough that doesn't cause an error). There might be a bug there as you are suggesting, but for me the CREATE INDEX itself fails (and rightly so) because of duplicate keys. Do I need to run these statements in different sessions? Sure, the CREATE INDEX fails. The point is that we can successfully create the foreign key afterwards anyway. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On Tue, Nov 27, 2012 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote: Pavan Deolasee pavan.deola...@gmail.com writes: On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: Either state of indcheckxmin is valid with all three of these combinations, so the specific kluge I was contemplating above doesn't work. But there is no valid reason for an index to be in this state: indisvalid = true, indisready = false. Yeah, this looks much better, given our inability to add a new catalog column in 9.2. Can we cheat a little though and use a value other than 0 and 1 for indisvalid or indisready to tell the server to interpret it differently? No, not unless you'd like select * from pg_index to misbehave. Personally, I like being able to look at catalogs. Hmm.. I thought so. Thanks. If we add a new column to the catalog for HEAD, I think it will be a good idea to add an indflags kind of column which can store a bitmap of flags. We probably don't get into this kind of situation often, but if we do, then we would have more flexibility without having to rebuild the catalogs. My two cents. Thanks, Pavan
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On Tue, Nov 27, 2012 at 4:49 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-11-27 16:39:58 +0530, Pavan Deolasee wrote: There might be a bug there as you are suggesting, but for me the CREATE INDEX itself fails (and rightly so) because of duplicate keys. Do I need to run these statements in different sessions? Sure, the CREATE INDEX fails. The point is that we can successfully create the foreign key afterwards anyway. Ah OK. Got it. Thanks, Pavan
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:

Observations and Comments
-------------------------

- If no option is given to the pg_computemaxlsn utility then we get a wrong error message:
  ./pg_computemaxlsn ../data
  pg_computemaxlsn: file or directory (null) exists
  In my opinion we should consider -P (print max LSN from whole data directory) as the default in case no option is specified, or otherwise throw a more descriptive error.
  Fixed, by considering the whole data directory as the default.

- Options -p and -P should be mutually exclusive.
  Fixed. Both options will not be allowed at the same time.

- Long options are missing for -p and -P.
  Fixed. Introduced the long options --path and --data-directory.

- The help string for -P should be something like "print max LSN from whole data directory" instead of "print max LSN from whole database".
  Fixed, by changing the message as suggested.

- The help menu is silent about the -V or --version option.
  Fixed.

- I think when finding the max LSN of a single file the utility should consider all relation segments.
  IMHO, as this utility is not aware of relations or any other logical entity and deals only with files and directories, it is okay to look only at the given file.

- There is no check for extra arguments, e.g. ./pg_computemaxlsn -P . ../data/ ../data/base/ ../data2/
  Fixed. Extra options give an error. Multiple -p options also give an error, e.g.: ./pg_computemaxlsn -p base/12286/12200 -p base/12286/12201 data

- For the -p {file | dir} option the utility expects the file path relative to the specified data directory path, which makes things a little confusing. For example:
  ./pg_computemaxlsn -p data/base/12286/12200 data
  pg_computemaxlsn: file or directory "data/base/12286/12200" does not exists
  Although data/base/12286/12200 is a valid file path, we get a "file does not exists" error.
  As the path of the file is relative to the data directory, that's why the error prints the path as data/base/12286/12200. Do you have any suggestion what should be done for this?

- The utility should print the max LSN number in hexadecimal notation and LSN format (logid/segno) for consistency with PG,
  Fixed.
  and may also print the file name which contains that LSN.
  It might be confusing to users in cases where the utility has to follow a link (pg_tblspc) and then print some absolute path which is not related to the relative path given by the user. Also I am not sure it would be of any use to the user.

- When finding the max LSN from a directory, the utility does not consider subdirectories, which I think could be useful in some cases, for example if we want to find the max LSN in a tablespace directory. With the current implementation we would have to run the utility multiple times, once for each database.
  It will consider subdirectories as well.

Code Review
-----------

- Code generally looks good; I have a few small points.

- In the main() function the variable maxLSN is defined as uint64. Although XLogRecPtr and uint64 are the same thing, I think we should use XLogRecPtr instead.
  Changed as per suggestion.

- Just a small thing that will hardly improve anything, but for the sake of saving a few cycles, I think you should use XLByteLT ( < ) instead of XLByteLE ( <= ) to find whether the previous max LSN is less than the current one.
  Changed as per suggestion.

- A '\n' is missing from one of the printfs in the code :-)
  fprintf(stderr, _("%s: file or directory \"%s\" does not exists"), progname, LsnSearchPath);
  Changed as per suggestion.

- While finding the max LSN from a single file, you are not verifying that the file actually exists inside the provided data directory path, and can end up looking at the wrong data directory when checking whether the server is running. For example:
  bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 $PWD/otherinstallation/data
  Maximum LSN found is: 0, WAL segment file name (fileid,seg):

- The code is not verifying that the provided data directory is a valid data directory. This could make pg_computemaxlsn compute the max LSN from a running server's files. For example:
  bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 NONDATADIR/
  Maximum LSN found is: 0, WAL segment file name (fileid,seg):
  Fixed, by allowing only a relative path for the file or directory and checking that it is relative and below the data directory with the function path_is_relative_and_below_cwd().

Questions
---------

- I am wondering why you are not following the link inside the pg_tblspc directory to get to the tablespace location, instead of building the directory path. I am thinking that if you follow the link instead, the utility can be used across different versions of PG.
  Changed as per suggestion.

Apart from the above, updated the documentation.

With Regards,
Amit Kapila.

pg_computemaxlsn.v2.patch Description: pg_computemaxlsn.v2.patch
Re: [HACKERS] [WIP] pg_ping utility
On Tue, Nov 27, 2012 at 7:35 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Peter Eisentraut pete...@gmx.net writes: Sure, PQping is useful for this very specific use case of seeing whether the server has finished starting up. If someone came with a plausible Rename the utility to pg_isready? +1, the current version of the patch is already fitted for that and would not need extra options like the number of packets sent. -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] review: pgbench - aggregation of info written into log
Hello 2012/11/12 Tomas Vondra t...@fuzzy.cz: Hi, attached is a v4 of the patch. There are not many changes, mostly some simple tidying up, except for the handling of Windows. After a bit more investigation, it seems to me that we can't really get the same behavior as on other systems - basically the timestamp is unavailable, so we can't log the interval start. So it seems to me the best we can do is to disable this option on Windows (and this is done in the patch). So when you try to use --aggregate-interval on Windows it will complain it's an unknown option. Now that I think of it, there might be a better solution that would not mimic the Linux/Unix behaviour exactly - we do have the elapsed time since the start of the benchmark, so maybe we should use this instead of the timestamp? I mean on systems with a reasonable gettimeofday we'd get

1345828501 5601 1542744 483552416 61 2573
1345828503 7884 1979812 565806736 60 1479
1345828505 7208 1979422 567277552 59 1391
1345828507 7685 1980268 569784714 60 1398
1345828509 7073 1979779 573489941 236 1411

but on Windows we'd get

0 5601 1542744 483552416 61 2573
2 7884 1979812 565806736 60 1479
4 7208 1979422 567277552 59 1391
6 7685 1980268 569784714 60 1398
8 7073 1979779 573489941 236 1411

i.e. the first column is seconds since the start of the test. Hmm, and maybe we could call gettimeofday at the beginning, to get the timestamp of the test start (AFAIK the issue on Windows is the resolution, but it should be enough for that). Then we could add it to the elapsed time and we'd get the same output as on Linux. And maybe this could be done in regular pgbench execution too? But I really need help from someone who knows Windows and can test this. 
Comments regarding Pavel's reviews are below: On 2.10.2012 20:08, Pavel Stehule wrote: Hello I did review of pgbench patch http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php * this patch applies cleanly * I had a problem with the docs:

make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml'
openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
openjade:pgbench.sgml:767:8:E: document type does not allow element TITLE here; missing one of ABSTRACT, AUTHORBLURB, MSGSET, CALLOUTLIST, ITEMIZEDLIST, ORDEREDLIST, SEGMENTEDLIST, VARIABLELIST, CAUTION, IMPORTANT, NOTE, TIP, WARNING, FORMALPARA, BLOCKQUOTE, EQUATION, EXAMPLE, FIGURE, TABLE, PROCEDURE, SIDEBAR, QANDASET, REFSECT3 start-tag
make[1]: *** [HTML.index] Error 1
make[1]: *** Deleting file `HTML.index'
make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml'

Hmmm, I've never got the docs to build at all; all I get is a heap of SGML/DocBook related issues. Can you investigate a bit more where the issue is? I don't remember messing with the docs in a way that might cause this ... mostly copy'n'paste edits. * pgbench compiles without warnings * there are a few issues in the source code:

+
+ /* should we aggregate the results or not? */
+ if (use_log_agg)
+ {
+
+ /* are we still in the same interval? if yes, accumulate the
+ * values (print them otherwise) */
+ if (agg->start_time + use_log_agg >= INSTR_TIME_GET_DOUBLE(now))
+ {
+

Errr, so what are the issues here? 
using a use_log_agg variable - bad name for variable - it is fixed * the real time range for aggregation is dynamic (pgbench is not a realtime application), so I am not sure the following constraint makes sense:

+ if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != 0)) {
+ fprintf(stderr, "duration (%d) must be a multiple of aggregation interval (%d)\n", duration, use_log_agg);
+ exit(1);
+ }

I'm not sure what might be the issue here either. If the test duration is set (using the -T option), then this kicks in and says that the duration should be a multiple of the aggregation interval. Seems like a sensible assumption to me. Or do you think this check should be removed? * it doesn't flush the last aggregated data (it is mentioned by Tomas: "Also, I've realized the last interval may not be logged at all - I'll take a look into this in the next version of the patch."). And it can be significant for higher values of use_log_agg Yes, and I'm still not sure how to fix this in practice. But I do have this on my TODO. * the name of the variable use_log_agg is not good - maybe log_agg_interval? I've changed it to agg_interval. ok issues: * empty lines with invisible chars (tabs) + and sometimes empty lines after and before {} * adjustment of start_time + * the desired interval */ +
Re: [HACKERS] Materialized views WIP patch
Pavel Stehule wrote: 2012/11/27 Dimitri Fontaine dimi...@2ndquadrant.fr: I would like that we have a way to refresh a Materialized View by calling a stored procedure, but I don't think it should be the main UI. I agree. I saw that Oracle uses a function for that without any statement-level support, and that would probably be easier to implement; but it felt wrong to do it that way. I couldn't think of any other cases where similar action is taken without statement syntax for it. The wholesale refreshing of a matview appears to me to be comparable to TRUNCATE in that it's both a DDL and a DML. The incremental refreshing modes we want to have later are clearly DML only, either on commit refresh or incrementally on demand. Personally, I expect the most popular update method to eventually become a queued update. I've looked ahead far enough to see that I want to structure the incremental updates to be controlled through an API where changes to supporting tables produce records saying what was done, which are fed to consumers which do the updating. Then it becomes a matter of whether that consumer performs the related updates to the MV during commit processing of the producer, by pulling from a queue, or by reading accumulated rows when the MV is referenced. But I'm getting ahead of things with such speculation... I would then propose that we use ALTER MATERIALIZED VIEW as the UI for the wholesale on-demand refreshing operation, and UPDATE MATERIALIZED VIEW as the incremental command (to come later). Honestly, I have managed to keep myself from speculating on syntax for incremental updates. There will be enough complexity involved that I expect months of bikeshedding. :-/ So my proposal for the current feature would be: ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ]; UPDATE MATERIALIZED VIEW mv; An ALTER MATERIALIZED VIEW option was my first thought on syntax to do what LOAD does in the current patch. 
But it bothered me that I couldn't think of any other cases where ALTER some-object-type only changed the data contained within the object and had no other impact. Are you both really comfortable with an ALTER MATERIALIZED VIEW which has no effect other than to update the data? It seems wrong to me. The choice of keywords and syntax here hopefully clearly hint the user about the locking behavior of the commands, too. And as we said, the bare minimum for this patch does *not* include the CONCURRENTLY option, which we still all want to have (someday). :) +1 Sure -- a CONCURRENTLY option for LMV (or AMVU) seems like one of the next steps. I'll feel more confident about implementing that when it appears that we have shaken the last bugs out of CREATE/DROP INDEX CONCURRENTLY, since anything which affects those statements will probably also matter here. -Kevin
Re: [HACKERS] review: Deparsing DDL command strings
Hi, Pavel Stehule pavel.steh...@gmail.com writes: ** missing doc I still would prefer to have an ack about the premises of this patch before spending time on writing docs for it, namely: - we want a normalized command string (schema, auto naming of some objects such as constraints and indexes, exporting default values, etc); - this normalisation can not happen as an extension given the current source code organisation, and the new ddl_rewrite module is a good fit to solve that problem; - we offer a new event alias ddl_command_trace that gets activated at the start of a DROP and at the end of an ALTER or a CREATE command, so that it's easy to hook at the right place when you want to trace things; - it's now possible and easy to process GENERATED commands if you wish to do so (explicit sequence and index operations for a create table containing a serial primary key column). I think no complaints is encouraging though, so I will probably begin the docs effort soonish. ** CREATE FUNCTION is not supported I've added support for create and drop function in the attached patch. Not all commands are rewritten yet though, and not all of them even have support for the TG_OBJECTID, TG_OBJECTNAME and TG_SCHEMANAME variables. I think we could still apply that patch + docs and a limited set of commands, and continue filling the gaps till the end of this cycle. Once we agree on how to attack the problem at hand, do we really need support for ALTER OPERATOR FAMILY for an intermediate commit? You tell me. * something wrong in deparsing - doesn't support constraints I've been fixing that and reviewing ALTER TABLE rewriting; it should be ok now. Note that we have to analyze the alter table work queue, not the parsetree, if we want to normalize automatic constraint names and suchlike. 
Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support event_trigger_infos.2.patch.gz Description: Binary data
Re: [HACKERS] DEALLOCATE IF EXISTS
On 09.10.2012 17:44, Vik Reykja wrote: On Tue, Oct 9, 2012 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Sébastien Lardière slardi...@hi-media.com writes: Indeed, brackets were not correct, it's better now (I think), and correct some comments. Still wrong ... at the very least you missed copyfuncs/equalfuncs. In general, when adding a field to a struct, it's good practice to grep for all uses of that struct. I don't see Sébastien's message, but I made the same mistake in my patch. Another one is attached with copyfuncs and equalfuncs. I did a grep for DeallocateStmt and I don't believe I have missed anything else. Also, I'm changing the subject so as not to hijack this thread any further. I fail to see the point of DEALLOCATE IF EXISTS. Do you have a real use case for this, or was this just a case of adding IF EXISTS to all commands for the sake of completeness? Usually the client knows what statements have been prepared, but perhaps you want to make sure everything is deallocated in some error handling case or similar. But in that case, you might as well just issue a regular DEALLOCATE and ignore errors. Or even more likely, you'll want to use DEALLOCATE ALL. - Heikki
Re: [HACKERS] Materialized views WIP patch
On Nov 27, 2012, at 5:25, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: So my proposal for the current feature would be: ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ]; UPDATE MATERIALIZED VIEW mv; The choice of keywords and syntax here hopefully clearly hint the user about the locking behavior of the commands, too. And as we said, the bare minimum for this patch does *not* include the CONCURRENTLY option, which we still all want to have (someday). :) I dislike using ALTER syntax to perform a data-only action. The other advantage of non-functional syntax is that you could more easily supply some form of where clause should you only want to perform a partial refresh. With a function call that becomes more obtuse. David J.
Re: [HACKERS] Materialized views WIP patch
Kevin Grittner kgri...@mail.com writes: An ALTER MATERIALIZED VIEW option was my first thought on syntax to do what LOAD does in the current patch. But it bothered me that I couldn't think of any other cases where ALTER some-object-type only changed the data contained within the object and had no other impact. Are you both really comfortable with an ALTER MATERIALIZED VIEW which has no effect other than to update the data? It seems wrong to me. I think you can already do that with some clever use of alter table ... type using, or alter table set default. Sure -- a CONCURRENTLY option for LMV (or AMVU) seems like one of the next steps. I'll feel more confident about implementing that when it appears that we have shaken the last bugs out of CREATE/DROP INDEX CONCURRENTLY, since anything which affects those statements will probably also matter here. Sure. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Materialized views WIP patch
Dimitri Fontaine wrote: Kevin Grittner kgri...@mail.com writes: An ALTER MATERIALIZED VIEW option was my first thought on syntax to do what LOAD does in the current patch. But it bothered me that I couldn't think of any other cases where ALTER some-object-type only changed the data contained within the object and had no other impact. Are you both really comfortable with an ALTER MATERIALIZED VIEW which has no effect other than to update the data? It seems wrong to me. I think you can already do that with some clever use of alter table ... type using, or alter table set default. You mean, specifying an ALTER TABLE which appears to specify a change to some non-data aspect of a table but which is really setting it to the existing state? And it therefore rewrites the table? I suppose that with USING you could actually even have it rewritten with data different from what was there before without changing the structure of the table. Somehow I don't find that persuasive as an argument that ALTER MATERIALIZED VIEW should rescan the source relations and build a whole new set of data for exactly the same MV definition. Consider that in relational theory a table is considered a relation variable. ALTER is supposed to change the definition of the variable in some way. Other statements are used to change the value contained in the variable. Sure there are some grey areas already, but I don't see where we need to muddy the waters in this case. -Kevin
Re: [HACKERS] [WIP] pg_ping utility
On Tue, Nov 27, 2012 at 8:45 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Nov 27, 2012 at 7:35 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Peter Eisentraut pete...@gmx.net writes: Sure, PQping is useful for this very specific use case of seeing whether the server has finished starting up. If someone came with a plausible Rename the utility to pg_isready? +1, the current version of the patch is already fitted for that and would not need extra options like the number of packages sent. I am 100% fine with this. I can make this change tomorrow. -- Michael Paquier http://michael.otacoo.com Sorry I haven't jumped in on this thread more, I have limited connectivity where I am.
Re: [HACKERS] Materialized views WIP patch
Kevin Grittner kgri...@mail.com writes: changing the structure of the table. Somehow I don't find that persuasive as an argument that ALTER MATERIALIZED VIEW should rescan the source relations and build a whole new set of data for exactly the same MV definition. Fair enough. Consider that in relational theory a table is considered a relation variable. ALTER is supposed to change the definition of the variable in some way. Other statements are used to change the value contained in the variable. Sure there are some grey areas already, but I don't see where we need to muddy the waters in this case. Under that light, using ALTER is strange indeed. I still don't like using LOAD that much, allow me to try a last syntax proposal. Well, all I can find just now would be: UPDATE MATERIALIZED VIEW mv FOR EACH ROW; UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ]; The only value of such a proposal is that it's not LOAD and it's still not introducing any new keyword. Oh, it also avoids overloading the SNAPSHOT keyword. Well, it still does not look like the best candidate. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] foreign key locks
Alvaro Herrera wrote: Here's version 24. Old review emails still contain some items that didn't lead to any changes. I tried to keep close track of those. To that list I add a couple of things of my own. Here they are, for those following along. I welcome suggestions.

- Fix the multixact cache in multixact.c -- see mXactCacheEnt.
- ResetHintBitMulti() was removed; need to remove old XMAX_IS_MULTI somewhere; perhaps during HOT page pruning?
- EvalPlanQual and ExecLockRows maybe need a different overall locking strategy. Right now follow_updates=false is used to avoid deadlocks.
- Ensure multixact.c behaves sanely on wraparound.
- Is the case of a non-key-update followed by a key-update actually handled when doing a heap_lock_tuple with mode = LockTupleKeyShare and follow_updates = false? I don't really see how, so it seems to deserve at least a comment.
- If oldestMultiXactId + db is set and then that database is dropped, we seem to have a problem because MultiXactAdvanceOldest won't overwrite those values. Should probably use SetMultiXactIdLimit directly.
- What stops multixacts only being filled out (i.e. RecordNewMultiXact()) *after* the XLogInsert() *and* after a MultiXactGetCheckptMulti()? AFAICS MultiXactGenLock is not held in CreateMultiXactId(). If we crash at that moment we lose the multixact data, which now means potential data loss...
- Multixact member group data crossing 512-byte sector boundaries makes me uneasy (as it's 5 bytes). I don't really see a scenario where it's dangerous, but ... Does anybody see a problem here?
- There are quite some places that do
  multiStopLimit = multiWrapLimit - 100;
  if (multiStopLimit < FirstMultiXactId)
      multiStopLimit -= FirstMultiXactId;
  Perhaps MultiXactIdAdvance and MultiXactIdRetreat macros are in order?
- Not really padding, MultiXactStatus is 4 bytes...
  /*
   * XXX Note: there's a lot of padding space in MultiXactMember. We could
   * find a more compact representation of this Xlog record -- perhaps all the
   * status flags in one XLogRecData, then all the xids in another one? Not
   * clear that it's worth the trouble though.
   */

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Materialized views WIP patch
Dimitri Fontaine dimi...@2ndquadrant.fr writes: Under that light, using ALTER is strange indeed. Agreed, seems like a poor choice. I still don't like using LOAD that much, allow me to try a last syntax proposal. Well all I can find just now would be: UPDATE MATERIALIZED VIEW mv FOR EACH ROW; UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ]; The only value of such a proposal is that it's not LOAD and it's still not introducing any new keyword. Oh it's also avoiding to overload the SNAPSHOT keyword. Well, it still does not look like the best candidate. I think this syntax would require making MATERIALIZED (and possibly also VIEW) fully reserved keywords, which would be better avoided. regards, tom lane
Re: [HACKERS] Do we need so many hint bits?
On Mon, Nov 19, 2012 at 2:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Davis pg...@j-davis.com writes: As I said elsewhere in the thread, I'm not planning to introduce any additional locking. There is already precedent in IndexOnlyNext. Of course, that just begs the question of whether the code in IndexOnlyNext is correct. And after looking at it, it seems pretty broken, or at least the argument for its correctness is broken. That comment seems to consider only the case where the target tuple has just been inserted --- but what of the case where the target tuple is in process of being deleted? The deleter will not make a new index entry for that. Of course, there's a pretty long code path from marking a tuple deleted to committing the deletion, and it seems safe to assume that the deleter will hit a write barrier in that path. But I don't believe the argument here that the reader must hit a read barrier in between. Ouch. I don't know how I missed that case when writing that comment, but I agree it should have been discussed there. The timing requirements for updates and deletes seem to be generally less strict, because the update or delete won't generally be visible to our snapshot anyway, so it doesn't really matter if we think the page is all-visible for slightly longer than it really is. However, it would be a problem if we thought the page was still all-visible when in fact it contained a tuple that was not visible to our snapshot. For that to happen, the updating or deleting transaction would have to commit before we took our snapshot, yet the visibility map update would have to still be visible after we took our snapshot. That clearly can't happen, since both committing and taking a snapshot require LWLock acquire/release. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Dimitri Fontaine Sent: Tuesday, November 27, 2012 10:03 AM To: Kevin Grittner Cc: Pavel Stehule; Peter Eisentraut; Pgsql Hackers Subject: Re: [HACKERS] Materialized views WIP patch Kevin Grittner kgri...@mail.com writes: changing the structure of the table. Somehow I don't find that persuasive as an argument for why ALTER MATERIALIZED VIEW should rescan the source relations and build a whole new set of data for exactly the same MV definition. Fair enough. Consider that in relational theory a table is considered a relation variable. ALTER is supposed to change the definition of the variable in some way. Other statements are used to change the value contained in the variable. Sure there are some grey areas already, but I don't see where we need to muddy the waters in this case. Under that light, using ALTER is strange indeed. I still don't like using LOAD that much, allow me to try one last syntax proposal. Well, all I can find just now would be: UPDATE MATERIALIZED VIEW mv FOR EACH ROW; UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ]; The only value of such a proposal is that it's not LOAD and it's still not introducing any new keyword. Oh, it also avoids overloading the SNAPSHOT keyword. Well, it still does not look like the best candidate. Regards, Just a thought, but how about something like: DO REFRESH OF MATERIALIZED VIEW mat_view; In effect we begin overloading the meaning of DO to not only mean anonymous code blocks but to also call pre-defined internal routines that can be executed without having to use function-call syntax. MATERIALIZED VIEW can be made more generic (e.g., TABLE) if the need arises, the REFRESH action is generic, and additional clauses can be added after the object name (FOR, CONCURRENTLY, WHERE, etc...) David J.
Re: [HACKERS] Materialized views WIP patch
2012/11/27 David Johnston pol...@yahoo.com: -Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Dimitri Fontaine Sent: Tuesday, November 27, 2012 10:03 AM To: Kevin Grittner Cc: Pavel Stehule; Peter Eisentraut; Pgsql Hackers Subject: Re: [HACKERS] Materialized views WIP patch Kevin Grittner kgri...@mail.com writes: changing the structure of the table. Somehow I don't find that persuasive as an argument for why ALTER MATERIALIZED VIEW should rescan the source relations and build a whole new set of data for exactly the same MV definition. Fair enough. Consider that in relational theory a table is considered a relation variable. ALTER is supposed to change the definition of the variable in some way. Other statements are used to change the value contained in the variable. Sure there are some grey areas already, but I don't see where we need to muddy the waters in this case. Under that light, using ALTER is strange indeed. I still don't like using LOAD that much, allow me to try one last syntax proposal. Well, all I can find just now would be: UPDATE MATERIALIZED VIEW mv FOR EACH ROW; UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ]; The only value of such a proposal is that it's not LOAD and it's still not introducing any new keyword. Oh, it also avoids overloading the SNAPSHOT keyword. Well, it still does not look like the best candidate. Regards, Just a thought, but how about something like: DO REFRESH OF MATERIALIZED VIEW mat_view; In effect we begin overloading the meaning of DO to not only mean anonymous code blocks but to also call pre-defined internal routines that can be executed without having to use function-call syntax. MATERIALIZED VIEW can be made more generic (e.g., TABLE) if the need arises, the REFRESH action is generic, and additional clauses can be added after the object name (FOR, CONCURRENTLY, WHERE, etc...)
-1. I dislike using the keyword DO for this purpose, since we already use it for anonymous blocks. Regards Pavel David J.
Re: [HACKERS] ilist.h fails cpluspluscheck
Andres Freund and...@2ndquadrant.com writes: On 2012-11-27 01:14:27 -0500, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: Maybe some ifndef __cplusplus would help. Its rather easy to fix in the ilist code at least - the cases it points out are those where I took a slightly ugly shortcut to reduce some very minor code duplication... Some casts fix it, Seems reasonable, committed. I wonder if we could get a buildfarm member or two running cpluspluscheck. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] splitting *_desc routines
Alvaro Herrera alvhe...@2ndquadrant.com writes: Tom Lane escribió: FWIW, I'd vote for dumping all of these into *one* rmgrdesc directory (which may as well be under access/ since that's where the xlog code is), regardless of where the corresponding replay code is in the source tree. I don't think splitting them up into half a dozen directories adds anything except confusion. If you want, the header comment for each file could mention where the corresponding replay code lives. Done that way. Looks pretty sane to me. The only thing that seems worth discussing is whether to put the smgr-related XLOG record declarations in storage.h as you have here, or to invent a new, more private header for them. There are XLOG record declarations cluttering a lot of other fairly public headers; but given that we've invented files like heapam_xlog.h of late, maybe we should start trying to push those declarations to less-visible spots. Or maybe it's not worth the trouble. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 01:59:04AM -0800, Jeff Davis wrote: On Wed, 2012-11-21 at 15:27 +, Simon Riggs wrote: It would be useful if we issued a NOTICE when an ambiguity is introduced, rather than when using it. Like Bison's reporting of reduce conflicts. This brings up a very important point, which is that a lot of the code is frozen in applications yet invisible at DDL time. So we have to be careful that DDL changes have a reasonable impact on the ability to continue to compile and execute the previously-working SQL received from the applications. In other words, as I said in another reply, we want to avoid cases where something seemingly innocuous (like creating a function) causes previously-working SQL to fail due to ambiguity. As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not suggesting that. And I know that creating a function can already cause previously-working SQL to fail. I'm just saying we should be careful of these situations and not make them more likely than necessary. For me this highlights why looking at how application languages handle overloading might not be as relevant --- most languages don't have possibly-conflicting functions being created at run-time like a database does. The parallels in how other databases treat overloading are relevant. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] MySQL search query is not executing in Postgres DB
2012/11/27 Bruce Momjian br...@momjian.us: On Tue, Nov 27, 2012 at 01:59:04AM -0800, Jeff Davis wrote: On Wed, 2012-11-21 at 15:27 +, Simon Riggs wrote: It would be useful if we issued a NOTICE when an ambiguity is introduced, rather than when using it. Like Bison's reporting of reduce conflicts. This brings up a very important point, which is that a lot of the code is frozen in applications yet invisible at DDL time. So we have to be careful that DDL changes have a reasonable impact on the ability to continue to compile and execute the previously-working SQL received from the applications. In other words, as I said in another reply, we want to avoid cases where something seemingly innocuous (like creating a function) causes previously-working SQL to fail due to ambiguity. As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not suggesting that. And I know that creating a function can already cause previously-working SQL to fail. I'm just saying we should be careful of these situations and not make them more likely than necessary. For me this highlights why looking at how application languages handle overloading might not be as relevant --- most languages don't have possibly-conflicting functions being created at run-time like a database does. The parallels in how other databases treat overloading are relevant. It is a basic problem - PostgreSQL has unique possibilities (polymorphic parameters), and almost all other databases don't support overloading. Our system is probably most similar to Haskell's. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] [PATCH] binary heap implementation
[ Sorry for the slow response on this, Thanksgiving interfered. ] On Wed, Nov 21, 2012 at 3:41 PM, Andres Freund and...@2ndquadrant.com wrote: One very minor nitpick I unfortunately just found now, not sure when that changed: binaryheap_replace_first() hardcodes the indices for the left/right node of the root node. I would rather have it use (left|right)_offset(0). Hmm, yeah... but come to think of it, why do we need that special case at all? Why not just call sift_down on the root node and call it good? See the attached version, which simplifies the code considerably and also makes some comment adjustments per Abhijit's comments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company binaryheap-rmh4.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Andres Freund and...@2ndquadrant.com writes: On 2012-11-26 16:39:39 -0500, Tom Lane wrote: I looked at this a bit. I am very unhappy with the proposed kluge to open indexes NoLock in some places. Even if that's safe today, which I don't believe in the least, any future change in this area could break it. I am not happy with it either. But I failed to see a better way. Note that in most of the cases a lock actually is acquired shortly afterwards, just a -indisvalid or an -indisready check away. I think you're missing the point. The proposed patch will result in RelationGetIndexAttrBitmap trying to do index_open() on an index that might be getting dropped *right now* by a concurrent DROP INDEX CONCURRENTLY. If the DROP commits while index_open is running, its SnapshotNow catalog searches will stop finding relevant rows. From the point of view of index_open, the relation data structure is then corrupt (for example, it might not find pg_attribute entries for all the columns) and so it will throw an error. We need to fix things so that the final pre-deletion state of the pg_index row tells observer backends not to even try to open the index. (Thus, it will not matter whether they come across the pg_index row just before or just after the deleting transaction commits --- either way, they don't try to touch the index.) So that has to be a different state from the initial state used during CREATE INDEX CONCURRENTLY. The point of not wanting to open the index NoLock (and for that matter of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it thinks nobody is touching the index) is to make sure that if somehow somebody is touching the index anyway, they don't see the index's catalog entries as corrupt. They'll either all be there or all not. It's only a belt-and-suspenders safety measure, but I don't want to give it up. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 10:52 AM, Pavel Stehule pavel.steh...@gmail.com wrote: it is a basic problem - PostgreSQL has unique possibilities (polymorphic parameters), and almost all other databases don't support overloading Speaking of polymorphism, why not just implement lpad()'s first argument as 'anyelement'? ISTM this comes up mostly in porting code from other databases that utilize standard SQL functions. This should be appropriate when the function's basic functionality and argument signature are not dependent on input type (contrast: to_timestamp) and there is a good portability case to be made. Essentially, this applies to a handful of string processing routines AFAICT. merlin
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On Tue, Nov 27, 2012 at 12:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: The point of not wanting to open the index NoLock (and for that matter of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it thinks nobody is touching the index) is to make sure that if somehow somebody is touching the index anyway, they don't see the index's catalog entries as corrupt. They'll either all be there or all not. It's only a belt-and-suspenders safety measure, but I don't want to give it up. +1. There's a whole crapload of commits that I did for 9.2 with commit messages like improve concurrent DDL in case XYZ. A lot of that had to do with fixing cases where we were examining system catalogs in unsafe ways before locks had been taken. I didn't manage to fix them all, unfortunately, but it's significantly better than it used to be, and I'd really like it if we could try not to go backwards. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 4:46 AM, Jeff Davis pg...@j-davis.com wrote: Let's say you have only one function foo. All your queries are coded into your application, and everything works fine, using assignment casts where necessary. Then the user is foolish enough to CREATE FUNCTION foo... and now their queries start failing left and right. In other words, "only one possible candidate exists" should be followed by "right now" to be more precise. That's a major violation of the principle of least astonishment, that CREATE FUNCTION could cause such a disaster. I know that it can now, but what you're proposing will come into play much more frequently because most people start off with just one function by a particular name and define more as needed. I admit that there are cases where this could happen, and that it will happen a little more than it does now. But, as you say, this can happen now, and yet we get very few if any complaints about it, whereas we get regular complaints about the need to insert casts that other database systems do not require. The fact is that most functions are not overloaded, so the esoterica of overloading affect only a tiny number of relatively sophisticated users. The need for extra casts cuts a much broader swath through our user base. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-27 12:02:09 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-11-26 16:39:39 -0500, Tom Lane wrote: I looked at this a bit. I am very unhappy with the proposed kluge to open indexes NoLock in some places. Even if that's safe today, which I don't believe in the least, any future change in this area could break it. I am not happy with it either. But I failed to see a better way. Note that in most of the cases a lock actually is acquired shortly afterwards, just a -indisvalid or an -indisready check away. I think you're missing the point. True. The proposed patch will result in RelationGetIndexAttrBitmap trying to do index_open() on an index that might be getting dropped *right now* by a concurrent DROP INDEX CONCURRENTLY. If the DROP commits while index_open is running, its SnapshotNow catalog searches will stop finding relevant rows. From the point of view of index_open, the relation data structure is then corrupt (for example, it might not find pg_attribute entries for all the columns) and so it will throw an error. We need to fix things so that the final pre-deletion state of the pg_index row tells observer backends not to even try to open the index. (Thus, it will not matter whether they come across the pg_index row just before or just after the deleting transaction commits --- either way, they don't try to touch the index.) So that has to be a different state from the initial state used during CREATE INDEX CONCURRENTLY. The point of not wanting to open the index NoLock (and for that matter of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it thinks nobody is touching the index) is to make sure that if somehow somebody is touching the index anyway, they don't see the index's catalog entries as corrupt. They'll either all be there or all not. It's only a belt-and-suspenders safety measure, but I don't want to give it up. 
So the idea would be to skip about-to-be-dropped indexes in RelationGetIndexList directly - we don't ever need to watch those, not even for HOT - because we only have the necessary knowledge there. The normal valid/ready checks will be done at the callsites of RelationGetIndexList(). But those checks can be done in a locked manner now. Are you already working on a fix? Or shall I work on an updated patch? I vote for introducing wrapper functions/macros to do the about-to-be-dropped check; it's hard enough to understand as-is. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] MySQL search query is not executing in Postgres DB
Robert Haas robertmh...@gmail.com writes: I admit that there are cases where this could happen, and that it will happen a little more than it does now. But, as you say, this can happen now, and yet we get very few if any complaints about it, whereas we get regular complaints about the need to insert casts that other database systems do not require. The fact is that most functions are not overloaded, so the esoterica of overloading affect only a tiny number of relatively sophisticated users. The need for extra casts cuts a much broader swath through our user base. I find this argument a bit specious. It probably is true that most *user defined* functions aren't overloaded --- but that's not so true for system-defined functions, and even less true for operators. So the parser's behavior with overloaded calls affects all users, whether they know it or not. It also affects developers, in that adding a new overloaded version of a system function (that previously wasn't overloaded) could actually reduce the number of cases for which the function works without an explicit cast. We have got to be really careful with changing the parser's behavior here, or we're going to break cases that work today. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Problem Observed in behavior of Create Index Concurrently and Hot Update
Andres Freund and...@2ndquadrant.com writes: So the idea would be to skip about-to-be-dropped indexes in RelationGetIndexList directly - we don't ever need to watch those, not even for HOT - because we only have the necessary knowledge there. The normal valid/ready checks will be done at the callsites of RelationGetIndexList(). But those checks can be done in a locked manner now. Right. Are you already working on a fix? Or shall I work on an updated patch? I'll work on it. I vote for introducing wrapper functions/macros to do the about-to-be-dropped check; it's hard enough to understand as-is. Meh. If it's only going to be done in RelationGetIndexList, I'm not sure that a wrapper macro is worth the trouble. If we needed the test in multiple places I'd agree, but what other spots do you see? regards, tom lane
[HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-27 12:50:36 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I vote for introducing wrapper functions/macros to do the about-to-be-dropped check; it's hard enough to understand as-is. Meh. If it's only going to be done in RelationGetIndexList, I'm not sure that a wrapper macro is worth the trouble. If we needed the test in multiple places I'd agree, but what other spots do you see? I don't really see any other querying locations - but such a macro would make the code easier to backpatch when we introduce a third column for the about-to-be-dropped case. Anyway, I don't feel all too strongly about it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund and...@2ndquadrant.com writes: On 2012-11-26 22:44:49 -0500, Tom Lane wrote: I looked through the code a bit, and I think we should be able to make this work, but note there are also places that update indcheckxmin using heap_update, and that's just as dangerous as changing the other two flags via heap_update, if you don't have exclusive lock on the table. Isn't setting indcheckxmin already problematic in the case of CREATE INDEX CONCURRENTLY? index_build already runs in a separate transaction there. Yeah, you are right, except that AFAICS indcheckxmin is really only needed for regular non-concurrent CREATE INDEX, which needs it because it commits without waiting for readers that might be bothered by broken HOT chains. In a concurrent CREATE INDEX, we handle that problem by waiting out all such readers before setting indisvalid. So the concurrent code path should not be bothering to set indcheckxmin at all, I think. (This is underdocumented.) Looking closer at the comment in reindex_index, what it's really full of angst about is that simple_heap_update will update the tuple's xmin *when we would rather that it didn't*. So switching to update-in-place there will solve a problem, not create one. In short, all flag changes in pg_index should be done by update-in-place, and the tuple's xmin will never change from the original creating transaction (until frozen). What we want the xmin to do, for indcheckxmin purposes, is reflect the time at which the index started to be included in HOT-safety decisions. Since we're never going to move the xmin, that means that *we cannot allow REINDEX to mark a dead index live again*. Once DROP INDEX CONCURRENTLY has reached the final state, you can't revalidate the index by reindexing it; you'll have to drop it and then make a brand new one. That seems like a pretty minor compromise.
What I propose to do next is create a patch for HEAD that includes a new pg_index flag column, since I think the logic will be clearer with that. Once we're happy with that, we can back-port the patch into a form that uses the existing flag columns. Anybody feel like bikeshedding the flag column name? I'm thinking indislive but maybe there's a better name. Note: I'm not impressed by the proposal to replace these columns with a single integer flag column. Aside from any possible incompatibility with existing client code, it just isn't going to be easy to read the index's state manually if we do that. We could maybe dodge that complaint with a char (pseudo-enum) status column, but I don't think that will simplify the code at all, and it's got the same or worse compatibility issues. Btw, even if we manage to find a sensible fix for this I would suggest postponing it until after the next back branch release. AFAICS this is a data loss/corruption problem, and as such a must-fix. If we can't have it done by next week, I'd rather postpone the releases until it is done. regards, tom lane
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, 2012-11-27 at 12:19 -0500, Robert Haas wrote: I admit that there are cases where this could happen, and that it will happen a little more than it does now. But, as you say, this can happen now, and yet we get very few if any complaints about it, whereas we get regular complaints about the need to insert casts that other database systems do not require. The fact is that most functions are not overloaded, so the esoterica of overloading affect only a tiny number of relatively sophisticated users. The need for extra casts cuts a much broader swath through our user base. Well, I did offer a suggestion that would make your idea safer, which is to explicitly opt out of the overloading feature at the time the function is created, rather than making it implicit based on how many functions happen to have the same name. The fact that it can only hurt sophisticated users is not convincing to me. For one thing, our users are programmers, so they should all feel comfortable defining their own functions, and I don't want to make them any less so. Next, sophisticated users also make mistakes. I could also make a security argument. Even today, any user who can create a function in your search path can make your queries start failing. If we locked down most of the system-defined functions as non-overloadable, and allowed users to do the same for their functions (maybe even the default one day?), then that would greatly reduce the exposure. The current strictness of the overloaded functions tends to make users more explicit about argument types, which reduces the chance of problems at the expense of usability and compatibility. Not ideal, but if we make it more permissive then we are permanently stuck with less information about what types the user intended and which function they intended to call. In such an extensible system, that worries me on several fronts. 
That being said, I'm not outright in opposition to the idea of making improvements like this, I just think we should do so cautiously. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Enabling frontend-only xlog desc routines
I mentioned the remaining issues in a previous email (see message-id 20121025161751.ge6...@alvh.no-ip.org). Attached is a patch that enables xlogdump to #include xlog_internal.h by way of removing that file's inclusion of fmgr.h, which is problematic. I don't think this should be too contentious. The other interesting question remaining is what to do about the rm_desc function in rmgr.c. We're split between these two ideas:

1. Have this in rmgr.c:

    #ifdef FRONTEND
    #define RMGR_REDO_FUNC(func) NULL
    #else
    #define RMGR_REDO_FUNC(func) func
    #endif /* FRONTEND */

   and then use RMGR_REDO_FUNC() in the table.

2. Have this in rmgr.c:

    #ifndef RMGR_REDO_FUNC
    #define RMGR_REDO_FUNC(func) func
    #endif

   And then have the xlogdump Makefile use -D to define a suitable RMGR_REDO_FUNC.

Opinions please?

-- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

commit 74986720979c1c2bb39a58133563fd8da82c301b
Author: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date:   Tue Nov 27 15:33:18 2012 -0300

    Split SQL-callable function declarations into xlog_fn.h

    This lets xlog_internal.h go without including fmgr.h, which is useful
    to let it compile in a frontend-only environment.
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index d345761..40c0bd6 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -18,6 +18,7 @@
 #include "access/htup_details.h"
 #include "access/xlog.h"
+#include "access/xlog_fn.h"
 #include "access/xlog_internal.h"
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
diff --git a/src/include/access/xlog_fn.h b/src/include/access/xlog_fn.h
new file mode 100644
index 000..65376fe
--- /dev/null
+++ b/src/include/access/xlog_fn.h
@@ -0,0 +1,35 @@
+/*
+ * xlog_fn.h
+ *
+ * PostgreSQL transaction log SQL-callable function declarations
+ *
+ * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/access/xlog_fn.h
+ */
+#ifndef XLOG_FN_H
+#define XLOG_FN_H
+
+#include "fmgr.h"
+
+extern Datum pg_start_backup(PG_FUNCTION_ARGS);
+extern Datum pg_stop_backup(PG_FUNCTION_ARGS);
+extern Datum pg_switch_xlog(PG_FUNCTION_ARGS);
+extern Datum pg_create_restore_point(PG_FUNCTION_ARGS);
+extern Datum pg_current_xlog_location(PG_FUNCTION_ARGS);
+extern Datum pg_current_xlog_insert_location(PG_FUNCTION_ARGS);
+extern Datum pg_last_xlog_receive_location(PG_FUNCTION_ARGS);
+extern Datum pg_last_xlog_replay_location(PG_FUNCTION_ARGS);
+extern Datum pg_last_xact_replay_timestamp(PG_FUNCTION_ARGS);
+extern Datum pg_xlogfile_name_offset(PG_FUNCTION_ARGS);
+extern Datum pg_xlogfile_name(PG_FUNCTION_ARGS);
+extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
+extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
+extern Datum pg_is_in_backup(PG_FUNCTION_ARGS);
+extern Datum pg_backup_start_time(PG_FUNCTION_ARGS);
+
+#endif   /* XLOG_FN_H */
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index b70a620..5802d1d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -17,7 +17,6 @@
 #define XLOG_INTERNAL_H
 
 #include "access/xlog.h"
-#include "fmgr.h"
 #include "pgtime.h"
 #include "storage/block.h"
 #include "storage/relfilenode.h"
@@ -253,26 +252,4 @@ extern bool XLogArchiveCheckDone(const char *xlog);
 extern bool XLogArchiveIsBusy(const char *xlog);
 extern void XLogArchiveCleanup(const char *xlog);
 
-/*
- * These aren't in xlog.h because I'd rather not include fmgr.h there.
- */
-extern Datum pg_start_backup(PG_FUNCTION_ARGS);
-extern Datum pg_stop_backup(PG_FUNCTION_ARGS);
-extern Datum pg_switch_xlog(PG_FUNCTION_ARGS);
-extern Datum pg_create_restore_point(PG_FUNCTION_ARGS);
-extern Datum pg_current_xlog_location(PG_FUNCTION_ARGS);
-extern Datum pg_current_xlog_insert_location(PG_FUNCTION_ARGS);
-extern Datum pg_last_xlog_receive_location(PG_FUNCTION_ARGS);
-extern Datum pg_last_xlog_replay_location(PG_FUNCTION_ARGS);
-extern Datum pg_last_xact_replay_timestamp(PG_FUNCTION_ARGS);
-extern Datum pg_xlogfile_name_offset(PG_FUNCTION_ARGS);
-extern Datum pg_xlogfile_name(PG_FUNCTION_ARGS);
-extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
-extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
-extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
-extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
-extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
-extern Datum pg_is_in_backup(PG_FUNCTION_ARGS);
-extern Datum pg_backup_start_time(PG_FUNCTION_ARGS);
-
 #endif   /* XLOG_INTERNAL_H */
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 12:47 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> I admit that there are cases where this could happen, and that it will
>> happen a little more than it does now.  But, as you say, this can
>> happen now, and yet we get very few if any complaints about it,
>> whereas we get regular complaints about the need to insert casts that
>> other database systems do not require.  The fact is that most
>> functions are not overloaded, so the esoterica of overloading affect
>> only a tiny number of relatively sophisticated users.  The need for
>> extra casts cuts a much broader swath through our user base.
>
> I find this argument a bit specious.  It probably is true that most
> *user defined* functions aren't overloaded --- but that's not so true
> for system-defined functions, and even less true for operators.  So the
> parser's behavior with overloaded calls affects all users, whether they
> know it or not.  It also affects developers, in that adding a new
> overloaded version of a system function (that previously wasn't
> overloaded) could actually reduce the number of cases for which the
> function works without an explicit cast.  We have got to be really
> careful with changing the parser's behavior here, or we're going to
> break cases that work today.

Well, the whole point of writing the patch the way I did was that it
*doesn't* break any cases that work today.

But as to your point about the system catalogs, it is true that adding
an additional function could reduce the number of cases where things
work today.  But I think in many cases it would eliminate the need for
overloading that we already have, and simplify things for future
developers.

Right now, quote_literal() allows implicit casts to text by having a
second version that takes an anyelement argument; on the other hand,
concat() allows implicit casts to text by accepting "any" rather than
text as an argument; and || allows implicit casts to text by defining
operators for anynonarray || text, text || anynonarray, and text ||
text.  So we've got three quite different methods to create
implicit-cast-to-text behavior in particular cases.  That's got
developer complexity too, and while this proposal wouldn't do anything
about the third case, since || actually sometimes has a different
meaning, namely array concatenation, the first two wouldn't need
overloading any more.  They'd just work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-27 13:45:08 -0500, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
>> On 2012-11-26 22:44:49 -0500, Tom Lane wrote:
>>> I looked through the code a bit, and I think we should be able to
>>> make this work, but note there are also places that update
>>> indcheckxmin using heap_update, and that's just as dangerous as
>>> changing the other two flags via heap_update, if you don't have
>>> exclusive lock on the table.
>
>> Isn't setting indcheckxmin already problematic in the case of CREATE
>> INDEX CONCURRENTLY?  index_build already runs in a separate
>> transaction there.
>
> Yeah, you are right, except that AFAICS indcheckxmin is really only
> needed for regular non-concurrent CREATE INDEX, which needs it because
> it commits without waiting for readers that might be bothered by broken
> HOT chains.  In a concurrent CREATE INDEX, we handle that problem by
> waiting out all such readers before setting indisvalid.  So the
> concurrent code path should not be bothering to set indcheckxmin at
> all, I think.  (This is underdocumented.)

Seems to be correct to me.

> Looking closer at the comment in reindex_index, what it's really full
> of angst about is that simple_heap_update will update the tuple's xmin
> *when we would rather that it didn't*.  So switching to update-in-place
> there will solve a problem, not create one.

It strikes me that the whole idea of reusing xmin when indcheckxmin is
set is overly clever and storing the xid itself would have been
better... Too late though.

> In short, all flag changes in pg_index should be done by
> update-in-place, and the tuple's xmin will never change from the
> original creating transaction (until frozen).

Hm. That doesn't sound that easy to guarantee. Several ALTER TABLE and
ALTER INDEX operations are expected to work transactionally right now.
As far as I can see there is nothing that prohibits an
indcheckxmin-requiring index from being altered while indcheckxmin is
still required?

> What we want the xmin to do, for indcheckxmin purposes, is reflect the
> time at which the index started to be included in HOT-safety decisions.
> Since we're never going to move the xmin, that means that *we cannot
> allow REINDEX to mark a dead index live again*.  Once DROP INDEX
> CONCURRENTLY has reached the final state, you can't revalidate the
> index by reindexing it, you'll have to drop it and then make a brand
> new one.  That seems like a pretty minor compromise.

That would be a regression compared with the current state though. We
have officially documented REINDEX as a way out of INVALID indexes...

If we store the xid of the reindexing transaction there, that might be
pessimal (because there should be no HOT-safety problems) but should
always be correct, or am I missing something?

> What I propose to do next is create a patch for HEAD that includes a
> new pg_index flag column, since I think the logic will be clearer with
> that.  Once we're happy with that, we can back-port the patch into a
> form that uses the existing flag columns.
>
> Anybody feel like bikeshedding the flag column name?  I'm thinking
> "indislive" but maybe there's a better name.

I personally would slightly favor indisdead instead...

>> Btw, even if we manage to find a sensible fix for this I would suggest
>> postponing it after the next back branch release.
>
> AFAICS this is a data loss/corruption problem, and as such a must fix.
> If we can't have it done by next week, I'd rather postpone the releases
> until it is done.

Ok, just seemed like a rather complex fix in a short time for something
that seemingly hasn't been noticed since 8.3. I am a bit worried about
introducing something worse while fixing this.

Greetings,

Andres Freund

-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Problem Observed in behavior of Create Index Concurrently and Hot Update
BTW, I was thinking that the DROP INDEX CONCURRENTLY logic needed to be:

1. Unset indisvalid, commit, wait out all reading transactions.
2. Unset indisready, commit, wait out all writing transactions.
3. Unset indislive, commit (with parent table relcache flush), wait out
   all reading-or-writing transactions.
4. Drop the index.

However, I wonder whether we couldn't combine steps 2 and 3, ie once
there are no readers of the index go directly to the dead state.  I
don't see a need for a period where the index isn't being inserted into
but is still used for HOT-safety decisions.

			regards, tom lane
[HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On 2012-11-27 14:08:13 -0500, Tom Lane wrote:
> BTW, I was thinking that the DROP INDEX CONCURRENTLY logic needed to be:
>
> 1. Unset indisvalid, commit, wait out all reading transactions.
> 2. Unset indisready, commit, wait out all writing transactions.
> 3. Unset indislive, commit (with parent table relcache flush), wait out
>    all reading-or-writing transactions.
> 4. Drop the index.
>
> However, I wonder whether we couldn't combine steps 2 and 3, ie once
> there are no readers of the index go directly to the dead state.  I
> don't see a need for a period where the index isn't being inserted into
> but is still used for HOT-safety decisions.

I think you're right, that state isn't interesting for anyone.

Greetings,

Andres Freund

-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 1:45 PM, Jeff Davis <pg...@j-davis.com> wrote:
> On Tue, 2012-11-27 at 12:19 -0500, Robert Haas wrote:
>> I admit that there are cases where this could happen, and that it will
>> happen a little more than it does now.  But, as you say, this can
>> happen now, and yet we get very few if any complaints about it,
>> whereas we get regular complaints about the need to insert casts that
>> other database systems do not require.  The fact is that most
>> functions are not overloaded, so the esoterica of overloading affect
>> only a tiny number of relatively sophisticated users.  The need for
>> extra casts cuts a much broader swath through our user base.
>
> Well, I did offer a suggestion that would make your idea safer, which
> is to explicitly opt out of the overloading feature at the time the
> function is created, rather than making it implicit based on how many
> functions happen to have the same name.
>
> The fact that it can only hurt sophisticated users is not convincing to
> me.  For one thing, our users are programmers, so they should all feel
> comfortable defining their own functions, and I don't want to make them
> any less so.  Next, sophisticated users also make mistakes.
>
> I could also make a security argument.  Even today, any user who can
> create a function in your search path can make your queries start
> failing.  If we locked down most of the system-defined functions as
> non-overloadable, and allowed users to do the same for their functions
> (maybe even the default one day?), then that would greatly reduce the
> exposure.
>
> The current strictness of the overloaded functions tends to make users
> more explicit about argument types, which reduces the chance of
> problems at the expense of usability and compatibility.  Not ideal, but
> if we make it more permissive then we are permanently stuck with less
> information about what types the user intended and which function they
> intended to call.  In such an extensible system, that worries me on
> several fronts.
>
> That being said, I'm not outright in opposition to the idea of making
> improvements like this, I just think we should do so cautiously.

Fair enough.  I certainly admit that I wouldn't like to release with
this code in place and then find out that it's got some critical flaw,
security or otherwise.  A couple of embarrassing bugs have been found
recently in patches I wrote and committed, and I'm not looking to up
that number.

That having been said, I remain unconvinced that any of the things
proposed so far are compelling reasons not to do this.  That doesn't
mean there aren't any such reasons, but I am personally unconvinced that
we've found them yet.  Most of the arguments so far advanced seem to
involve overloading (where this proposal doesn't change anything vs.
today); I think you're the only one who has proposed a situation where
it causes a problem (namely, a function that is overloaded later), but
in my personal opinion that's not going to happen often enough to
justify the amount of user pain the current system imposes.  Of course
that's a judgement call.

I do think that applying some kind of explicit flag to the function
indicating whether it should allow implicit assignment casting/implicit
casting to text/overloading/whatever is a possibly interesting
alternative.  It seems clear from our system catalogs that implicit
casting to text is sometimes a desirable behavior and sometimes not, so
it's reasonable to think that perhaps we should put that under user
control.  What I like about my proposal (really Tom's idea) is that it
seems like it solves a pretty high percentage of the problem cases
without requiring any explicit user action.

I actually suspect we could get the right behavior even more often by
attaching flags to the function or argument position, but that would
also put more of the onus on the user to get the flags right -- and we
might not even agree amongst ourselves on how the flags should be set.
The fact that quote_literal() allows (by the expedient of overloading)
implicit casts to text and that lpad() does not seems fairly random to
me in hindsight; is there a general principle there that we'd all sign
on to?

The nice thing about this proposal is that it doesn't require any
explicit user action.  Of course that's no help if it does the wrong
thing, but since it only fixes cases that are unambiguous and which
currently fail, it's hard for me to see how that's a real danger.  That
doesn't mean there ISN'T a real danger, but I want to make sure that if
we don't do this we have a clear and understandable reason, and not just
bad memories of the last time we made a change in this area.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] MySQL search query is not executing in Postgres DB
Robert Haas <robertmh...@gmail.com> writes:
> But as to your point about the system catalogs, it is true that adding
> an additional function could reduce the number of cases where things
> work today.  But I think in many cases it would eliminate the need for
> overloading that we already have, and simplify things for future
> developers.
>
> Right now, quote_literal() allows implicit casts to text by having a
> second version that takes an anyelement argument; on the other hand,
> concat() allows implicit casts to text by accepting "any" rather than
> text as an argument; and || allows implicit casts to text by defining
> operators for anynonarray || text, text || anynonarray, and text ||
> text.  So we've got three quite different methods to create
> implicit-cast-to-text behavior in particular cases.  That's got
> developer complexity too, and while this proposal wouldn't do anything
> about the third case, since || actually sometimes has a different
> meaning, namely array concatenation, the first two wouldn't need
> overloading any more.  They'd just work.

Uh, no, not really, and I think that assertion just goes to show that
this area is more subtle than you think.  quote_literal() for instance
presently works for any datatype that has an explicit cast to text.
After making the change you propose above, it would only work for types
for which the cast was assignment-grade or less.  concat() is even
looser: as now implemented, it works for *anything at all*, because it
relies on datatype output functions, not casts to text.  I'm dubious
that that inconsistency is a good thing, actually, but that's how the
committed code is written.

Now, some of us might think that backing these conversions down to only
allowing assignment-grade casts would be an improvement, in the sense
that it would actually make the type system tighter, not looser, than it
is today for these particular functions.  But I suspect you wouldn't see
it as an improvement, given the position you're arguing from.

In fact, I'm afraid that making this change would result in requests to
downgrade existing explicit casts to be assignment-only, so that people
could be even lazier about not casting function arguments; and that is
something up with which I will not put.

			regards, tom lane
Re: [HACKERS] WIP json generation enhancements
On 11/26/2012 12:31 PM, Robert Haas wrote:
> On Mon, Nov 26, 2012 at 11:43 AM, Andrew Dunstan <and...@dunslane.net> wrote:
>> I don't understand why you would want to create such a cast.  If the
>> cast doesn't exist it will do exactly what it does now, i.e. use the
>> type's output function and then json quote and escape it, which in the
>> case of citext is the Right Thing (tm):
>>
>> andrew=# select to_json('foo"bar'::citext);
>>    to_json
>> ------------
>>  "foo\"bar"
>
> I'm not sure either, but setting up a system where seemingly innocuous
> actions can in fact have surprising and not-easily-fixable consequences
> in other parts of the system doesn't seem like good design to me.  Of
> course, maybe I'm misunderstanding what will happen; I haven't actually
> tested it myself.

I'm all for caution, but the argument here seems a bit nebulous.  We
could create a sort of auxiliary type, as has been previously suggested,
that would be in all respects the same as the json type except that it
would be the target of the casts that would be used in to_json() and
friends.  But that's a darned ugly thing to have to do, so I'd want a
good concrete reason for doing it.  Right now I'm having a hard time
envisioning a problem that could be caused by just using the
straightforward solution that's in my latest patch.

cheers

andrew
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund <and...@2ndquadrant.com> writes:
> On 2012-11-27 13:45:08 -0500, Tom Lane wrote:
>> In short, all flag changes in pg_index should be done by
>> update-in-place, and the tuple's xmin will never change from the
>> original creating transaction (until frozen).
>
> Hm. That doesn't sound that easy to guarantee. Several ALTER TABLE and
> ALTER INDEX operations are expected to work transactionally right now.
> As far as I can see there is nothing that prohibits an
> indcheckxmin-requiring index from being altered while indcheckxmin is
> still required?

I said "in pg_index".  There is no reason to ever alter an index's
pg_index entry transactionally, because we don't support redefining the
index columns.  The stuff you are allowed to ALTER is pretty much
irrelevant to the index's life as an index.

>> What we want the xmin to do, for indcheckxmin purposes, is reflect the
>> time at which the index started to be included in HOT-safety
>> decisions.  Since we're never going to move the xmin, that means that
>> *we cannot allow REINDEX to mark a dead index live again*.
>
> That would be a regression compared with the current state though. We
> have officially documented REINDEX as a way out of INVALID indexes...

It's a way out of failed CREATE operations.  If DROP fails at the last
step, you won't be able to go back, but why would you want to?  Just do
the DROP again.

>> Anybody feel like bikeshedding the flag column name?  I'm thinking
>> "indislive" but maybe there's a better name.
>
> I personally would slightly favor indisdead instead...

Meh ... the other two flags are positive, in the sense of
true-is-the-normal-state, so I thought this one should be too.  I had
also toyed with "indishot", to reflect the idea that this controls
whether the index is included in HOT-safety decisions, but that seems
maybe a little too cute.

>>> Btw, even if we manage to find a sensible fix for this I would
>>> suggest postponing it after the next back branch release.
>
>> AFAICS this is a data loss/corruption problem, and as such a must fix.
>> If we can't have it done by next week, I'd rather postpone the
>> releases until it is done.
>
> Ok, just seemed like a rather complex fix in a short time for something
> that seemingly hasn't been noticed since 8.3. I am a bit worried about
> introducing something worse while fixing this.

Hm?  The fact that the DROP patch broke CREATE INDEX CONCURRENTLY is a
new and very nasty bug in 9.2.  I would agree with you if we were
considering the unsafe-row-update problem alone, but it seems like we
might as well fix both aspects while we're looking at this code.

There is a question of whether we should risk trying to back-patch the
in-place-update changes further than 9.2.  Given the lack of complaints
I'm inclined not to, but could be persuaded differently.

			regards, tom lane
Re: [HACKERS] plpgsql_check_function - rebase for 9.3
Hello

I am sending an updated version - now it is prepared for event triggers
and it is a little bit more robust.

I ran pgindent, but I have no experience with it, so I am not sure about
success.

Regards

Pavel Stehule

2012/10/7 Selena Deckelmann <sel...@chesnok.com>:
> Hi!
>
> On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>> I am sending a lightly refreshed patch for checking plpgsql functions.
>> I checked different implementations, but without success: a) enhancing
>> of SPI to some fake mode can have negative impact on applications, and
>> the patch was not clear, b) a generic plpgsql walker doesn't save
>> lines too.  I invite any ideas how to improve this patch.
>
> I reviewed this and did a clean up for bitrot and a little whitespace.
> In particular, it needed to learn a little about event triggers.
>
> This patch is a follow on from an earlier review thread I found:
> http://archives.postgresql.org/message-id/d960cb61b694cf459dcfb4b0128514c2072df...@exadv11.host.magwien.gv.at
>
> I dug through that thread a bit, and I believe issues raised by
> Laurenz, Petr and Alvaro were resolved by Pavel over time.
>
> All tests pass, and after a read-through, the code seems fine.
>
> This also represents my inaugural use of pg_bsd_indent.  I ran it on
> pl_check.c - which made things mostly better.  Happy to try and fix it
> up more if someone can explain to me what (if anything) I did
> incorrectly when using it.
>
> -selena
>
> --
> http://chesnok.com

plpgsql_check_function_20121127_01.patch.gz
Description: GNU Zip compressed data
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-27 14:41:34 -0500, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
>> On 2012-11-27 13:45:08 -0500, Tom Lane wrote:
>>> In short, all flag changes in pg_index should be done by
>>> update-in-place, and the tuple's xmin will never change from the
>>> original creating transaction (until frozen).
>
>> Hm. That doesn't sound that easy to guarantee. Several ALTER TABLE and
>> ALTER INDEX operations are expected to work transactionally right now.
>> As far as I can see there is nothing that prohibits an
>> indcheckxmin-requiring index from being altered while indcheckxmin is
>> still required?
>
> I said "in pg_index".  There is no reason to ever alter an index's
> pg_index entry transactionally, because we don't support redefining the
> index columns.  The stuff you are allowed to ALTER is pretty much
> irrelevant to the index's life as an index.

Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY ...
USING someindex; is done?  Also I think indoption might be written to as
well.

>>> What we want the xmin to do, for indcheckxmin purposes, is reflect
>>> the time at which the index started to be included in HOT-safety
>>> decisions.  Since we're never going to move the xmin, that means that
>>> *we cannot allow REINDEX to mark a dead index live again*.
>
>> That would be a regression compared with the current state though. We
>> have officially documented REINDEX as a way out of INVALID indexes...
>
> It's a way out of failed CREATE operations.  If DROP fails at the last
> step, you won't be able to go back, but why would you want to?  Just do
> the DROP again.

Oh, sorry, misunderstood you.

>>> Anybody feel like bikeshedding the flag column name?  I'm thinking
>>> "indislive" but maybe there's a better name.
>
>> I personally would slightly favor indisdead instead...
>
> Meh ... the other two flags are positive, in the sense of
> true-is-the-normal-state, so I thought this one should be too.

Good point.

> I had also toyed with "indishot", to reflect the idea that this
> controls whether the index is included in HOT-safety decisions, but
> that seems maybe a little too cute.

indislive seems better than that, yes.

>>> Btw, even if we manage to find a sensible fix for this I would
>>> suggest postponing it after the next back branch release.
>
>> AFAICS this is a data loss/corruption problem, and as such a must fix.
>> If we can't have it done by next week, I'd rather postpone the
>> releases until it is done.
>
>> Ok, just seemed like a rather complex fix in a short time for
>> something that seemingly hasn't been noticed since 8.3. I am a bit
>> worried about introducing something worse while fixing this.
>
> Hm?  The fact that the DROP patch broke CREATE INDEX CONCURRENTLY is a
> new and very nasty bug in 9.2.  I would agree with you if we were
> considering the unsafe-row-update problem alone, but it seems like we
> might as well fix both aspects while we're looking at this code.
>
> There is a question of whether we should risk trying to back-patch the
> in-place-update changes further than 9.2.  Given the lack of complaints
> I'm inclined not to, but could be persuaded differently.

Oh, I only was talking about the inplace changes.  The DROP INDEX
CONCURRENTLY breakage definitely needs to get backpatched.

Greetings,

Andres Freund

-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, 2012-11-27 at 14:13 -0500, Robert Haas wrote:
> I do think that applying some kind of explicit flag to the function
> indicating whether it should allow implicit assignment casting/implicit
> casting to text/overloading/whatever is a possibly interesting
> alternative.  It seems clear from our system catalogs that implicit
> casting to text is sometimes a desirable behavior and sometimes not, so
> it's reasonable to think that perhaps we should put that under user
> control.  What I like about my proposal (really Tom's idea) is that it
> seems like it solves a pretty high percentage of the problem cases
> without requiring any explicit user action.

What user action are you concerned about?  If we (eventually) made the
non-overloaded case the default, would that resolve your concerns?

Regards,
	Jeff Davis
Re: [HACKERS] MySQL search query is not executing in Postgres DB
Robert Haas <robertmh...@gmail.com> writes:
> I do think that applying some kind of explicit flag to the function
> indicating whether it should allow implicit assignment casting/implicit
> casting to text/overloading/whatever is a possibly interesting
> alternative.

That idea seems possibly worth pursuing.  The thing that I find scary
about the current proposal is that it applies to all functions (and
operators) willy-nilly, which seems to raise the risk of unexpected side
effects pretty high.  If we could confine the behavioral change to a
relatively small number of functions for which there was consensus that
they should accept most anything, I'd feel better about it.  (Of course,
we might then conclude that something close to the quote_literal
solution would work as well as a new function property.  But it's worth
thinking about.)

> The fact that quote_literal() allows (by the expedient of overloading)
> implicit casts to text and that lpad() does not seems fairly random to
> me in hindsight; is there a general principle there that we'd all sign
> on to?

I don't find that random in the slightest.  The entire purpose of
quote_literal is "manufacture a SQL-literal string representation of
this value", and that clearly might apply to data of any type.  lpad()
is, first, last and only, a textual operation.  Somebody who thinks it
should apply directly to an integer is guilty of sloppy thinking at
best, or not even understanding what a data type is at worst.

			regards, tom lane
Re: [HACKERS] MySQL search query is not executing in Postgres DB
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> It also affects developers, in that adding a new overloaded version of
> a system function (that previously wasn't overloaded) could actually
> reduce the number of cases for which the function works without an
> explicit cast.  We have got to be really careful with changing the
> parser's behavior here, or we're going to break cases that work today.

For my 2c - we have to be really careful making changes to the system
functions as well as the parser's behavior.  If we're worried about
users creating overloaded versions of system functions, well, I'd
probably suggest a "don't do that then" kind of approach..

	Thanks,

		Stephen
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Uh, no, not really, and I think that assertion just goes to show that this area is more subtle than you think. quote_literal() for instance presently works for any datatype that has an explicit cast to text. That doesn't appear to be the behavior I'm seeing: rhaas=# select quote_literal(17); quote_literal --- '17' (1 row) rhaas=# select * from pg_cast where castsource = 'int4'::regtype and casttarget = 'text'::regtype; castsource | casttarget | castfunc | castcontext | castmethod ++--+-+ (0 rows) After making the change you propose above, it would only work for types for which the cast was assignment-grade or less. ...but that's everything, because there's a hardcoded exception in the code that dictates that even if there is no entry in pg_cast, an assignment cast to text exists for every data type. concat() is even looser: as now implemented, it works for *anything at all*, because it relies on datatype output functions not casts to text. I'm dubious that that inconsistency is a good thing, actually, but that's how the committed code is written. I argued at the time that CONCAT should take variadic text rather than variadic any and was roundly ignored on the grounds that the implicit casting to text behavior was what everyone wanted in that particular case. My feeling is that we need to adopt a solution to this problem partly so that people don't keep inventing (even in core code!) one-off, hackish solutions that make certain cases behave completely differently from the general rules. Now, some of us might think that backing these conversions down to only allowing assignment-grade casts would be an improvement, in the sense that it would actually make the type system tighter not looser than it is today for these particular functions. But I suspect you wouldn't see it as an improvement, given the position you're arguing from. 
Actually, I think it wouldn't matter a bit, because of the exception that says there's an assignment cast to text for everything. In fact, I'm afraid that making this change would result in requests to downgrade existing explicit casts to be assignment-only, so that people could be even lazier about not casting function arguments; and that is something up with which I will not put. While I'm personally not excited about it, it is certainly imaginable that someone might prefer something like text -> xml to be an assignment cast rather than an explicit cast. But we've got an easy response to that, which is "fine, change it for your database, but we're not changing it in the upstream copy." As a compatibility issue with other databases, it's not really an issue; I can't remember a single complaint about needing an explicit cast from text to xml or integer to boolean or any of the other things that appear in pg_cast with castcontext = 'e'. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 2:59 PM, Jeff Davis pg...@j-davis.com wrote: What user action are you concerned about? If we (eventually) made the non-overloaded case the default, would that resolve your concerns? I can't quite see how a non-overloaded flag would work, unless we get rid of schemas. But I think there are a variety of other kinds of labeling that I think would work. I'm still not sure that's as good as a general solution, because if nothing else it relies on us to make the right decision as to which type to use in each case, and considering that neither Tom nor I are particularly sold on what we did with CONCAT(), nor am I sure that we even agree with each other on what the right thing to do would have been there, I'm a bit skeptical about our ability to get these decisions right. But it might still be an improvement. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
Robert Haas robertmh...@gmail.com writes: On Tue, Nov 27, 2012 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: After making the change you propose above, it would only work for types for which the cast was assignment-grade or less. ...but that's everything, because there's a hardcoded exception in the code that dictates that even if there is no entry in pg_cast, an assignment cast to text exists for every data type. Ugh. I had been thinking that automatic CoerceViaIO casting only happened for explicit casts. If that can be invoked via assignment casts, then what you're proposing really destroys the type system entirely, at least for functions taking text: there is absolutely no argument such a function won't accept. I cannot support this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why can't plpgsql return a row-expression?
Asif Rehman wrote: Hi, I have tried to solve this issue. Please see the attached patch. With this patch, any expression is allowed in the return statement. For any invalid expression an error is generated without doing any special handling. When a row expression is used in the return statement then the resultant tuple will have rowtype in a single column that needs to be extracted. Hence I have handled that case in exec_stmt_return(). Any comments/suggestions? Hmm. We're running an I/O cast during do_tup_convert() now, and looking up the required functions for each tuple. I think if we're going to support this operation at that level, we need to look up the necessary functions at convert_tuples_by_foo level, and then apply unconditionally if they've been set up. Also, what are the implications for existing users of tupconvert? Do we want to apply casting during ANALYZE for example? AFAICS the patch shouldn't break any case that works today, but I don't see that there has been any analysis of this. (I looked at the patch posted in the thread started by Pavel elsewhere. I'm replying to both emails in the interest of keeping things properly linked.) -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 3:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I do think that applying some kind of explicit flag to the function indicating whether it should allow implicit assignment casting/implicit casting to text/overloading/whatever is a possibly interesting alternative. That idea seems possibly worth pursuing. The thing that I find scary about the current proposal is that it applies to all functions (and operators) willy-nilly, which seems to raise the risk of unexpected side effects pretty high. If we could confine the behavioral change to a relatively small number of functions for which there was consensus that they should accept most anything, I'd feel better about it. (Of course, we might then conclude that something close to the quote_literal solution would work as well as a new function property. But it's worth thinking about.) The fact that quote_literal() allows (by the expedient of overloading) implicit casts to text and that lpad() does not seems fairly random to me in hindsight; is there a general principle there that we'd all sign on to? I don't find that random in the slightest. The entire purpose of quote_literal is to manufacture a SQL-literal string representation of this value, and that clearly might apply to data of any type. lpad() is, first, last, and only, a textual operation. Somebody who thinks it should apply directly to an integer is guilty of sloppy thinking at best, or not even understanding what a data type is at worst. Well, considering I made that mistake while working with PostgreSQL 8.2, and considering further that other databases allow it, I'm a little reluctant to accept this theory. I'm willing to bet that the fine folks in Redwood understand what a data type is just fine, and I'm pretty sure that I do, too. Sloppy thinking? Perhaps. 
But I think you could make a perfectly fine argument that the function of lpad() is to concatenate something onto the string representation of a value, or conversely that the function of quote_literal() is to escape a string. You might not agree with either of those arguments but I don't care to label someone who does as an idiot. The problem I have with the explicit labeling approach is that it seems to depend heavily on how you conceptualize what the function is trying to do, and not everyone is going to conceptualize that the same way. Clearly there are a lot of people who expect at least some string operators to work on numbers, including the OP, and are confused when they don't. We can call those people nasty names but that's not going to win us many friends. Anyway, I'm not averse to thinking about some kind of labeling solution but I'm not exactly sure what would work well - and I'd still like to see some hard evidence that the collateral damage from my er your proposal is unacceptably high. The most plausible scenario for how this could break things that has been presented thus far is that someone might create a function, use it with a data type that requires assignment-casting, and then create another function, and have things break. But as Jeff pointed out, that can happen already: in fact, it did, in core, with pg_size_pretty(), and while you had doubts about that change at the time, none of us realized exactly what the failure scenario was until it was too late to change it. Would that kind of thing happen more often under this proposal? Kind of hard to say, but if it made us think twice before overloading system catalog functions, it might even work out to a net positive. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PQconninfo function for libpq
On 11/22/12 6:44 AM, Magnus Hagander wrote: I'm thinking it might be a better idea to just have PG_CONNINFO_NORMAL and PG_CONNINFO_PASSWORD (which still make sense to separate), What is the use case for separating them? and then plug in the exclusion of specific parameters in pg_basebackup, in the CreateRecoveryConf() function. pg_basebackup will of course always know what pg_basebackup is doing with these things. Agreed on that in principle. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PQconninfo function for libpq
On Tue, Nov 27, 2012 at 10:13 PM, Peter Eisentraut pete...@gmx.net wrote: On 11/22/12 6:44 AM, Magnus Hagander wrote: I'm thinking it might be a better idea to just have PG_CONNINFO_NORMAL and PG_CONNINFO_PASSWORD (which still make sense to separate), What is the use case for separating them? Getting a connection string that doesn't contain sensitive information like a password. In order to show it in a dialog box, for example. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund and...@2ndquadrant.com writes: On 2012-11-27 14:41:34 -0500, Tom Lane wrote: The stuff you are allowed to ALTER is pretty much irrelevant to the index's life as an index. Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY ... USING someindex ; is done? Also I think indoption might be written to as well. Ugh, you're right. Somebody wasn't paying attention when those ALTER commands were added. We could probably alleviate the consequences of this by having those operations reset indcheckxmin if the tuple's old xmin is below the GlobalXmin horizon. That's something for later though --- it's not a data corruption issue, it just means that the index might unexpectedly not be used for queries for a little bit after an ALTER. It strikes me that the whole idea of reusing xmin when indcheckxmin is set is overly clever and storing the xid itself would have been better... Too late though. Well, the reason for that is documented in README.HOT: if the XID were in an ordinary column there'd be no nice way to get it frozen when it gets too old. As-is, VACUUM takes care of that problem automatically. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PQconninfo function for libpq
On 11/27/12 4:20 PM, Magnus Hagander wrote: On Tue, Nov 27, 2012 at 10:13 PM, Peter Eisentraut pete...@gmx.net wrote: On 11/22/12 6:44 AM, Magnus Hagander wrote: I'm thinking it might be a better idea to just have PG_CONNINFO_NORMAL and PG_CONNINFO_PASSWORD (which still make sense to separate), What is the use case for separating them? Getting a connection string that doesn't contain sensitive information like a password. In order to show it in a dialog box, for example. There is already the PQconninfoOption.dispchar field for this purpose. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On 11/25/12 6:36 PM, Robert Haas wrote: I think that is true. But for whatever it's worth, and at the risk of beating a horse that seems not to be dead yet in spite of the fact that I feel I've already administered one hell of a beating, the LPAD case is unambiguous, and therefore it is hard to see what sort of programming mistake we are protecting users against. Upstream applications passing wrong data down to the database. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PQconninfo function for libpq
Peter Eisentraut wrote: On 11/27/12 4:20 PM, Magnus Hagander wrote: On Tue, Nov 27, 2012 at 10:13 PM, Peter Eisentraut pete...@gmx.net wrote: On 11/22/12 6:44 AM, Magnus Hagander wrote: I'm thinking it might be a better idea to just have PG_CONNINFO_NORMAL and PG_CONNINFO_PASSWORD (which still make sense to separate), What is the use case for separating them? Getting a connection string that doesn't contain sensitive information like a password. In order to show it in a dialog box, for example. There is already the PQconninfoOption.dispchar field for this purpose. I had the impression that that field would go away with this patch, but then again it might not be worth the compatibility break. I find the dispchar thingy somewhat unsightly. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On 11/25/12 7:21 PM, Robert Haas wrote: Sure, in theory that is true, but no other RDBMS that I know about feels a need to error out in that situation. I'm skeptical of the contention that we're smarter than everyone else. Well, I think in most programming languages that have typed function prototypes, define foo(string) call foo(55) is an error. I could be convinced otherwise, but that's my current impression. So the principle of rejecting this is not new. If, on the other hand, we want to align more with other RDBMS that apparently allow this, we should look closer into by what rules they do this. If they use assignment casts (that is, the same rules that apply when running INSERT, for example), we could look into using that as well, but then we should do that all the time, and not only as a fallback of some sort. Because that's (a) arbitrary, and (b) causes failures when overloaded functions are added, which shouldn't happen (the existing cases where adding overloaded functions cause an existing function to fail are surely warts that should be removed, not new ones designed in). I wonder what implicit casts would be good for if assignment casts applied for function and operator calls. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
Amit Kapila wrote: Friday, November 23, 2012 5:38 AM Muhammad Usama wrote: - I think when finding the max LSN of single file the utility should consider all relation segments. Would you like it to find all relation-related segments: like 12345, 12345.1 ... 12345.n, or 12345, 12345.1 ... and 12345_vm, 12345.1_vm? But how about if user asks for 12345.4, do we need to find all greater than 12345.4? IMHO, as this utility is not aware of relation or any other logical entity and deals with only file or directory, it is okay to find only for that file. Hmm. I think I'd expect that if I give pg_computemaxlsn a number, it should consider that it is a relfilenode, and so it should get a list of all segments for all forks of it. So if I give 12345 it should get 12345, 12345.1 and so on, and also 12345_vm, 12345_vm.1 and so on. However, if what I give it is a path, i.e. it contains a slash, I think it should only consider the specific file mentioned. In that light, I'm not sure that command line options chosen are the best set. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On 11/27/12 12:07 PM, Merlin Moncure wrote: Speaking of polymorphism, why not just implement lpad()'s first argument as 'anyelement'? One of the arguments made here was that lpad(not-text) *should* fail. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-27 16:31:15 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-11-27 14:41:34 -0500, Tom Lane wrote: The stuff you are allowed to ALTER is pretty much irrelevant to the index's life as an index. Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY ... USING someindex ; is done? Also I think indoption might be written to as well. Ugh, you're right. Somebody wasn't paying attention when those ALTER commands were added. We could probably alleviate the consequences of this by having those operations reset indcheckxmin if the tuple's old xmin is below the GlobalXmin horizon. That's something for later though --- it's not a data corruption issue, it just means that the index might unexpectedly not be used for queries for a little bit after an ALTER. mark_index_clustered() does the same btw; it's not a problem in the CLUSTER ... USING ...; case because that creates a new pg_index entry anyway, but in the ALTER TABLE one that's not the case. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 4:09 PM, Peter Eisentraut pete...@gmx.net wrote: On 11/27/12 12:07 PM, Merlin Moncure wrote: Speaking of polymorphism, why not just implement lpad()'s first argument as 'anyelement'? One of the arguments made here was that lpad(not-text) *should* fail. Well, sure. My point is: why do we need to break the casting machinery when we can simply tweak a few of the standard functions on portability grounds? Robert's case on lpad() has merit in the sense it has unambiguous meaning regardless of input type; polymorphic input types were designed to solve *exactly* that problem. SQL portability is a secondary but also important argument. That said, md5() surely needs some type of cast or interpretation of non-text types. ditto to_timestamp(), etc. So messing around with the casting rules is surely the wrong answer. I think if you relaxed the function sigs of a few functions on this page (http://www.postgresql.org/docs/9.2/interactive/functions-string.html), most reported problems would go away. One thing that worries me is introducing ambiguous cases where previously there weren't any though. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, 2012-11-27 at 15:41 -0500, Robert Haas wrote: I can't quite see how a non-overloaded flag would work, unless we get rid of schemas. It may work to pick the first schema in the search path that has any functions by that name, and then choose the overloaded (or not) candidate from among those functions in that one schema. Then, non-overloaded function names would be unique within a schema. If there are multiple functions of the same name in multiple schemas in the search path, it does not make sense to me to lump them all together and choose an overloaded candidate from all of them (although I think that's what we do now). That sounds like a mistake, to me. Do you know of any useful examples of doing that? But I think there are a variety of other kinds of labeling that I think would work. Worth exploring. I'm still not sure that's as good as a general solution, because if nothing else it relies on us to make the right decision as to which type to use in each case, and considering that neither Tom nor I are particularly sold on what we did with CONCAT(), nor am I sure that we even agree with each other on what the right thing to do would have been there, I'm a bit skeptical about our ability to get these decisions right. But it might still be an improvement. I'm not entirely clear on the benefits of a general solution, nor why your solution is more general. You are still categorizing functions into overloaded and non-overloaded, but you are doing so at runtime based on the current contents of the catalog. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
Merlin Moncure mmonc...@gmail.com writes: ... I think if you relaxed the function sigs of a few functions on this page (http://www.postgresql.org/docs/9.2/interactive/functions-string.html), most reported problems would go away. That's an interesting way of approaching it. Do we have any data on exactly which functions people do complain about? One thing that worries me is introducing ambiguous cases where previously there weren't any though. Right, but at least we'd be confining the ambiguity to a small number of function names. Tweaking the casting rules could have a lot of unforeseen consequences. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On Tue, Nov 27, 2012 at 4:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: ... I think if you relaxed the function sigs of a few functions on this page (http://www.postgresql.org/docs/9.2/interactive/functions-string.html), most reported problems would go away. That's an interesting way of approaching it. Do we have any data on exactly which functions people do complain about? After a few minutes of google-fu, lpad seems to top the list (if you don't count operator related issues which I think are mostly coding bugs). see: http://drupal.org/node/1338188. also upper: http://sourceforge.net/tracker/?func=detail&aid=3447417&group_id=171772&atid=859223. also substring: http://archives.postgresql.org/pgsql-bugs/2008-01/msg1.php Note in two of the above cases the bugs were raised through 3rd party issue trackers :/. Interestingly, if you look at popular postgresql-only functions, such as regexp_replace, google seems to indicate there's not much problem there. This, IMNSHO, reinforces Robert's point -- but it seems to mostly bite people porting code, running cross database code bases, or having a strong background in other systems. I found a few non-string cases, especially round(). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On 11/27/12 2:41 PM, Tom Lane wrote: Tweaking the casting rules could have a lot of unforeseen consequences. Understatement of the year, IMHO. $0.02 worth etc. -- john r pierce  N 37, W 122  santa cruz ca  mid-left coast -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PQconninfo function for libpq
Alvaro Herrera alvhe...@2ndquadrant.com writes: Peter Eisentraut wrote: There is already the PQconninfoOption.dispchar field for this purpose. I had the impression that that field would go away with this patch, but then again it might not be worth the compatibility break. I find the dispchar thingy somewhat unsightly. It is that, without a doubt, but that API has been that way longer than any of us have been working on the code. I'm not excited about changing it just to have an allegedly-cleaner API. And we cannot have the field simply go away, because it's been exposed to applications for lo these many years, and surely some of them are using it. So in practice we'd be maintaining both the old API and the new one. I think we should leave it as-is until there are more reasons to change it than seem to be provided in this thread. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] the number of pending entries in GIN index with FASTUPDATE=on
1. This patch applies current git head cleanly, but installation ends up with failure because of the lack of pgstattuple--1.0--1.1.sql which was added in the Makefile. Added pgstattuple--1.0--1.1.sql. Good. Installation completed and ALTER EXTENSION UPDATE works with that. 2. I feel somewhat uneasy with size for palloc's (it's too long), and BuildTupleFromCString used instead of heap_form_tuple. But it would lead to additional modification for existing similar code. OK. I changed the code as you suggested. Thank you. It looks simpler than the last one. (although the way differs from pgstatindex.) 3. pgstatginindex shows only version, pending_pages, and pending_tuples. Why don't you show the other properties such as entry pages, data pages, entries, and total pages as pgstatindex does? I didn't expose those because they are only accurate as of the last VACUUM. But if you think they are useful, I don't object to exposing them. Ok, my point was the apparent consistency of the functions. I don't have any distinct wish about this. I'll mark this as Ready for Committer. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Further pg_upgrade analysis for many tables
On Mon, Nov 26, 2012 at 05:26:42PM -0500, Bruce Momjian wrote: I have developed the attached proof-of-concept patch to test this idea. Unfortunately, I got poor results:

                                            ---- pg_upgrade ----
            dump  restore   dmp|res        git      dmp/res
      1     0.12     0.07      0.13      11.16        13.03
   1000     3.80     2.83      5.46      18.78        20.27
   2000     5.39     5.65     13.99      26.78        28.54
   4000    16.08    12.40     28.34      41.90        44.03
   8000    32.77    25.70     57.97      78.61        80.09
  16000    57.67    63.42    134.43     158.49       165.78
  32000   131.84   176.27    302.85     380.11       389.48
  64000   270.37   708.30   1004.39    1085.39      1094.70

The last two columns show the patch didn't help at all, and the third column shows it is just executing the pg_dump, then the restore, not in parallel, i.e. column 1 + column 2 ~= column 3. ... I will now test using PRIMARY KEY and custom dump format with pg_restore --jobs to see if I can get parallelism that way. I have some new interesting results (in seconds, test script attached):

          ---- -Fc ----   ------ dump | pg_restore/psql ------    -- pg_upgrade --
            dump  restore     -Fc  -Fc|-1  -Fc|-j     -Fp  -Fp|-1       git    patch
      1     0.14     0.08    0.14    0.16    0.19    0.13    0.15     11.04    13.07
   1000     3.08     3.65    6.53    6.60    5.39    6.37    6.54     21.05    22.18
   2000     6.06     6.52   12.15   11.78   10.52   12.89   12.11     31.93    31.65
   4000    11.07    14.68   25.12   24.47   22.07   26.77   26.77     56.03    47.03
   8000    20.85    32.03   53.68   45.23   45.10   59.20   51.33    104.99    85.19
  16000    40.28    88.36  127.63   96.65  106.33  136.68  106.64    221.82   157.36
  32000    93.78   274.99  368.54  211.30  294.76  376.36  229.80    544.73   321.19
  64000   197.79  1109.22 1336.83  577.83 1117.55 1327.98  567.84   1766.12   763.02

I tested custom format with pg_restore -j and -1, as well as text restore. The winner was pg_dump -Fc | pg_restore -1; even -j could not beat it. (FYI, Andrew Dunstan told me that indexes can be restored in parallel with -j.) That is actually helpful because we can use process parallelism to restore multiple databases at the same time without having to use processes for -j parallelism. Attached is my pg_upgrade patch for this. I am going to polish it up for 9.3 application. 
A further parallelism would be to allow multiple databases to be dumped/restored at the same time. I will test for that once this is done. I will work on this next. -- Bruce Momjian br...@momjian.us  http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

: . traprm
export QUIET=$((QUIET + 1))
> /rtmp/out
export PGOPTIONS="-c synchronous_commit=off"

for CYCLES in 1 1000 2000 4000 8000 16000 32000 64000
do
	echo "$CYCLES" >> /rtmp/out
	for DIR in /pgsql/CURRENT
	do
		echo "$DIR" >> /rtmp/out
		cd $DIR
		pginstall
		cd -
		# need for +16k
		pipe sed 's/#max_locks_per_transaction = 64/max_locks_per_transaction = 64000/' /u/pg/data/postgresql.conf
		pipe sed 's/shared_buffers = 128MB/shared_buffers = 1GB/' /u/pg/data/postgresql.conf
		pipe sed 's/#work_mem = 1MB/work_mem = 500MB/' /u/pg/data/postgresql.conf
		pipe sed 's/#maintenance_work_mem = 16MB/maintenance_work_mem = 500MB/' /u/pg/data/postgresql.conf
		pgrestart
		sleep 2

		echo "table creation" >> /rtmp/out
		newdb test
		for JOT in $(jot $CYCLES); do echo "CREATE TABLE test$JOT (x SERIAL PRIMARY KEY);"; done | sql test

		echo "pg_dump creation" >> /rtmp/out
		/usr/bin/time --output=/rtmp/out --append --format '%e' aspg pg_dump --schema-only --format=custom test > $TMP/1

		echo "pg_dump restore" >> /rtmp/out
		newdb test
		/usr/bin/time --output=/rtmp/out --append --format '%e' aspg pg_restore --exit-on-error --dbname=test $TMP/1

		echo "dump -Fc|restore" >> /rtmp/out
		newdb test2
		/usr/bin/time --output=/rtmp/out --append --format '%e' sh -c "aspg pg_dump --schema-only --format=custom test | aspg pg_restore --exit-on-error --dbname=test2" $TMP/1

		echo "dump -Fc|restore, --single-transaction" >> /rtmp/out
		newdb test2
		/usr/bin/time --output=/rtmp/out --append --format '%e' sh -c "aspg pg_dump --schema-only --format=custom test | aspg pg_restore --exit-on-error --single-transaction --dbname=test2" $TMP/1

		echo "combined dump -Fc/restore -j" >> /rtmp/out
		newdb
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund and...@2ndquadrant.com writes: On 2012-11-27 16:31:15 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY ... USING someindex ; is done? Also I think indoption might be written to as well. Ugh, you're right. Somebody wasn't paying attention when those ALTER commands were added. On closer look, indoption is never updated --- perhaps you were thinking about pg_class.reloptions. indisprimary, indimmediate, and indisclustered are all subject to post-creation updates though. We could probably alleviate the consequences of this by having those operations reset indcheckxmin if the tuple's old xmin is below the GlobalXmin horizon. That's something for later though --- it's not a data corruption issue, it just means that the index might unexpectedly not be used for queries for a little bit after an ALTER. mark_index_clustered() does the same btw; it's not a problem in the CLUSTER ... USING ...; case because that creates a new pg_index entry anyway, but in the ALTER TABLE one that's not the case. After further study I think the situation is that (1) The indisvalid/indisready/indislive updates in CREATE/DROP INDEX CONCURRENTLY can, and must, be done in-place since we don't have exclusive lock on the parent table. (2) All the other updates can be transactional because we hold sufficient locks to ensure that nothing bad will happen. The proposed reductions in ALTER TABLE lock strength would break this in several cases, but that's a problem for another day. Attached is a very preliminary draft patch for this. I've not addressed the question of whether we can clear indcheckxmin during transactional updates of pg_index rows, but I think it covers everything else talked about in this thread. 
			regards, tom lane

diff --git a/contrib/tcn/tcn.c b/contrib/tcn/tcn.c
index 6a8a96f60339118f7edb11668c5db23fbf85c211..6172137512f5ee72d39262a394c5bd1bef183e14 100644
*** a/contrib/tcn/tcn.c
--- b/contrib/tcn/tcn.c
*************** triggered_change_notification(PG_FUNCTIO
*** 141,148 ****
  	if (!HeapTupleIsValid(indexTuple))	/* should not happen */
  		elog(ERROR, "cache lookup failed for index %u", indexoid);
  	index = (Form_pg_index) GETSTRUCT(indexTuple);
  
! 	/* we're only interested if it is the primary key */
! 	if (index->indisprimary)
  	{
  		int			numatts = index->indnatts;
--- 141,148 ----
  	if (!HeapTupleIsValid(indexTuple))	/* should not happen */
  		elog(ERROR, "cache lookup failed for index %u", indexoid);
  	index = (Form_pg_index) GETSTRUCT(indexTuple);
  
! 	/* we're only interested if it is the primary key and valid */
! 	if (index->indisprimary && index->indisvalid)
  	{
  		int			numatts = index->indnatts;
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f99919093ca0da8c59ee4f4df0643837dfbdb38b..5f270404bf1279ad2ba745ca417ab00b0b5dbb08 100644
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***************
*** 3480,3486 ****
        index is possibly incomplete: it must still be modified by
        <command>INSERT</command>/<command>UPDATE</command> operations, but it
        cannot safely be used for queries. If it is unique, the uniqueness
        property is not
!       true either.
       </entry>
      </row>
--- 3480,3486 ----
        index is possibly incomplete: it must still be modified by
        <command>INSERT</command>/<command>UPDATE</command> operations, but it
        cannot safely be used for queries. If it is unique, the uniqueness
        property is not
!       guaranteed true either.
       </entry>
      </row>
***************
*** 3508,3513 ****
--- 3508,3523 ----
      </row>
  
      <row>
+      <entry><structfield>indislive</structfield></entry>
+      <entry><type>bool</type></entry>
+      <entry></entry>
+      <entry>
+       If false, the index is in process of being dropped, and should be
+       ignored for all purposes (including HOT-safety decisions)
+      </entry>
+     </row>
+ 
+     <row>
       <entry><structfield>indkey</structfield></entry>
       <entry><type>int2vector</type></entry>
       <entry><literal><link linkend="catalog-pg-attribute"><structname>pg_attribute</structname></link>.attnum</literal></entry>
diff --git a/src/backend/access/heap/README.HOT b/src/backend/access/heap/README.HOT
index f12cad44e56c363e62fa617497bbedfe1ba8c1fe..4cf3c3a0d4c2db96a57e73e46fdd7463db439f79 100644
*** a/src/backend/access/heap/README.HOT
--- b/src/backend/access/heap/README.HOT
*************** from the index, as well as ensuring that
*** 386,391 ****
--- 386,419 ----
  rows in a broken HOT chain (the first condition is stronger than the
  second).  Finally, we can mark the index valid for searches.
  
+ Note that we do not need to set pg_index.indcheckxmin in this code path,
+ because we have outwaited any transactions that would need to avoid using
+ the index.  (indcheckxmin is only needed because non-concurrent CREATE
+ INDEX doesn't want to wait; its stronger lock would create too
Re: [HACKERS] Further pg_upgrade analysis for many tables
On Tue, Nov 27, 2012 at 8:13 PM, Bruce Momjian br...@momjian.us wrote: I have some new interesting results (in seconds, test script attached):

           ---- -Fc ----   ------- dump | pg_restore/psql -------   - pg_upgrade -
           dump   restore  -Fc      -Fc|-1   -Fc|-j   -Fp     -Fp|-1    git      patch

     1      0.14     0.08     0.14     0.16     0.19     0.13    0.15    11.04    13.07
  1000      3.08     3.65     6.53     6.60     5.39     6.37    6.54    21.05    22.18
  2000      6.06     6.52    12.15    11.78    10.52    12.89   12.11    31.93    31.65
  4000     11.07    14.68    25.12    24.47    22.07    26.77   26.77    56.03    47.03
  8000     20.85    32.03    53.68    45.23    45.10    59.20   51.33   104.99    85.19
 16000     40.28    88.36   127.63    96.65   106.33   136.68  106.64   221.82   157.36
 32000     93.78   274.99   368.54   211.30   294.76   376.36  229.80   544.73   321.19
 64000    197.79  1109.22  1336.83   577.83  1117.55  1327.98  567.84  1766.12   763.02

I tested custom format with pg_restore -j and -1, as well as text restore. The winner was pg_dump -Fc | pg_restore -1; I don't have the numbers at hand, but if my relcache patch is accepted, then -1 stops being faster. -1 gets rid of the AtEOXact relcache N^2 behavior, but at the cost of invoking a different N^2, that one in the stats system. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
On Wednesday, November 28, 2012 3:36 AM Alvaro Herrera wrote: Amit Kapila escribió: Friday, November 23, 2012 5:38 AM Muhammad Usama wrote: - I think when finding the max LSN of a single file the utility should consider all relation segments. Would you like to find for all relation-related segments: like 12345, 12345.1 ... 12345.n, or 12345, 12345.1 ... and 12345_vm, 12345.1_vm? But how about if the user asks for 12345.4, do we need to find all greater than 12345.4? IMHO, as this utility is not aware of relations or any other logical entity and deals only with files or directories, it is okay to find it only for that file. Hmm. I think I'd expect that if I give pg_computemaxlsn a number, it should consider that it is a relfilenode, and so it should get a list of all segments for all forks of it. So if I give 12345 it should get 12345, 12345.1 and so on, and also 12345_vm, 12345_vm.1 and so on. Yes, I think this can be done either by having it as a new option, or internally the code can interpret it as you suggest. Also, can't we think of it as an extended use case and implement it on top of the base, if this use case is not urgent? However, if what I give it is a path, i.e. it contains a slash, I think it should only consider the specific file mentioned. In that light, I'm not sure that the command line options chosen are the best set. Yes, for sure the command line options can be improved/extended based on the use case. Please feel free to give suggestions regarding the command line options if you feel the current ones are not extensible. One way is that -p should be for a file path, and maybe -r for a relfilenode number. With Regards, Amit Kapila.
Re: [HACKERS] MySQL search query is not executing in Postgres DB
2012/11/28 Merlin Moncure mmonc...@gmail.com: On Tue, Nov 27, 2012 at 4:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: ... I think if you relaxed the function sigs of a few functions on this page (http://www.postgresql.org/docs/9.2/interactive/functions-string.html), most reported problems would go away. That's an interesting way of approaching it. Do we have any data on exactly which functions people do complain about? After a few minutes of google-fu, lpad seems to top the list (if you don't count operator-related issues, which I think are mostly coding bugs). See: http://drupal.org/node/1338188. Also upper: http://sourceforge.net/tracker/?func=detail&aid=3447417&group_id=171772&atid=859223. Also substring: http://archives.postgresql.org/pgsql-bugs/2008-01/msg1.php Some of these usages are buggy, so I am happy that they don't work now. http://archives.postgresql.org/pgsql-bugs/2008-01/msg1.php And these issues can be simply solved by overloading. Pavel Note in two of the above cases the bugs were raised through 3rd party issue trackers :/. Interestingly, if you look at popular postgresql-only functions, such as regexp_replace, google seems to indicate there's not much problem there. This, IMNSHO, reinforces Robert's point -- but it seems to mostly bite people porting code, running cross-database code bases, or having a strong background in other systems. I found a few non-string cases, especially round(). merlin
Re: [HACKERS] PQconninfo function for libpq
On Nov 28, 2012 1:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: Peter Eisentraut wrote: There is already the PQconninfoOption.dispchar field for this purpose. I had the impression that that field would go away with this patch, but then again it might not be worth the compatibility break. I find the dispchar thingy somewhat unsightly. It is that, without a doubt, but that API has been that way longer than any of us have been working on the code. I'm not excited about changing it just to have an allegedly-cleaner API. And we cannot have the field simply go away, because it's been exposed to applications for lo these many years, and surely some of them are using it. So in practice we'd be maintaining both the old API and the new one. I think we should leave it as-is until there are more reasons to change it than seem to be provided in this thread. I think removing that would be a really bad idea. I'm not sure anybody is actually relying on it, but it would also change the size of the struct and thus break things for anybody using those functions. If people prefer we remove the password classifier for the new function since it at least partially duplicates that field we can certainly do that, but I think leaving it in allows those who write new code to use a slightly neater api, at pretty much no cost in maintenance for us. /Magnus
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
On Tue, Nov 27, 2012 at 5:52 PM, Amit kapila amit.kap...@huawei.com wrote: Friday, November 23, 2012 5:38 AM Muhammad Usama wrote: Observations and Comments --- - If no option is given to the pg_computemaxlsn utility then we get a wrong error message: ./pg_computemaxlsn ../data pg_computemaxlsn: file or directory (null) exists In my opinion we should consider -P (print max LSN from whole data directory) the default in case no option is specified, or otherwise throw a more descriptive error. Fixed, by treating the whole data directory as the default. - Options -p and -P should be mutually exclusive. Fixed; both options will not be allowed at the same time. - Long options missing for -p and -P. Fixed; introduced long options --path and --data-directory. - Help string for -P should be something like "print max LSN from whole data directory" instead of "print max LSN from whole database". Fixed, by changing the message as suggested. - Help menu is silent about the -V or --version option. Fixed. - I think when finding the max LSN of a single file the utility should consider all relation segments. IMHO, as this utility is not aware of relations or any other logical entity and deals only with files or directories, it is okay to find it only for that file. - There is no check for extra arguments, e.g. ./pg_computemaxlsn -P . ../data/ ../data/base/ ../data2/ Fixed. Extra options give an error, and multiple -p options also give an error, e.g.: ./pg_computemaxlsn -p base/12286/12200 -p base/12286/12201 data - For the -p {file | dir} option the utility expects the file path relative to the specified data directory path, which makes things a little confusing; for example ./pg_computemaxlsn -p data/base/12286/12200 data pg_computemaxlsn: file or directory data/base/12286/12200 does not exists Although data/base/12286/12200 is a valid file path, we get a "file does not exists" error. As the path of the file is relative to the data directory, that's why the error prints the path as data/base/12286/12200. 
Do you have any suggestion what should be done for this? I think we should expect the provided path to be relative to the current directory, or perhaps consider it relative to either the data directory or the CWD, because normally we expect a path to be relative to the CWD if a program asks for a path on the command line. Also, tab-completion doesn't work in the terminal if the path is not absolute or relative to the current directory. To handle this, I think we should not change the current directory to the data directory, and should generate the file paths (like for postmaster.pid) where required by appending the data directory path. And before throwing an error, we should check whether the path is valid relative to either the data directory or the CWD, something like:

    if (lstat(LsnSearchPath, &fst) < 0)
    {
        if (errno == ENOENT)
        {
            /* Check if the path is relative to the provided data directory */
            snprintf(path, MAXPGPATH, "%s/%s", DataDir, LsnSearchPath);
            if (lstat(path, &fst) < 0)
            {
                if (errno == ENOENT)
                {
                    fprintf(stderr, _("%s: file or directory \"%s\" does not exists.\n"),
                            progname, LsnSearchPath);
                }
                else
                {
                    fprintf(stderr, _("%s: could not stat file or directory \"%s\": %s\n"),
                            progname, path, strerror(errno));
                }
                exit(1);
            }
            else
            {
                /* Path is relative to data directory */
                LsnSearchPath = strdup(path);
            }
        }
        else
        {
            fprintf(stderr, _("%s: could not stat file or directory \"%s\": %s\n"),
                    progname, LsnSearchPath, strerror(errno));
            exit(1);
        }
    }

- Utility should print the max LSN number in hexadecimal notation and LSN format (logid/segno) for consistency with PG. Fixed. It may also print the file name which contains that LSN: it might be confusing to users in cases where the utility has to follow a link (pg_tblspc) and then prints some absolute path which is not related to the relative path given by the user. Also, I am not sure it will be of any use to the user. 
- When finding the max LSN from a directory, the utility does not consider subdirectories, which I think could be useful in some cases, for example if we want to find the max LSN in a tablespace directory; with the current implementation we would need to run the utility multiple times, once for each database. It will consider subdirectories as well. Code Review - Code generally looks good; I have a few small points. - In the main() function the variable maxLSN is defined as uint64. Although XLogRecPtr and uint64 are the same thing, I think we should use XLogRecPtr instead. Changed as per suggestion. - Just a small thing that will hardly improve anything, but for the sake of saving a few cycles, I think you should use XLByteLT ( < ) instead of XLByteLE ( <= ) to find whether the previous max LSN is less than the current one. Changed as per suggestion. - '\n' is missing from one of the printfs in the code :-) fprintf(stderr, _("%s: file
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
On Wed, Nov 28, 2012 at 3:06 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Amit Kapila escribió: Friday, November 23, 2012 5:38 AM Muhammad Usama wrote: - I think when finding the max LSN of a single file the utility should consider all relation segments. Would you like to find for all relation-related segments: like 12345, 12345.1 ... 12345.n, or 12345, 12345.1 ... and 12345_vm, 12345.1_vm? But how about if the user asks for 12345.4, do we need to find all greater than 12345.4? IMHO, as this utility is not aware of relations or any other logical entity and deals only with files or directories, it is okay to find it only for that file. Hmm. I think I'd expect that if I give pg_computemaxlsn a number, it should consider that it is a relfilenode, and so it should get a list of all segments for all forks of it. So if I give 12345 it should get 12345, 12345.1 and so on, and also 12345_vm, 12345_vm.1 and so on. However, if what I give it is a path, i.e. it contains a slash, I think it should only consider the specific file mentioned. In that light, I'm not sure that the command line options chosen are the best set. +1 -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] pg_basebackup is taking backup of extra files inside a tablespace directory
pg_basebackup is taking backup of extra files other than database-related files inside a TABLESPACE directory. Scenario: 1) Create a tablespace in an existing directory '/opt/tblspc' that has some extra files and folders: create tablespace tbs1 location '/opt/tblspc'; 2) Now execute the pg_basebackup command; we can see it will copy the extra files in the '/opt/tblspc' directory. I think the backup should include only the files and folders present inside the '/opt/tblspc/PG_*' directory (TABLESPACE_VERSION_DIRECTORY), not all the files and folders inside the '/opt/tblspc' directory. Is it ok to fix it in the following way? In function perform_base_backup, while sending the tablespaces one by one, we can send the header for Linkpath/TABLESPACE_VERSION_DIRECTORY as a separate header and sendDir with Linkpath/TABLESPACE_VERSION_DIRECTORY as the path. Regards, Hari Babu.
Re: [HACKERS] pg_basebackup is taking backup of extra files inside a tablespace directory
On Wed, Nov 28, 2012 at 3:55 PM, Hari Babu haribabu.ko...@huawei.com wrote: pg_basebackup is taking backup of extra files other than database-related files inside a TABLESPACE directory. And why do you have files not related to your PG server inside a folder dedicated to a PG server? -- Michael Paquier http://michael.otacoo.com