Re: [HACKERS] PANIC: could not write to log file
On 23/11/12 17:24:48 t...@sss.pgh.pa.us wrote:
> Cyril VELTER cyril.vel...@metadys.com writes:
>> I follow up on my previous message. Just got two more crashes today, very similar to the first ones:
>> PANIC: could not write to log file 118, segment 237 at offset 2686976, length 5578752: Invalid argument
>> STATEMENT: COMMIT
>> PANIC: could not write to log file 118, segment 227 at offset 9764864, length 57344: Invalid argument
>> STATEMENT: COMMIT
> Hm, those write lengths seem rather large. What have you got wal_buffers set to? Does it help if you back it off to whatever you were using in the 8.2 installation? (The default back in 8.2 days seems to have been 64kB, if that helps.)

I changed wal_buffers to 64kB on Friday evening. It worked flawlessly over the weekend (but with very little load). I got another PANIC this morning with the same kind of message, but just before it there is a FATAL:

FATAL: could not create signal handler thread
PANIC: could not write to log file 121, segment 168 at offset 9592832, length 8192: Invalid argument

Unfortunately there is no error code in the FATAL message.

Best regards,

cyril

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
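For readers following along: "Invalid argument" is the strerror() text for EINVAL, returned here by the kernel from write() on the WAL file. As an illustration only (this is not PostgreSQL's actual XLogWrite code, and the helper name write_all is made up), the kind of full-write loop involved looks roughly like this:

```c
#include <errno.h>
#include <unistd.h>

/* Write 'len' bytes from 'buf' to 'fd', retrying on interrupts and
 * partial writes.  Returns 0 on success, or -1 with errno set (e.g.
 * EINVAL, as in the PANIC messages above) on failure. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0)
    {
        ssize_t n = write(fd, buf, len);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted: just retry */
            return -1;          /* real error: caller inspects errno */
        }
        buf += n;
        len -= (size_t) n;
    }
    return 0;
}
```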
Re: [HACKERS] WIP patch: add (PRE|POST)PROCESSOR options to COPY
I wrote:
> I'll reimplement the feature using the PROGRAM keyword:
> COPY table FROM PROGRAM 'command line';
> COPY table TO PROGRAM 'command line';

I've reimplemented the feature. Please find attached an updated version of the patch, which also fixes bugs in the previous version.

Thanks,

Best regards,
Etsuro Fujita

copy-popen-20121126.patch
Description: Binary data
Re: Plugging fd leaks (was Re: [HACKERS] Switching timeline over streaming replication)
On Friday, November 23, 2012 7:03 PM Heikki Linnakangas wrote:
> On 15.11.2012 17:16, Heikki Linnakangas wrote:
>> On 15.11.2012 16:55, Tom Lane wrote:
>>> Heikki Linnakangas hlinnakan...@vmware.com writes:
>>>> This is a fairly general issue, actually. Looking around, I can see at least two similar cases in existing code, with BasicOpenFile, where we will leak file descriptors on error:
>>> Um, don't we automatically clean those up during transaction abort?
>> Not the ones allocated with PathNameOpenFile or BasicOpenFile. Files allocated with AllocateFile() and OpenTemporaryFile() are cleaned up at abort.
>>> If we don't, we ought to think about that, not about cluttering calling code with certain-to-be-inadequate cleanup in error cases.
>> Agreed. Cleaning up at end-of-xact won't help walsender or other non-backend processes, though, because they don't do transactions. But a top-level ResourceOwner that's reset in the sigsetjmp() cleanup routine would work.
> This is what I came up with. It adds a new function, OpenFile, that returns a raw file descriptor like BasicOpenFile, but the file descriptor is associated with the current subtransaction and automatically closed at end-of-xact, in the same way that AllocateFile and AllocateDir work. In other words, OpenFile is to open() what AllocateFile is to fopen(). BasicOpenFile is unchanged: it returns a raw fd and it's solely the caller's responsibility to close it, but many of the places that used to call BasicOpenFile now use the safer OpenFile function instead.
> This patch plugs three existing fd (or virtual fd) leaks:
> 1. copy_file() - fixed by using OpenFile instead of BasicOpenFile
> 2. XLogFileInit() - fixed by adding close() calls to the error cases. Can't use OpenFile here because the fd is supposed to persist over transaction boundaries.
> 3. lo_import/lo_export - fixed by using OpenFile instead of PathNameOpenFile.

I have gone through the patch and find it okay, except for one minor suggestion:

1. Can we put the below log in OpenFile as well?
+ DO_DB(elog(LOG, "CloseFile: Allocated %d", numAllocatedDescs));

> One thing I'm not too fond of is the naming. I'm calling the new functions OpenFile and CloseFile. There's some danger of confusion there, as the function to close a virtual file opened with PathNameOpenFile is called FileClose. OpenFile is really the same kind of operation as AllocateFile and AllocateDir, but returns an unbuffered fd. So it would be nice if it was called AllocateSomething, too. But AllocateFile is already taken. And I don't much like the Allocate* naming for these anyway; you really would expect the name to contain "open".

OpenFileInTrans
OpenTransactionAwareFile

In any case, OpenFile is also okay.

I have one use case in a future feature (SQL command to edit postgresql.conf) very similar to OpenFile/CloseFile, but I want that when CloseFile is called from abort, it should remove (unlink) the file as well, and during open it has to retry a few times if the open does not succeed. I have the following options:
1. Extend OpenFile/CloseFile or PathNameOpenFile
2. Write new functions similar to OpenFile/CloseFile, something like OpenConfLockFile/CloseConfLockFile
3. Use OpenFile/CloseFile and handle my specific case with PG_TRY .. PG_CATCH

Any suggestions?

With Regards,
Amit Kapila.
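For readers unfamiliar with the fd.c machinery under discussion, the general pattern — registering raw fds with a transaction-like scope so the error-recovery path can close them all — can be sketched in miniature like this (the names xact_open, xact_close and xact_abort_cleanup are hypothetical, not the actual PostgreSQL API):

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAX_TRACKED_FDS 64

/* fds registered with the current "transaction"; closed on abort */
static int tracked_fds[MAX_TRACKED_FDS];
static int num_tracked = 0;

/* Open a file and remember the fd so abort cleanup can close it. */
int xact_open(const char *path, int flags, mode_t mode)
{
    int fd = open(path, flags, mode);

    if (fd >= 0 && num_tracked < MAX_TRACKED_FDS)
        tracked_fds[num_tracked++] = fd;
    return fd;
}

/* Close one tracked fd explicitly (the normal, non-error path). */
void xact_close(int fd)
{
    for (int i = 0; i < num_tracked; i++)
        if (tracked_fds[i] == fd)
        {
            tracked_fds[i] = tracked_fds[--num_tracked];
            break;
        }
    close(fd);
}

/* Called from the sigsetjmp() error-recovery path: close everything
 * still registered.  Returns the number of fds closed. */
int xact_abort_cleanup(void)
{
    int closed = num_tracked;

    while (num_tracked > 0)
        close(tracked_fds[--num_tracked]);
    return closed;
}
```

The real OpenFile ties the fd to the current subtransaction rather than a flat global array, but the leak-proofing idea is the same.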
Re: [HACKERS] WIP: index support for regexp search
On 25.11.2012 22:55, Alexander Korotkov wrote:
> On Tue, Nov 20, 2012 at 1:43 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote:
>> Glad to see this patch hasn't been totally forgotten. Being able to use indexes for regular expressions would be really cool! Back in January, I asked for some high-level description of how the algorithm works (http://archives.postgresql.org/message-id/4F187D5C.30701@enterprisedb.com). That's still sorely needed. Googling around, I found the slides for your presentation on this from PGConf.EU - it would be great to have the information from that presentation included in the patch.
> New version of patch is attached. The changes are the following:
> 1) A big comment with a high-level description of what is going on.
> 2) Regression tests.
> 3) Documentation update.
> 4) Some more refactoring.

Great, that top-level comment helped tremendously! I feel enlightened. I fixed some spelling, formatting etc. trivial stuff while reading through the patch, see attached. Below is some feedback on the details:

* I don't like the PG_TRY/CATCH trick. It's not generally safe to catch an error without propagating it further or rolling back the whole (sub)transaction. It might work in this case, as you're only suppressing errors with the special sqlcode that are used in the same file, but it nevertheless feels naughty. I believe none of the limits that are being checked are strict; it's OK to exceed the limits somewhat, as long as you terminate the processing in a reasonable time in case of pathological input. I'd suggest putting an explicit check for the limits somewhere, and not relying on ereport(). Something like this, in the code that recurses:

if (trgmCNFA->arcsCount > MAX_RESULT_ARCS ||
    hash_get_num_entries(trgmCNFA->states) > MAX_RESULT_STATES)
{
    trgmCNFA->overflowed = true;
    return;
}

And then check for the overflowed flag at the top level.
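Heikki's suggested overflow-flag pattern, reduced to a self-contained toy (the struct, limit, and fan-out here are illustrative, not the patch's actual trigram code): the recursion bails out quietly once a soft limit is exceeded, and only the top level decides what to do about it.

```c
#include <stdbool.h>

#define MAX_RESULT_STATES 1000

/* Toy expansion state: the recursion sets 'overflowed' instead of
 * throwing an error, as suggested in the review above. */
typedef struct
{
    int  state_count;
    bool overflowed;
} ExpandState;

static void add_state(ExpandState *es, int depth)
{
    /* Soft limit: stop recursing, but don't error out. */
    if (es->state_count >= MAX_RESULT_STATES)
    {
        es->overflowed = true;
        return;
    }
    es->state_count++;
    /* Pretend each state fans out into two successor states. */
    if (depth > 0)
    {
        add_state(es, depth - 1);
        add_state(es, depth - 1);
    }
}

/* Top level: run the expansion and report whether it stayed within
 * the limit (if not, the caller would fall back to a full scan). */
bool expand_fits(int depth)
{
    ExpandState es = {0, false};

    add_state(&es, depth);
    return !es.overflowed;
}
```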
* This part of the high-level comment was not clear to me:

> * States of the graph produced in the first stage are marked with keys. Key is a pair
> * of a prefix and a state of the original automaton. Prefix is a last
> * characters. So, knowing the prefix is enough to know what is a trigram when we read some new
> * character. However, we can know single character of prefix or don't know any
> * characters of it. Each state of resulting graph have an enter key (with that
> * key we've entered this state) and a set of keys which are reachable without
> * reading any predictable trigram. The algorithm of processing each state
> * of resulting graph are so:
> * 1) Add all keys which achievable without reading of any predictable trigram.
> * 2) Add outgoing arcs labeled with trigrams.
> * Step 2 leads to creation of new states and recursively processing them. So,
> * we use depth-first algorithm.

I didn't understand that. Can you elaborate? It might help to work through an example, with some ascii art depicting the graph.

* It would be nice to add some comments to the TrgmCNFA struct, explaining which fields are valid at which stages. For example, it seems that the 'trgms' array is calculated only after building the CNFA, by the getTrgmVector() function, while arcsCount is updated on the fly, while recursing in the getState() function.

* What is the representation used for the path matrix? Needs a comment.

* What do the getColorinfo() and scanColorMap() functions do? What exactly does a color represent? What's the tradeoff in choosing MAX_COLOR_CHARS?
- Heikki

diff --git a/contrib/pg_trgm/Makefile b/contrib/pg_trgm/Makefile
index 64fd69f..8033733 100644
--- a/contrib/pg_trgm/Makefile
+++ b/contrib/pg_trgm/Makefile
@@ -1,7 +1,7 @@
 # contrib/pg_trgm/Makefile
 
 MODULE_big = pg_trgm
-OBJS = trgm_op.o trgm_gist.o trgm_gin.o
+OBJS = trgm_op.o trgm_gist.o trgm_gin.o trgm_regexp.o
 
 EXTENSION = pg_trgm
 DATA = pg_trgm--1.0.sql pg_trgm--unpackaged--1.0.sql
diff --git a/contrib/pg_trgm/expected/pg_trgm.out b/contrib/pg_trgm/expected/pg_trgm.out
index 81d0ca8..ee0131f 100644
--- a/contrib/pg_trgm/expected/pg_trgm.out
+++ b/contrib/pg_trgm/expected/pg_trgm.out
@@ -54,7 +54,7 @@ select similarity('wow',' WOW ');
 (1 row)
 
 CREATE TABLE test_trgm(t text);
-\copy test_trgm from 'data/trgm.data
+\copy test_trgm from 'data/trgm.data'
 select t,similarity(t,'qwertyu0988') as sml from test_trgm where t % 'qwertyu0988' order by sml desc, t;
        t        |   sml
 ----------------+----------
@@ -3515,6 +3515,47 @@ select * from test2 where t ilike 'qua%';
  quark
 (1 row)
 
+select * from test2 where t ~ '[abc]{3}';
+   t
+--------
+ abcdef
+(1 row)
+
+select * from test2 where t ~ 'a[bc]+d';
+   t
+--------
+ abcdef
+(1 row)
+
+select * from test2 where t ~* 'DEF';
+   t
+--------
+ abcdef
+(1 row)
+
+select * from test2 where t ~ 'dEf';
+ t
+---
+(0 rows)
+
+select * from test2 where t ~* '^q';
+   t
+-------
+ quark
+(1 row)
+
+select * from test2 where
Re: [HACKERS] Materialized views WIP patch
On Sun, Nov 25, 2012 at 7:30 PM, Marko Tiikkaja pgm...@joh.to wrote:
> As others have pointed out, replacing the contents of a table is something which people have been wanting to do for a long time, and I think having this ability would make this patch a lot better; now it just feels like syntactic sugar.

I agree that it's mostly syntactic sugar, but I think we need to have realistic expectations for what is possible in an initial patch. When I committed the first patch for foreign data wrappers, it didn't work at all: it was just syntax support. Tom later committed a follow-on patch that made them work. Similarly, I split the event trigger patch into two halves, one of which added the syntax support and the other of which made them functional; and even with both commits in, I think it's fair to say that event triggers are still in a fairly primitive state. None of those patches were small patches.

It's going to take multiple years to get materialized views up to a state where they're really useful to a broad audience in production applications, but I don't think we should sneer at anyone for writing a patch that is just syntactic sugar. As it turns out, adding a whole new object type is a lot of work and generates a big patch even if it doesn't do much just yet. Rejecting such patches on the grounds that they aren't comprehensive enough is, IMHO, extremely unwise; we'll either end up landing even larger patches that are almost impossible to review comprehensively and therefore more likely to break something, or else we'll kill the projects outright and end up with nothing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Materialized views WIP patch
On 11/26/12 2:07 PM, Robert Haas wrote:
> On Sun, Nov 25, 2012 at 7:30 PM, Marko Tiikkaja pgm...@joh.to wrote:
>> As others have pointed out, replacing the contents of a table is something which people have been wanting to do for a long time, and I think having this ability would make this patch a lot better; now it just feels like syntactic sugar.
> I agree that it's mostly syntactic sugar, but I think we need to have realistic expectations for what is possible in an initial patch. [...] Rejecting such patches on the grounds that they aren't comprehensive enough is, IMHO, extremely unwise; we'll either end up landing even larger patches that are almost impossible to review comprehensively and therefore more likely to break something, or else we'll kill the projects outright and end up with nothing.

First of all, I have to apologize. Re-reading the email I sent out last night, it does indeed feel a bit harsh and I can understand your reaction. At no point did I mean to belittle Kevin's efforts or the patch itself.

I was mostly looking for Kevin's input on how hard it would be to solve the particular problem and whether it would be possible to do so for 9.3. While I feel the problem I pointed out is a small caveat and should at least be documented for 9.3, I think this patch has merits of its own even if that problem never gets fixed, and I will continue to review this patch.

Regards,
Marko Tiikkaja
Re: [HACKERS] Materialized views WIP patch
On 26 November 2012 13:07, Robert Haas robertmh...@gmail.com wrote:
> None of those patches were small patches. It's going to take multiple years to get materialized views up to a state where they're really useful to a broad audience in production applications, but I don't think we should sneer at anyone for writing a patch that is just syntactic sugar.

+1. I have a sweet tooth. I don't like it when people criticise patches on the basis of "obviously you could achieve the same effect with $CONVOLUTION". Making things simpler is a desirable outcome. Now, that isn't to say that we should disregard everything or even anything else in pursuit of simplicity; just that needing "a Ph.D in Postgresology", as you once put it, to do something routine to many is really hard to defend.

-- 
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Re: Plugging fd leaks (was Re: [HACKERS] Switching timeline over streaming replication)
On 26.11.2012 14:53, Amit Kapila wrote:
> I have one use case in a future feature (SQL command to edit postgresql.conf) very similar to OpenFile/CloseFile, but I want that when CloseFile is called from abort, it should remove (unlink) the file as well, and during open it has to retry a few times if the open does not succeed. I have the following options:
> 1. Extend OpenFile/CloseFile or PathNameOpenFile
> 2. Write new functions similar to OpenFile/CloseFile, something like OpenConfLockFile/CloseConfLockFile
> 3. Use OpenFile/CloseFile and handle my specific case with PG_TRY .. PG_CATCH
> Any suggestions?

Hmm, if it's just for locking purposes, how about using an lwlock or a heavy-weight lock instead?

- Heikki
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:
> Hi Amit
> I have reviewed and tested the patch. Following are my observations and comments.

Thank you for the review. I need some clarification regarding some of the comments.

> Observations and Comments
> ---
> - I think when finding the max LSN of a single file the utility should consider all relation segments.

Would you like it to find all relation-related segments, like 12345, 12345.1 ... 12345.n, or 12345, 12345.1 ... and 12345_vm, 12345.1_vm? And how about if the user asks for 12345.4 — do we need to look at everything greater than 12345.4? IMHO, as this utility is not aware of relations or any other logical entity and deals only with a file or directory, it is okay to search only that file.

> - For the -p {file | dir} option the utility expects the file path relative to the specified data directory path, which makes things a little confusing. For example:
> ./pg_computemaxlsn -p data/base/12286/12200 data
> pg_computemaxlsn: file or directory "data/base/12286/12200" does not exists
> Although data/base/12286/12200 is a valid file path, we get a "file does not exist" error.

Would changing the path to "data/data/base/12286/12200" in the error message make things clearer?

> Code Review
> -
> - While finding the max LSN from a single file, you are not verifying that the file actually exists inside the data directory path provided, and can end up looking at the wrong data directory for checking if the server is running. For example:
> bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 $PWD/otherinstallation/data
> Maximum LSN found is: 0, WAL segment file name (fileid,seg):
> - The code is not verifying that the provided data directory is a valid data directory. This could make pg_computemaxlsn compute the max LSN from a running server's file. For example:
> bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 NONDATADIR/
> Maximum LSN found is: 0, WAL segment file name (fileid,seg):

I think both of the above can be addressed if we allow only a relative path for the file or directory, and check that it is relative and below the data directory with the function path_is_relative_and_below_cwd().

> Questions
> -
> - I am wondering why you are not following the link inside the pg_tblspc directory to get to the tablespace location, instead of building the directory path. I am thinking that if you follow the link instead, the utility can be used across different versions of PG.

This is a good suggestion. I shall take care of this in the updated patch.

With Regards,
Amit Kapila.

> Regards,
> Muhammad Usama
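As background for readers, the utility computes the max LSN by reading each data page's header, whose first field (pd_lsn) records the LSN of the last WAL record that touched the page, stored as two 32-bit halves with the high half first. A simplified, stand-alone sketch of that scan (illustrative only: it ignores page-header validation, torn pages, and multiple segments, all of which the real tool must handle):

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192

/* Scan one relation segment file block by block, tracking the highest
 * page LSN seen.  pd_lsn sits at the very start of each page header as
 * two uint32 halves (high word first).  Returns 0 for an empty or
 * unreadable file. */
uint64_t scan_file_max_lsn(const char *path)
{
    unsigned char page[BLCKSZ];
    uint64_t max_lsn = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return 0;
    while (fread(page, 1, BLCKSZ, fp) == BLCKSZ)
    {
        uint32_t hi, lo;
        uint64_t lsn;

        memcpy(&hi, page, sizeof(hi));
        memcpy(&lo, page + sizeof(hi), sizeof(lo));
        lsn = ((uint64_t) hi << 32) | lo;
        if (lsn > max_lsn)
            max_lsn = lsn;
    }
    fclose(fp);
    return max_lsn;
}
```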
Re: Plugging fd leaks (was Re: [HACKERS] Switching timeline over streaming replication)
On Monday, November 26, 2012 7:01 PM Heikki Linnakangas wrote:
> On 26.11.2012 14:53, Amit Kapila wrote:
>> I have one use case in a future feature (SQL command to edit postgresql.conf) very similar to OpenFile/CloseFile, but I want that when CloseFile is called from abort, it should remove (unlink) the file as well [...]
> Hmm, if it's just for locking purposes, how about using an lwlock or a heavy-weight lock instead?

It's not only for locking. The main idea is that we create a temp file and write the modified configuration into that temp file. At the end, if the command succeeds, we rename the temp file to the .conf file; but if it errors out, then at abort we need to delete the temp file. So in short, the main point is to close/rename the file in case of success (at end of command) and remove it in case of abort.

With Regards,
Amit Kapila.
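The close/rename-on-success, unlink-on-abort sequence described here is the classic atomic-replace pattern. A stand-alone sketch (replace_conf_file is a made-up name; the real feature would route the cleanup through fd.c and the abort machinery rather than handling errors inline):

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write 'contents' to a temp file next to 'conf_path', then rename it
 * over the real file.  On any failure the temp file is unlinked -- the
 * equivalent of the "abort" path in the discussion above.  Returns 0
 * on success, -1 on error. */
int replace_conf_file(const char *conf_path, const char *contents)
{
    char tmp_path[1024];
    FILE *fp;

    snprintf(tmp_path, sizeof(tmp_path), "%s.tmp", conf_path);
    fp = fopen(tmp_path, "w");
    if (fp == NULL)
        return -1;
    if (fputs(contents, fp) < 0 || fclose(fp) != 0)
    {
        /* abort path: remove the partially written temp file */
        unlink(tmp_path);
        return -1;
    }
    /* success path: atomically replace the old file */
    return rename(tmp_path, conf_path);
}
```

A production version would also fsync() the temp file and the containing directory before and after the rename so the replacement survives a crash.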
Re: [HACKERS] Materialized views WIP patch
On 11/14/12 9:28 PM, Kevin Grittner wrote:
> 17. Since the data viewed in an MV is not up-to-date with the latest committed transaction,

So, the way I understand it, in Oracle terms this feature is a snapshot, not a materialized view. Maybe that's what it should be called, then.
Re: Plugging fd leaks (was Re: [HACKERS] Switching timeline over streaming replication)
Amit Kapila amit.kap...@huawei.com writes:
> On Monday, November 26, 2012 7:01 PM Heikki Linnakangas wrote:
>> Hmm, if it's just for locking purposes, how about using an lwlock or a heavy-weight lock instead?
> It's not only for locking. The main idea is that we create a temp file and write the modified configuration into that temp file. At the end, if the command succeeds, we rename the temp file to the .conf file; but if it errors out, then at abort we need to delete the temp file. So in short, the main point is to close/rename the file in case of success (at end of command) and remove it in case of abort.

I'd go with the TRY/CATCH solution. It would be worth extending the fd.c infrastructure if there were multiple users of the feature, but there are not, nor do I see likely new candidates on the horizon.

regards, tom lane
Re: [HACKERS] Materialized views WIP patch
On Mon, Nov 26, 2012 at 8:14 AM, Marko Tiikkaja pgm...@joh.to wrote:
> First of all, I have to apologize. Re-reading the email I sent out last night, it does indeed feel a bit harsh and I can understand your reaction. At no point did I mean to belittle Kevin's efforts or the patch itself.
> [...]

OK, no worries. I didn't really interpret your email as belittling; I just want to make sure this feature doesn't get feature-creeped to death. I think everyone, including Kevin, understands that the real-world applicability of v1 is going to be limited and many people will choose alternative techniques rather than relying on this new feature. But I also think that we'll never get to a really awesome, kick-ass feature unless we're willing to commit an initial version that isn't all that awesome or kick-ass.

If I understand Kevin's goals correctly, the plan is to get this basic version committed for 9.3, and then to try to expand the capability during the 9.4 release cycle (and maybe 9.5, too; there's a lot of work to do here). I think that's a pretty sound plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Materialized views WIP patch
On 11/26/2012 09:46 AM, Peter Eisentraut wrote:
> On 11/14/12 9:28 PM, Kevin Grittner wrote:
>> 17. Since the data viewed in an MV is not up-to-date with the latest committed transaction,
> So, the way I understand it, in Oracle terms this feature is a snapshot, not a materialized view. Maybe that's what it should be called then.

If you use Jonathan Gardner's taxonomy at http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views, snapshots are a subclass of materialized views.

cheers

andrew
Re: [HACKERS] Materialized views WIP patch
On Mon, Nov 26, 2012 at 09:46:33AM -0500, Peter Eisentraut wrote:
> On 11/14/12 9:28 PM, Kevin Grittner wrote:
>> 17. Since the data viewed in an MV is not up-to-date with the latest committed transaction,
> So, the way I understand it, in Oracle terms this feature is a snapshot, not a materialized view. Maybe that's what it should be called then.

Snapshot is one of the options for refreshing an Oracle materialized view. There are others, which we'll eventually add if past is any prologue :)

I hate to add to the bike-shedding, but we should probably add REFRESH SNAPSHOT as an optional piece of the grammar, with more REFRESH options to come.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] [WIP] pg_ping utility
On 11/23/12 9:48 AM, Michael Paquier wrote:
> We waited a couple of days for feedback for this feature. So based on all the comments provided by everybody on this thread, perhaps we should move on and implement the options that would make pg_ping a better wrapper for PQping. Comments?

Personally, I still don't see the general utility of this. For monitoring, psql -c 'select 1' is much more useful. For network analysis, you can use network analysis tools. The niche for pg_ping in between those is so narrow that I cannot see it.
Re: [HACKERS] [WIP] pg_ping utility
On Tue, Nov 27, 2012 at 12:26 AM, Peter Eisentraut pete...@gmx.net wrote:
> On 11/23/12 9:48 AM, Michael Paquier wrote:
>> We waited a couple of days for feedback for this feature. [...]
> Personally, I still don't see the general utility of this. For monitoring, psql -c 'select 1' is much more useful. For network analysis, you can use network analysis tools. The niche for pg_ping in between those is so narrow that I cannot see it.

As a wrapper for PQping, you can get the different server statuses specific to libpq, which are PQPING_OK, PQPING_REJECT and PQPING_NO_RESPONSE, and perhaps more in the future if PQping is extended in one way or another. So the purpose of this feature is to allow users to put their hands on a core feature that gives them a libpq-specific server status, and to check the accessibility of the server with something lighter than a psql client connection.

Any additional comments, Phil?
-- 
Michael Paquier
http://michael.otacoo.com
[HACKERS] Upcoming back-branch releases
We're about due for a new set of back-branch update releases. After some discussion among the packagers, it seems the best window for getting this done before the holiday season sets in is next week. Also, as previously mentioned, we're experimenting with weekday rather than over-the-weekend release windows, since this is much more convenient for Dave Page and his crew at EDB.

The current plan is to wrap tarballs on Monday Dec 3 for public announcement Thursday Dec 6.

regards, tom lane
Re: [HACKERS] WIP json generation enhancements
On Wed, Nov 21, 2012 at 3:16 PM, Andrew Dunstan and...@dunslane.net wrote:
> Non-builtin types are now searched for a cast to json, and if it exists it is used instead of the type's text representation. I didn't add a special type to look for a cast to, as was discussed before, as it seemed a bit funky and unnecessary. It can easily be added, but I'm still not convinced it's a good idea. Note that this is only done for types that aren't builtin - we know how to turn all of those into json without needing to look for a cast.

The place where I fear this will cause problems is with non-core text-like datatypes, such as citext. For example, I wonder if creating a cast from citext to json - which seems like a sensible thing to want to do for other reasons - changes the semantics of this function when applied to citext objects.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] change in LOCK behavior
Simon Riggs si...@2ndquadrant.com writes:
> On 11 October 2012 20:43, Tom Lane t...@sss.pgh.pa.us wrote:
>> Robert Haas robertmh...@gmail.com writes:
>>> So we have to take the snapshot before you begin execution, but it seems that to avoid surprising behavior we also have to take it after acquiring locks. And it looks like locking is intertwined with a bunch of other parse analysis tasks that might require a snapshot to have been taken first. Whee.
>> Yeah. I think that a good solution to this would involve guaranteeing that the execution snapshot is not taken until we have all locks that are going to be taken on the tables. Which is likely to involve a fair amount of refactoring, though I admit I've not looked at details. In any case, it's a mistake to think about this in isolation. If we're going to do something about redefining SnapshotNow to avoid its race conditions, that's going to move the goalposts quite a lot. Anyway, my feeling about it is that I don't want 9.2 to have an intermediate behavior between the historical one and whatever we end up designing to satisfy these concerns. That's why I'm pressing for reversion and not a band-aid fix in 9.2. I certainly hope we can do better going forward, but this is not looking like whatever we come up with would be sane to back-patch.
> Agreed, please revert.

We have to do something about this one way or another before we can ship 9.2.2. Is the consensus to revert this patch:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d573e239f03506920938bf0be56c868d9c3416da
and if so, who's going to do the deed?

regards, tom lane
[HACKERS] Duplicated oids between tables - problem or not?
I noticed after a pg_upgrade on a system that the same OID is used both for a database and a user (repeated many times for different combinations of databases and users). This is because pg_upgrade doesn't preserve the database OID, and it reused the OID of the row in pg_authid.

The reason I noticed this was that it confuses the hell out of pgAdmin. This is clearly a bug in a SQL query pgAdmin uses, and I'm going to go fix that. But is this something that can cause us problems somewhere else in the system? ISTM the same thing could happen after OID wraparound, but pg_upgrade makes it a lot more likely to happen. So I'm thinking it shouldn't be a problem since the OIDs are in different tables, but are there any other parts of the system where this could cause an actual problem?

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] change in LOCK behavior
On Mon, Nov 26, 2012 at 10:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 11 October 2012 20:43, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: So we have to take the snapshot before you begin execution, but it seems that to avoid surprising behavior we also have to take it after acquiring locks. And it looks like locking is intertwined with a bunch of other parse analysis tasks that might require a snapshot to have been taken first. Whee. Yeah. I think that a good solution to this would involve guaranteeing that the execution snapshot is not taken until we have all locks that are going to be taken on the tables. Which is likely to involve a fair amount of refactoring, though I admit I've not looked at details. In any case, it's a mistake to think about this in isolation. If we're going to do something about redefining SnapshotNow to avoid its race conditions, that's going to move the goalposts quite a lot. Anyway, my feeling about it is that I don't want 9.2 to have an intermediate behavior between the historical one and whatever we end up designing to satisfy these concerns. That's why I'm pressing for reversion and not a band-aid fix in 9.2. I certainly hope we can do better going forward, but this is not looking like whatever we come up with would be sane to back-patch. Agreed, please revert. We have to do something about this one way or another before we can ship 9.2.2. Is the consensus to revert this patch: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d573e239f03506920938bf0be56c868d9c3416da and if so, who's going to do the deed? I was assuming you were going to do it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
On 26 November 2012 15:24, David Fetter da...@fetter.org wrote: I hate to add to the bike-shedding, but we should probably add REFRESH SNAPSHOT as an optional piece of the grammar, with more REFRESH options to come. I don't know that they should be called materialised views, but do we really need to overload the word snapshot? I'd just as soon invent a new word as use the Oracle one, since I don't think the term snapshot is widely recognised as referring to anything other than snapshot isolation. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Duplicated oids between tables - problem or not?
Magnus Hagander mag...@hagander.net writes: I noticed after a pg_upgrade on a system, that the same oid is used both for a database and a user (repeated many times for different combinations of databases and users). This is because pg_upgrade doesn't preserve the database oid, and it reused the oid of the row in pg_authid. The reason I noticed this was that it confuses the hell out of pgadmin. This is clearly a bug in a SQL query pgadmin uses, and I'm going to go fix that. But is this something that can cause us problems somewhere else in the system? ISTM the same thing could happen after oid wraparound, but that pg_upgrade makes it a lot more likely to happen. So I'm thinking it shouldn't be a problem since the oid's are in different tables, but are there any other parts of the system where this could cause an actual problem? It should not. It's been understood for many years that uniqueness of OIDs is only guaranteed within individual catalogs (by means of their unique indexes on OID). That's why mechanisms like pg_depend and pg_description use catalog OID + object OID to identify objects. We do insist that hand-assigned OIDs be globally unique, but that's just for maintainers' sanity not because the code depends on it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
On Mon, Nov 26, 2012 at 04:02:17PM +, Peter Geoghegan wrote: On 26 November 2012 15:24, David Fetter da...@fetter.org wrote: I hate to add to the bike-shedding, but we should probably add REFRESH SNAPSHOT as an optional piece of the grammar, with more REFRESH options to come. I don't know that they should be called materialised views, but do we really need to overload the word snapshot? I'd just as soon invent a new word as use the Oracle one, since I don't think the term snapshot is widely recognised as referring to anything other than snapshot isolation. I believe that the meaning here is unambiguous, and is used in other descriptions than Oracle's, including the one on our wiki. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Upcoming back-branch releases
On Mon, Nov 26, 2012 at 9:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: We're about due for a new set of back-branch update releases. After some discussion among the packagers, it seems the best window for getting this done before the holiday season sets in is next week. Also, as previously mentioned, we're experimenting with weekday rather than over-the-weekend release windows, since this is much more convenient for Dave Page and his crew at EDB. The current plan is to wrap tarballs on Monday Dec 3 for public announcement Thursday Dec 6. There is one open bug reported by Amit Kapila that I feel is important. I'd investigated this and Simon had volunteered to fix the same. http://archives.postgresql.org/pgsql-hackers/2012-10/msg01584.php I'm not sure about the project policies about fixing known bugs before minor releases, but AFAICS this one can lead to corrupt indexes and as a consequence return wrong query results (as shown in the case by Amit). So IMHO we should fix this. Thanks, Pavan
Re: [HACKERS] Duplicated oids between tables - problem or not?
On Mon, Nov 26, 2012 at 5:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: I noticed after a pg_upgrade on a system, that the same oid is used both for a database and a user (repeated many times for different combinations of databases and users). This is because pg_upgrade doesn't preserve the database oid, and it reused the oid of the row in pg_authid. The reason I noticed this was that it confuses the hell out of pgadmin. This is clearly a bug in a SQL query pgadmin uses, and I'm going to go fix that. But is this something that can cause us problems somewhere else in the system? ISTM the same thing could happen after oid wraparound, but that pg_upgrade makes it a lot more likely to happen. So I'm thinking it shouldn't be a problem since the oid's are in different tables, but are there any other parts of the system where this could cause an actual problem? It should not. It's been understood for many years that uniqueness of OIDs is only guaranteed within individual catalogs (by means of their unique indexes on OID). That's why mechanisms like pg_depend and pg_description use catalog OID + object OID to identify objects. That's what I figured. The problem with pgadmin is it joins to pg_shdescription without restricting it on the catalog oid. We do insist that hand-assigned OIDs be globally unique, but that's just for maintainers' sanity not because the code depends on it. Gotcha. Thanks! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
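The pgadmin fix Magnus describes can be illustrated with a hedged sketch: a join against a shared catalog such as pg_shdescription must qualify both objoid and classoid, or a database and a role that happen to share an OID will pick up each other's comments. The column names below come from the system catalogs; the query shape is mine, not pgadmin's actual query.

```sql
-- Wrong: matches a comment on ANY shared object whose OID collides
-- (a role comment can show up as a database comment after pg_upgrade)
SELECT d.datname, sd.description
FROM pg_database d
LEFT JOIN pg_shdescription sd ON sd.objoid = d.oid;

-- Right: also pin the catalog, the way pg_depend/pg_shdescription
-- identify objects internally (catalog OID + object OID)
SELECT d.datname, sd.description
FROM pg_database d
LEFT JOIN pg_shdescription sd
       ON sd.objoid = d.oid
      AND sd.classoid = 'pg_database'::regclass;
```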
Re: [HACKERS] change in LOCK behavior
Robert Haas robertmh...@gmail.com writes: On Mon, Nov 26, 2012 at 10:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: We have to do something about this one way or another before we can ship 9.2.2. Is the consensus to revert this patch: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d573e239f03506920938bf0be56c868d9c3416da and if so, who's going to do the deed? I was assuming you were going to do it. OK, will take care of it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Further pg_upgrade analysis for many tables
On Fri, Nov 23, 2012 at 5:34 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Thu, Nov 15, 2012 at 7:05 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Wed, Nov 14, 2012 at 11:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Janes jeff.ja...@gmail.com writes: On Thu, Nov 8, 2012 at 9:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: There are at least three ways we could whack that mole: ... * Keep a separate list (or data structure of your choice) so that relcache entries created in the current xact could be found directly rather than having to scan the whole relcache. That'd add complexity though, and could perhaps be a net loss for cases where the relcache isn't so bloated. Maybe a static list that can overflow, like the ResourceOwner/Lock table one recently added. The overhead of that should be very low. Are the three places where need_eoxact_work = true; the only places where things need to be added to the new structure? Yeah. The problem is not so much the number of places that do that, as that places that flush entries from the relcache would need to know to remove them from the separate list, else you'd have dangling pointers. If the list is of hash-tags rather than pointers, all we would have to do is ignore entries that are not still in the hash table, right? I've attached a proof-of-concept patch to implement this. I got rid of need_eoxact_work entirely and replaced it with a short list that fulfills the functions of indicating that work is needed, and suggesting which rels might need that work. There is no attempt to prevent duplicates, nor to remove invalidated entries from the list. Invalid entries are skipped when the hash entry is not found, and processing is idempotent so duplicates are not a problem. Formally speaking, if MAX_EOXACT_LIST were 0, so that the list overflowed the first time it was accessed, then it would be identical to the current behavior or having only a flag. So formally all I did was increase the max from 0 to 10. 
I wasn't so sure about the idempotent nature of subtransaction processing, so I chickened out and left that part alone. I know of no workflow for which that was a bottleneck. AtEOXact_release is oddly indented because that makes the patch smaller and easier to read. This makes the non -1 restore of large dumps very much faster (and makes them faster than -1 restores, as well). I added a create temp table foo (x integer) on commit drop; line to the default pgbench transaction and tested that. I was hoping to see a performance improvement there as well (the transaction has ~110 entries in the RelationIdCache at eoxact each time), but the performance was too variable (probably due to the intense IO it causes) to detect any changes. At least it is not noticeably slower. If I hack pgbench to bloat the RelationIdCache by touching 20,000 useless tables as part of the connection start up process, then this patch does show a win. It is not obvious what value to set the MAX list size to. Since this array is only allocated once per back-end, and since it is not groveled through to invalidate relations at each invalidation, there is no particular reason it must be small. But if the same table is assigned new filenodes (or forced index lists, whatever those are) repeatedly within a transaction, the list could become bloated with duplicate entries, potentially becoming even larger than the hash table whose scan it is intended to short-cut. In any event, 10 seems to be large enough to overcome the currently known bottleneck. Maybe 100 would be a more principled number, as that is about where the list could start to become as big as the basal size of the RelationIdCache table. I don't think this patch replaces having some mechanism for restricting how large RelationIdCache can get or how LRU entries in it can get as Robert suggested. But this approach seems like it is easier to implement and agree upon; and doesn't preclude doing other optimizations later. 
I haven't reviewed this terribly closely, but I think this is likely worth pursuing. I see you already added it to the next CommitFest, which is good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
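The mechanism Jeff describes — a small fixed-size list of remembered rels that falls back to a full relcache scan on overflow, tolerating duplicates and stale entries because the work is idempotent and a stale hint simply misses in the hash table — can be modeled outside the C code. This is a toy Python sketch of that pattern, not the actual patch; the names mirror the discussion.

```python
MAX_EOXACT_LIST = 10  # the "max from 0 to 10" Jeff mentions

class EOXactTracker:
    def __init__(self):
        self.hints = []
        self.overflowed = False

    def remember(self, relid):
        """Called at the points where need_eoxact_work used to be set."""
        if self.overflowed:
            return
        if len(self.hints) >= MAX_EOXACT_LIST:
            # Too many hints: forget them and fall back to a full scan.
            self.hints = []
            self.overflowed = True
        else:
            # No duplicate suppression needed: the work is idempotent.
            self.hints.append(relid)

    def at_eoxact(self, relcache, do_work):
        # On overflow, scan the whole cache (the pre-patch behavior);
        # otherwise visit only the remembered rels.
        targets = list(relcache) if self.overflowed else self.hints
        for relid in targets:
            entry = relcache.get(relid)
            if entry is not None:   # stale hint: rel was flushed, just skip
                do_work(entry)
        self.hints = []
        self.overflowed = False
```

With MAX_EOXACT_LIST set to 0, every access overflows and the model degenerates to the old scan-everything-with-a-flag behavior, which is Jeff's point about the change being formally just a larger maximum.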
Re: [HACKERS] Upcoming back-branch releases
On 2012-11-26 21:35:05 +0530, Pavan Deolasee wrote: On Mon, Nov 26, 2012 at 9:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: We're about due for a new set of back-branch update releases. After some discussion among the packagers, it seems the best window for getting this done before the holiday season sets in is next week. Also, as previously mentioned, we're experimenting with weekday rather than over-the-weekend release windows, since this is much more convenient for Dave Page and his crew at EDB. The current plan is to wrap tarballs on Monday Dec 3 for public announcement Thursday Dec 6. There is one open bug reported by Amit Kapila that I feel is important. I'd investigated this and Simon had volunteered to fix the same. http://archives.postgresql.org/pgsql-hackers/2012-10/msg01584.php I have submitted a proposed fix for it Friday: http://archives.postgresql.org/message-id/20121124155005.GA10299%40awork2.anarazel.de For some reason the web UI only shows one of the attachments though... They also are in: http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/bugfixes Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 64 bit ints vs libpq-fe.h
It's no longer possible to build pgadmin with libpq from git master: /opt/pgsql/inst-pg/head/include/libpq-fe.h:551:8: error: ‘pg_int64’ does not name a type and related messages about it. This seems to be related to http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=95d035e66d8e4371d35830d81f39face03cd4c45 AFAICT, this suddenly requires that any user of libpq has pg_int64 defined, which isn't likely to happen outside of postgres itself. Or am I reading things wrong? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Upcoming back-branch releases
On Mon, Nov 26, 2012 at 5:21 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-11-26 21:35:05 +0530, Pavan Deolasee wrote: On Mon, Nov 26, 2012 at 9:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: We're about due for a new set of back-branch update releases. After some discussion among the packagers, it seems the best window for getting this done before the holiday season sets in is next week. Also, as previously mentioned, we're experimenting with weekday rather than over-the-weekend release windows, since this is much more convenient for Dave Page and his crew at EDB. The current plan is to wrap tarballs on Monday Dec 3 for public announcement Thursday Dec 6. There is one open bug reported by Amit Kapila that I feel is important. I'd investigated this and Simon had volunteered to fix the same. http://archives.postgresql.org/pgsql-hackers/2012-10/msg01584.php I have submitted a proposed fix for it friday: http://archives.postgresql.org/message-id/20121124155005.GA10299%40awork2.anarazel.de For some reason the web ui only shows one of the the attachements though... Did you by any chance use git-send-email to send it? That one is known to confuse how majordomo sticks it in the mbox files that are then later used to generate the archives... (Yes, a better way to deal with that is in the works. But it would certainly help to know if that's how the email was created, so we have more datapoints) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Upcoming back-branch releases
On 2012-11-26 17:27:11 +0100, Magnus Hagander wrote: On Mon, Nov 26, 2012 at 5:21 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-11-26 21:35:05 +0530, Pavan Deolasee wrote: On Mon, Nov 26, 2012 at 9:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: We're about due for a new set of back-branch update releases. After some discussion among the packagers, it seems the best window for getting this done before the holiday season sets in is next week. Also, as previously mentioned, we're experimenting with weekday rather than over-the-weekend release windows, since this is much more convenient for Dave Page and his crew at EDB. The current plan is to wrap tarballs on Monday Dec 3 for public announcement Thursday Dec 6. There is one open bug reported by Amit Kapila that I feel is important. I'd investigated this and Simon had volunteered to fix the same. http://archives.postgresql.org/pgsql-hackers/2012-10/msg01584.php I have submitted a proposed fix for it Friday: http://archives.postgresql.org/message-id/20121124155005.GA10299%40awork2.anarazel.de For some reason the web UI only shows one of the attachments though... Did you by any chance use git-send-email to send it? That one is known to confuse how majordomo sticks it in the mbox files that are then later used to generate the archives... (Yes, a better way to deal with that is in the works. But it would certainly help to know if that's how the email was created, so we have more datapoints) No git-send-email in this case (although I used it in the past and likely will use it again). Mutt instead. The attachments were generated by git format-patch HEAD~2 and then normally attached to the email. format-patch is internally used by send-email though, so that might be a hint... 
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 64 bit ints vs libpq-fe.h
Magnus Hagander mag...@hagander.net writes: It's no longer possible to build pgadmin with libpq from git master: /opt/pgsql/inst-pg/head/include/libpq-fe.h:551:8: error: pg_int64 does not name a type [ scratches head ... ] That should be typedef'd in postgres_ext.h. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP json generation enhancements
On 11/26/2012 10:55 AM, Robert Haas wrote: On Wed, Nov 21, 2012 at 3:16 PM, Andrew Dunstan and...@dunslane.net wrote: Non-builtin types are now searched for a cast to json, and if it exists it is used instead of the type's text representation. I didn't add a special type to look for a cast to, as was discussed before, as it seemed a bit funky and unnecessary. It can easily be added, but I'm still not convinced it's a good idea. Note that this is only done for types that aren't builtin - we know how to turn all of those into json without needing to look for a cast. The place where I fear this will cause problems is with non-core text-like datatypes, such as citext. For example, I wonder if creating a cast from citext to json - which seems like a sensible thing to want to do for other reasons - changes the semantics of this function when applied to citext objects. I don't understand why you would want to create such a cast. If the cast doesn't exist it will do exactly what it does now, i.e. use the type's output function and then json quote and escape it, which in the case of citext is the Right Thing (tm): andrew=# select to_json('foo"bar'::citext); to_json ------------ "foo\"bar" cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 64 bit ints vs libpq-fe.h
On Mon, Nov 26, 2012 at 5:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: It's no longer possible to build pgadmin with libpq from git master: /opt/pgsql/inst-pg/head/include/libpq-fe.h:551:8: error: ‘pg_int64’ does not name a type [ scratches head ... ] That should be typedef'd in postgres_ext.h. Ha. A wipe+reinstall seems to have fixed it. Must've been something broken in my build dependencies. Sorry about the noise. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP json generation enhancements
On Mon, Nov 26, 2012 at 11:43 AM, Andrew Dunstan and...@dunslane.net wrote: I don't understand why you would want to create such a cast. If the cast doesn't exist it will do exactly what it does now, i.e. use the type's output function and then json quote and escape it, which in the case of citext is the Right Thing (tm): andrew=# select to_json('foobar'::citext); to_json foo\bar I'm not sure either, but setting up a system where seemingly innocuous actions can in fact have surprising and not-easily-fixable consequences in other parts of the system doesn't seem like good design to me. Of course, maybe I'm misunderstanding what will happen; I haven't actually tested it myself. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP json generation enhancements
On Thu, Nov 22, 2012 at 4:54 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Andrew Dunstan and...@dunslane.net writes: Here is a WIP patch for enhancements to json generation. First, there is the much_requested json_agg, which will aggregate rows directly to json. So the following will now work: select json_agg(my_table) from mytable; select json_agg(q) from (myquery here) q; Awesome, thanks! How do you handle the nesting of the source elements? I would expect a variant of the aggregate that takes two input parameters, the datum and the current nesting level. Consider a tree table using parent_id and a recursive query to display the tree. You typically handle the nesting with an accumulator and a call to repeat() to prepend some spaces before the value columns. What about passing that nesting level (integer) to the json_agg()? Here's a worked out example: CREATE TABLE parent_child ( parent_id integer NOT NULL, this_node_id integer NULL ); INSERT INTO parent_child (parent_id, this_node_id) VALUES (0, 1); INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 2); INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 3); INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 4); INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 5); INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 6); INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 7); INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 8); INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 9); INSERT INTO parent_child (parent_id, this_node_id) VALUES (9, 10); WITH RECURSIVE tree(id, level, parents) AS ( SELECT this_node_id as id, 0 as level, '{}'::int[] as parents FROM parent_child WHERE parent_id = 0 UNION ALL SELECT this_node_id as id, t.level + 1, t.parents || c.parent_id FROM parent_child c JOIN tree t ON t.id = c.parent_id ) SELECT json_agg(id, level) FROM tree; I've left the parents column in the query above as a debug 
facility, but it's not needed in that case. I don't think there is any way a json_agg() function could reasonably do this. The only possible dataset it could work on would be for homogeneously typed array-like data which is not very interesting for the broader case of nested json productions. I think the right way to do it is to work out the precise structure in sql and do the transformation directly on the type using the already existing transformation functions. That said, recursive structures are a pain presently because postgres composite types are not allowed to be recursive. So ISTM getting that restriction relaxed is the way to go; then you build it in sql and just fire whatever xxx_to_json happens to be the most appropriate...then your example would be a snap. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
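Merlin's suggestion — work out the nested structure yourself and only then serialize it — can be sketched outside SQL. Here is a purely illustrative Python rendering of the adjacency-list-to-nested-JSON transformation, using the parent/child rows from Dimitri's example; it is not a proposed json_agg behavior.

```python
import json

# (parent_id, this_node_id) rows from the example table
rows = [(0, 1), (1, 2), (1, 3), (1, 4), (2, 5), (2, 6),
        (4, 7), (4, 8), (4, 9), (9, 10)]

# Index children by parent for O(1) lookup while recursing
children = {}
for parent, node in rows:
    children.setdefault(parent, []).append(node)

def build(node):
    # Recursively assemble the subtree rooted at `node`
    return {"id": node,
            "children": [build(c) for c in children.get(node, [])]}

# parent_id = 0 marks the roots, as in the recursive CTE above
tree = [build(root) for root in children[0]]
print(json.dumps(tree))
```

The nesting falls out of the recursion itself, which is why a flat aggregate taking a level argument cannot reconstruct it: the structure is not a property of individual rows but of the whole row set.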
Re: [HACKERS] Suggestion for --truncate-tables to pg_restore
On Wed, Nov 21, 2012 at 12:53 AM, Josh Kupershmidt schmi...@gmail.com wrote: TBH, I didn't find the example above particularly compelling for demonstrating the need for this feature. If you've just got one table with dependent views which needs to be restored, it's pretty easy to manually TRUNCATE and have pg_restore --data-only reload the table. (And easy enough to combine the truncate and restore into a single transaction in case anything goes wrong, if need be.) But I'm willing to grant that this proposed feature is potentially as useful as existing restore-jiggering options like --disable-triggers. And I guess I could see that if you're really stuck having to perform a --data-only restore of many tables, this feature could come in handy. I think I would come down on the other side of this. We've never really been able to get --clean work properly in all scenarios, and it seems likely that a similar fate will befall this option. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] pg_ping utility
On Mon, Nov 26, 2012 at 10:26:27AM -0500, Peter Eisentraut wrote: On 11/23/12 9:48 AM, Michael Paquier wrote: We waited a couple of days for feedback for this feature. So based on all the comments provided by everybody on this thread, perhaps we should move on and implement the options that would make pg_ping a better wrapper for PQping. Comments? Personally, I still don't see the general utility of this. For monitoring, psql -c 'select 1' is much more useful. For network analysis, you can use network analysis tools. The niche for pg_ping in between those is so narrow that I cannot see it. I would normally agree with this analysis, but pg_ctl -w certainly needs this ping functionality, so it kind of makes sense that others might need it too. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Upcoming back-branch releases
Andres Freund escribió: On 2012-11-26 17:27:11 +0100, Magnus Hagander wrote: On Mon, Nov 26, 2012 at 5:21 PM, Andres Freund and...@2ndquadrant.com wrote: I have submitted a proposed fix for it friday: http://archives.postgresql.org/message-id/20121124155005.GA10299%40awork2.anarazel.de For some reason the web ui only shows one of the the attachements though... Did you by any chance use git-send-email to send it? That one is known to confuse how majordomo sticks it in the mbox files that are then later used to generate the archives... (Yes, a better way to deal with that is in the works. But it would certainly help to know if that's how the email was created, so we have more datapoints) No git-send-email in this case (although I used it in the past and likely will use it again). Mutt instead. The attachements were generated by git format-patch HEAD~2 and then normally attached to the email. format-patch is internally used by send-email though, so that might be a hint... The problem is that those files start with the infamous ^From header line which Mhonarc is so old-fashioned about, so the patches ended up as separate emails. Note the attachment in the email linked above is empty. You can see the patches here: http://archives.postgresql.org/pgsql-hackers/2012-11/msg01278.php http://archives.postgresql.org/pgsql-hackers/2012-11/msg01279.php -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
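The failure mode Álvaro describes can be demonstrated in miniature: git format-patch files begin with a pseudo-mbox "From <sha>" separator line, and classic mbox parsing treats every unescaped line beginning with "From " as a new-message boundary. A toy illustration (the addresses and SHA are taken from this thread; the parsing function is a simplification of what mbox-based archivers like MHonArc do):

```python
# A git-format-patch attachment starts with a pseudo-mbox separator line:
patch_attachment = (
    "From d573e239f03506920938bf0be56c868d9c3416da Mon Sep 17 00:00:00 2001\n"
    "Subject: [PATCH] example\n"
    "\n"
    "patch body...\n"
)

def count_mbox_messages(mbox_text):
    # Classic mbox rule: any line beginning "From " opens a new message.
    # Real mail software avoids this by rewriting body lines to ">From ".
    return sum(1 for line in mbox_text.splitlines()
               if line.startswith("From "))

# One email whose body embeds the attachment unescaped is split into
# TWO messages by an mbox-based archiver:
email = ("From and...@2ndquadrant.com Fri Nov 24 16:50:05 2012\n"
         "Subject: proposed fix\n\n") + patch_attachment
print(count_mbox_messages(email))  # 2
```

This is why the second patch shows up in the archives as a separate (and apparently empty) message rather than as an attachment.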
Re: [HACKERS] WIP json generation enhancements
On 11/21/12 3:16 PM, Andrew Dunstan wrote: One open question regarding this feature is whether this should return NULL or '[]' for 0 rows. Currently it returns NULL but I could be convinced to return '[]', and the change would be very small. Although my intuition would be [], the existing concatenation-like aggregates return null for no input rows, so this probably ought to be consistent with those. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
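Peter's consistency argument is easy to check against the existing aggregates. A sketch (runnable against any recent PostgreSQL; the generate_series trick just produces zero rows):

```sql
-- Concatenation-like aggregates return NULL, not an empty value,
-- when there are zero input rows:
SELECT array_agg(x)              AS a,
       string_agg(x::text, ',') AS s
FROM generate_series(1, 0) AS g(x);
-- both columns come back NULL, not '{}' or ''
```

By that precedent, json_agg over zero rows would return NULL rather than '[]'.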
Re: Use of fsync; was Re: [HACKERS] Pg_upgrade speed for many tables
On Sat, Nov 24, 2012 at 09:42:08PM -0800, Jeff Janes wrote: On Fri, Nov 23, 2012 at 7:22 PM, Bruce Momjian br...@momjian.us wrote: On Mon, Nov 19, 2012 at 12:11:26PM -0800, Jeff Janes wrote: Yes, it is with synchronous_commit=off. (or if it wasn't originally, it is now, with the same result) Applying your fsync patch does solve the problem for me on ext4. Having the new cluster be on ext3 rather than ext4 also solves the problem, without the need for a patch; but it would be nice to be more friendly to ext4, which is popular even though not recommended. Do you have numbers with synchronous-commit=off, fsync=off, and both, on ext4? for 5,000 tables like create table fooN (x serial), upgrading from 9.3dev to 9.3dev: Timings are in seconds, done twice. I had to hack pg_upgrade so that the pg_ctl stop command did -w -t 3600, otherwise I'd get a "database did not shut down" error for the first two. both on: 648.29, 608.42; synchronous_commit off: 250.24, 366.50; fsync off: 46.91, 43.96; both off: 41.44, 44.81. Also, I did a manual sync as soon as "Removing support functions from new cluster OK" appears, with synchronous_commit off but fsync on: 45.96, 46.46. OK, these are very convincing numbers. I am going to modify initdb to have an --fsync-only option, and have pg_upgrade use that. This is 9.3 material. In any event, I think the documentation should caution that the upgrade should not be deemed to be a success until after a system-wide sync has been done. Even if we use the link rather than copy method, are we sure that that is safe if the directories recording those links have not been fsynced? OK, the above is something I have been thinking about, and obviously you have too. If you change fsync from off to on in a cluster, and restart it, there is no guarantee that the dirty pages you read from the kernel are actually on disk, because Postgres doesn't know they are dirty. They probably will be pushed to disk by the kernel in less than one minute, but still, it doesn't seem reliable. 
Should this be documented in the fsync section? Again, another reason not to use fsync=off, though your example of the file copy is a good one. As you stated, this is a problem with the file copy/link, independent of how Postgres handles the files. We can tell people to use 'sync' as root on Unix, but what about Windows? I'm pretty sure someone mentioned the way to do that on Windows in this list in the last few months, but I can't seem to find it. I thought it was the initdb fsync thread. Yep, the code is already in initdb to fsync a directory --- we just need a way for pg_upgrade to access it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP json generation enhancements
On 11/26/2012 08:12 PM, Peter Eisentraut wrote: On 11/21/12 3:16 PM, Andrew Dunstan wrote: One open question regarding this feature is whether this should return NULL or '[]' for 0 rows. Currently it returns NULL but I could be convinced to return '[]', and the change would be very small. Although my intuition would be [], the existing concatenation-like aggregates return null for no input rows, so this probably ought to be consistent with those. In some previous mail Tom Lane claimed that by SQL standard either an array of all NULLs or a record with all fields NULLs (I don't remember which) is also considered NULL. If this is true, then an empty array - which can be said to consist of nothing but NULLs - should itself be NULL. If this is so, then the existing behaviour of returning NULL in such cases is what the standard requires. Hannu Krosing -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: index support for regexp search
On Mon, Nov 26, 2012 at 4:55 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Great, that top-level comment helped tremendously! I feel enlightened. I fixed some spelling, formatting etc. trivial stuff while reading through the patch, see attached. Below is some feedback on the details:

* I don't like the PG_TRY/CATCH trick. It's not generally safe to catch an error without propagating it further or rolling back the whole (sub)transaction. It might work in this case, as you're only suppressing errors with the special sqlcode that are used in the same file, but it nevertheless feels naughty. I believe none of the limits that are being checked are strict; it's OK to exceed the limits somewhat, as long as you terminate the processing in a reasonable time, in case of pathological input. I'd suggest putting an explicit check for the limits somewhere, and not relying on ereport(). Something like this, in the code that recurses:

    if (trgmCNFA->arcsCount > MAX_RESULT_ARCS ||
        hash_get_num_entries(trgmCNFA->states) > MAX_RESULT_STATES)
    {
        trgmCNFA->overflowed = true;
        return;
    }

And then check for the overflowed flag at the top level.

The PG_TRY/CATCH trick is replaced with some number of if/return checks. The code becomes a bit more complicated, but your notes do matter.

* This part of the high-level comment was not clear to me:

    * States of the graph produced in the first stage are marked with keys. A key is a pair
    * of a prefix and a state of the original automaton. The prefix is the last
    * characters read, so knowing the prefix is enough to know what the trigram is when we
    * read some new character. However, we may know only a single character of the prefix,
    * or none of its characters. Each state of the resulting graph has an enter key (the
    * key with which we've entered this state) and a set of keys which are reachable without
    * reading any predictable trigram. The algorithm for processing each state
    * of the resulting graph is:
    * 1) Add all keys which are reachable without reading any predictable trigram. 
    * 2) Add outgoing arcs labeled with trigrams.
    * Step 2 leads to the creation of new states and to recursively processing them. So,
    * we use a depth-first algorithm.

I didn't understand that. Can you elaborate? It might help to work through an example, with some ascii art depicting the graph.

This comment is extended with an example.

* It would be nice to add some comments to the TrgmCNFA struct, explaining which fields are valid at which stages. For example, it seems that the 'trgms' array is calculated only after building the CNFA, by the getTrgmVector() function, while arcsCount is updated on the fly, while recursing in the getState() function.

The TrgmCNFA comment is extended with this.

* What is the representation used for the path matrix? Needs a comment.

Comments are added to PackedTrgmPaths and TrgmStatePath.

* What do the getColorInfo() and scanColorMap() functions do?

The getColorInfo comment now references the ColorInfo struct, which has its own comment. The scanColorMap comment now has a description of the colormap structure.

What exactly does a color represent?

This is added to the top comment.

What's the tradeoff in choosing MAX_COLOR_CHARS?

A comment is added to the macro.

-- With best regards, Alexander Korotkov. trgm-regexp-0.6.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
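As background for the review above: the trigrams that label the arcs of the resulting graph are the same units the pg_trgm index stores for plain LIKE/similarity searches. A rough sketch of the decomposition (the padding rule — two leading blanks, one trailing — follows pg_trgm's documented behavior; this is an illustration, not the patch's code):

```python
def trigrams(word):
    """Decompose a word the way pg_trgm does: pad with two blanks in
    front and one behind, then take every 3-character window."""
    padded = "  " + word.lower() + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

# A regex such as /cat/ can only match strings whose trigram set
# includes these windows, which is what lets the index prune
# non-matching rows before the regex engine ever runs.
```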
Re: [HACKERS] WIP json generation enhancements
On 11/26/2012 02:46 PM, Hannu Krosing wrote: On 11/26/2012 08:12 PM, Peter Eisentraut wrote: On 11/21/12 3:16 PM, Andrew Dunstan wrote: One open question regarding this feature is whether this should return NULL or '[]' for 0 rows. Currently it returns NULL but I could be convinced to return '[]', and the change would be very small. Although my intuition would be [], the existing concatenation-like aggregates return null for no input rows, so this probably ought to be consistent with those. In some previous mail Tom Lane claimed that by SQL standard either an array of all NULLs or a record with all fields NULLs (I don't remember which) is also considered NULL. If this is true, then an empty array - which can be said to consist of nothing but NULLs - should itself be NULL. If this is so, then the existing behaviour of returning NULL in such cases is what the standard requires. That would be more relevant if we were talking about postgres arrays, but the '[]' here would not be a postgres array - it would be a piece of json text. But in any case, the consensus seems to be to return null, and on the principle of doing the least work required I'm happy to comply. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP json generation enhancements: fk-tree-to-json()
On 11/22/2012 11:54 AM, Dimitri Fontaine wrote: Andrew Dunstan and...@dunslane.net writes: Here is a WIP patch for enhancements to json generation. First, there is the much-requested json_agg, which will aggregate rows directly to json. So the following will now work: select json_agg(my_table) from mytable; select json_agg(q) from (myquery here) q; Awesome, thanks! How do you handle the nesting of the source elements? I would expect a variant of the aggregate that takes two input parameters, the datum and the current nesting level. Consider a tree table using parent_id and a recursive query to display the tree. You typically handle the nesting with an accumulator and a call to repeat() to prepend some spaces before the value columns. What about passing that nesting level (integer) to the json_agg()? It still would not produce nesting, just a nicer format. If you want real nesting, you may want a version of my pl/python function row-with-all-dependents-by-foreign-key-to-json() which outputs a table row and then recursively all rows from other (or the same) table which have a foreign key relationship to this row. I use it to back up removed objects. I would love to have something similar as a built-in function, though the current version has some limitations and lacks some checks, like a check for FK loops. 
Function follows:
---
CREATE OR REPLACE FUNCTION record_to_json_with_detail(table_name text, pk_value int) RETURNS text AS $$
import json, re

def fk_info(table_name):
    fkplan = plpy.prepare("""
        SELECT conrelid::regclass AS reftable,
               pg_get_constraintdef(c.oid) AS condef
          FROM pg_constraint c
         WHERE c.confrelid::regclass = $1::regclass
           AND c.contype = 'f'
        """, ("text",))
    cdefrx = re.compile('FOREIGN KEY [(](.*)[)] REFERENCES .*[(](.*)[)].*')
    fkres = plpy.execute(fkplan, (table_name,))
    for row in fkres:
        reffields, thisfields = cdefrx.match(row['condef']).groups()
        yield thisfields, row['reftable'], reffields

def select_from_table_by_col(table_name, col_name, col_value):
    qplan = plpy.prepare('select * from %s where %s = $1' % (table_name, col_name), ('int',))
    return plpy.execute(qplan, (col_value,))

def recursive_rowdict(table_name, row_dict):
    rd = dict([(a, b) for (a, b) in row_dict.items() if b is not None])  # skip NULLs
    rd['__row_class__'] = table_name
    for id_col, ref_tab, ref_col in fk_info(table_name):
        q2res = select_from_table_by_col(ref_tab, ref_col, row_dict[id_col])
        if q2res:
            try:
                rd['__refs__::' + id_col] += [recursive_rowdict(ref_tab, row) for row in q2res]
            except KeyError:
                rd['__refs__::' + id_col] = [recursive_rowdict(ref_tab, row) for row in q2res]
    return rd

q1res = select_from_table_by_col(table_name, 'id', pk_value)
return json.dumps(recursive_rowdict(table_name, q1res[0]), indent=3)
$$ LANGUAGE plpythonu;

create table test1(id serial primary key, selfkey int references test1, data text);
create table test2(id serial primary key, test1key int references test1, data text);
insert into test1 values(1,null,'top');
insert into test1 values(2,1,'lvl1');
insert into test2 values(1,1,'lvl1-2');
insert into test2 values(2,2,'lvl2-2');

select record_to_json_with_detail('test1',1);
           record_to_json_with_detail
------------------------------------------------
 {
    "__row_class__": "test1",
    "data": "top",
    "id": 1,
    "__refs__::id": [
       {
          "__row_class__": "test1",
          "selfkey": 1,
          "data": "lvl1",
          "id": 2,
          "__refs__::id": [
             {
                "__row_class__": "test2",
                "test1key": 2,
                "data": "lvl2-2",
                "id": 2
             }
          ]
       },
       {
          "__row_class__": "test2",
          "test1key": 1,
          "data": "lvl1-2",
          "id": 1
       }
    ]
 }
(1 row)

Time: 6.576 ms
---
Hannu Krosing -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP json generation enhancements
Hannu Krosing ha...@2ndquadrant.com writes: In some previous mail Tom Lane claimed that by SQL standard either an array of all NULLs or a record with all fields NULLs (I don't remember which) is also considered NULL. If this is true, then an empty array - which can be said to consist of nothing but NULLs - should itself be NULL. What I think you're referring to is that the spec says that foo IS NULL should return true if foo is a record containing only null fields. That's a fairly narrow statement. It does NOT say that NULL and (NULL,NULL,...) are indistinguishable for all purposes; only that this particular test doesn't distinguish them. Also I don't think they have the same statement for arrays. The analogy to other aggregates is probably a better thing to argue from. On the other hand, I don't know anyone outside the SQL standards committee who thinks it's actually a good idea that SUM() across no rows returns null rather than zero. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing PD_ALL_VISIBLE
On Sun, 2012-11-25 at 22:30 -0500, Tom Lane wrote: I'd be worried about the case of a lot of sessions touching a lot of unrelated tables. This change implies doubling the number of buffers that are held pinned by any given query, and the distributed overhead from that (eg, adding cycles to searches for free buffers) is what you ought to be afraid of. That's a good point. Doubling might be an exaggeration if indexes are involved, but it's still a concern. The cost of this might be difficult to measure though. Another possibly important point is that reducing the number of pin/unpin cycles for a given VM page might actually hurt the chances of it being found in shared buffers, because IIRC the usage_count is bumped once per pin/unpin. That algorithm is based on the assumption that buffer pins are not drastically different in lifespan, but I think you just broke that for pins on VM pages. If doing a bunch of simple key lookups using an index, then the root of the index page is only pinned once per query, but we expect that to stay in shared buffers. I see the VM page as about the same: one pin per query (or maybe a couple for large tables). I don't see how the lifetime of the pin matters a whole lot in this case; it's more about the rate of pins/unpins, right? I'm not particularly concerned about devising solutions for these problems, though, because I think this idea is a loser from the get-go; the increase in contention for VM pages is alone going to destroy any possible benefit. Your intuition here is better than mine, but I am still missing something here. If we keep the buffer pinned, then there will be very few pin/unpin cycles here, so I don't see where the contention would come from (any more than there is contention pinning the root of an index). Do you still think I need a shared lock here or something? If so, then index-only scans have a pretty big problem right now, too. 
I'll try to quantify some of these effects you've mentioned, and see how the numbers turn out. I'm worried that I'll need more than 4 cores to show anything though, so perhaps someone with a many-core box would be interested to test this out? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP json generation enhancements : strange IS NULL behaviour
On 11/26/2012 09:05 PM, Tom Lane wrote: Hannu Krosing ha...@2ndquadrant.com writes: In some previous mail Tom Lane claimed that by SQL standard either an array of all NULLs or a record with all fields NULLs (I don't remember which) is also considered NULL. If this is true, then an empty array - which can be said to consist of nothing but NULLs - should itself be NULL. What I think you're referring to is that the spec says that foo IS NULL should return true if foo is a record containing only null fields.

Is this requirement recursive? That is, should ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL also be true?

Currently PostgreSQL does this kind of IS NULL for simple rows

hannu=# SELECT ROW(NULL, NULL) IS NULL;
 ?column?
----------
 t
(1 row)

and also for first-level row types

hannu=# SELECT ROW(NULL, ROW(NULL, NULL)) IS NULL;
 ?column?
----------
 t
(1 row)

but then mysteriously stops working at the third level

hannu=# SELECT ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL;
 ?column?
----------
 f
(1 row)

That's a fairly narrow statement. It does NOT say that NULL and (NULL,NULL,...) are indistinguishable for all purposes; only that this particular test doesn't distinguish them. Also I don't think they have the same statement for arrays. The analogy to other aggregates is probably a better thing to argue from. On the other hand, I don't know anyone outside the SQL standards committee who thinks it's actually a good idea that SUM() across no rows returns null rather than zero.

Might be done in order to be in sync with other aggregates - for example the return-NULL-for-no-rows behaviour makes perfect sense for MIN(), AVG(), etc.

Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
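The fully recursive reading of the spec that the examples above probe can be sketched in a few lines. Note this is the recursive interpretation under discussion, not a claim about what PostgreSQL implements: under it, the three-level row would also test as null, unlike the `f` result shown.

```python
def row_is_null(row):
    """Recursive reading of 'foo IS NULL' for row values, modeling
    rows as tuples and NULL as None: a row is null-like if every
    field is NULL or is itself a null-like row."""
    if row is None:
        return True
    if isinstance(row, tuple):
        return all(row_is_null(field) for field in row)
    return False
```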
Re: [HACKERS] WIP json generation enhancements : strange IS NULL behaviour
Hannu Krosing ha...@2ndquadrant.com writes: On 11/26/2012 09:05 PM, Tom Lane wrote: The analogy to other aggregates is probably a better thing to argue from. On the other hand, I don't know anyone outside the SQL standards committee who thinks it's actually a good idea that SUM() across no rows returns null rather than zero. Might be done in order to be in sync with other aggregates - for example the return NULL for no rows behaviour makes perfect sense for MIN(), AVG(), etc. Well, if they'd made COUNT() of no rows return null, then I'd agree that they were pursuing consistency. As it stands, it's neither consistent nor very sensible. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
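The inconsistency Tom points out can be modeled directly. A sketch of the standard's behavior (hypothetical helper names; NULL modeled as None): COUNT of no rows is 0, while SUM of no rows is null.

```python
def sql_sum(values):
    """SQL-style SUM: NULL inputs are skipped, and an empty (or
    all-NULL) input yields NULL (None) rather than zero -- the
    behavior being criticized above."""
    vals = [v for v in values if v is not None]
    return sum(vals) if vals else None

def sql_count_star(values):
    """COUNT(*) counts rows, NULL or not, and returns 0 for no rows."""
    return len(values)
```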
Re: [HACKERS] Removing PD_ALL_VISIBLE
On Mon, Nov 26, 2012 at 3:29 PM, Jeff Davis pg...@j-davis.com wrote: Your intuition here is better than mine, but I am still missing something here. If we keep the buffer pinned, then there will be very few pin/unpin cycles here, so I don't see where the contention would come from (any more than there is contention pinning the root of an index). Based on previous measurements, I think there *is* contention pinning the root of an index. Currently, I believe it's largely overwhelmed by contention from other sources, such as the buffer manager lwlocks and the very-evil ProcArrayLock. However, I believe that as we fix those problems, this will start to percolate up towards the top of the heap. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
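For readers unfamiliar with the usage_count mechanics referenced in this thread: each pin bumps a buffer's usage count (capped), and the clock sweep decrements counts while hunting for a victim, so a page that is pinned even once per query tends to stay resident. A toy model follows; the constants and structure are deliberate simplifications of the real freelist code, not a transcription of it.

```python
class Buffer:
    MAX_USAGE = 5  # cap, analogous to PostgreSQL's BM_MAX_USAGE_COUNT

    def __init__(self, page):
        self.page = page
        self.usage_count = 0

    def pin(self):
        # Each pin bumps usage_count (capped), which is what makes
        # frequently-pinned pages survive the clock sweep.
        self.usage_count = min(self.usage_count + 1, self.MAX_USAGE)

def clock_sweep(buffers, hand=0):
    """Advance the clock hand, decrementing usage counts, until a
    buffer with usage_count == 0 is found; return its index (the
    eviction victim)."""
    while True:
        buf = buffers[hand]
        if buf.usage_count == 0:
            return hand
        buf.usage_count -= 1
        hand = (hand + 1) % len(buffers)
```

In this model, a VM page pinned once per query keeps a nonzero usage count and the sweep passes over it, which matches Jeff's index-root analogy.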
Re: [HACKERS] WIP json generation enhancements
On Mon, Nov 26, 2012 at 3:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: The analogy to other aggregates is probably a better thing to argue from. On the other hand, I don't know anyone outside the SQL standards committee who thinks it's actually a good idea that SUM() across no rows returns null rather than zero. Me neither. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing PD_ALL_VISIBLE
Jeff Davis pg...@j-davis.com writes: On Sun, 2012-11-25 at 22:30 -0500, Tom Lane wrote: Another possibly important point is that reducing the number of pin/unpin cycles for a given VM page might actually hurt the chances of it being found in shared buffers, because IIRC the usage_count is bumped once per pin/unpin. If doing a bunch of simple key lookups using an index, then the root of the index page is only pinned once per query, but we expect that to stay in shared buffers. I see the VM page as about the same: one pin per query (or maybe a couple for large tables). Hmmm ... that seems like a valid analogy. I may be worried about nothing as far as this point goes. Do you still think I need a shared lock here or something? If so, then index-only scans have a pretty big problem right now, too. There's still the issue of whether the IOS code is safe in machines with weak memory ordering. I think that it probably is safe, but the argument for it in the current code comment is wrong; most likely, a correct argument has to depend on read/write barriers associated with taking snapshots. I think what you ought to do is work through that, fix the existing comment, and then see whether the same argument works for what you want to do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] support for LDAP URLs
Peter Eisentraut wrote: Here is a patch to support RFC 2255 LDAP URLs in pg_hba.conf. So, instead of, say

host ... ldap ldapserver=ldap.example.net ldapbasedn="dc=example, dc=net" ldapsearchattribute=uid

you could write

host ... ldap ldapurl="ldap://ldap.example.net/dc=example,dc=net?uid?sub"

Should we be referencing RFC 4516 instead? I'm not very fond of the way this entry is worded:

+     <varlistentry>
+      <term><literal>ldapurl</literal></term>
+      <listitem>
+       <para>
+        You can write most of the LDAP options alternatively using an RFC 2255
+        LDAP URL.  The format is
+<synopsis>
+ldap://[<replaceable>user</replaceable>[:<replaceable>password</replaceable>]@]<replaceable>host</replaceable>[:<replaceable>port</replaceable>]/<replaceable>basedn</replaceable>[?[<replaceable>attribute</replaceable>][?[<replaceable>scope</replaceable>]]]
+</synopsis>
+        <replaceable>scope</replaceable> must be one
+        of <literal>base</literal>, <literal>one</literal>, <literal>sub</literal>,
+        typically the latter.  Only one attribute is used, and some other
+        components of standard LDAP URLs such as filters and extensions are
+        not supported.
+       </para>

It seems completely unlike the rest, and it doesn't read like a reference entry. How about starting with a para containing just "An RFC 4516 LDAP URL", or something like that, and then expanding on the details of the format outside the varlist?

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
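To make the synopsis concrete, here is a sketch of parsing the URL subset it describes (Python; the helper name and defaulting choices are illustrative assumptions, not the patch's actual code):

```python
from urllib.parse import urlparse, unquote

def parse_ldap_url(url):
    """Parse the URL subset from the synopsis above:
    ldap://[user[:password]@]host[:port]/basedn[?[attribute][?[scope]]]
    Filters and extensions are not handled, matching the patch's
    stated limitations."""
    parts = urlparse(url)
    if parts.scheme != "ldap":
        raise ValueError("only ldap:// URLs are supported")
    # urlparse puts everything after the first '?' into .query,
    # so 'attribute?scope' still needs one more split.
    attribute, _, scope = parts.query.partition("?")
    return {
        "host": parts.hostname,
        "port": parts.port or 389,        # 389 is the standard LDAP port
        "basedn": unquote(parts.path.lstrip("/")),
        "attribute": attribute or None,
        "scope": scope or "base",         # RFC 4516's default scope
    }
```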
Re: [HACKERS] Materialized views WIP patch
Marko Tiikkaja wrote: On 15/11/2012 03:28, Kevin Grittner wrote: I have been testing the patch a bit Thanks! and I'm slightly disappointed by the fact that it still doesn't solve this problem (and I apologize if I have missed discussion about this in the docs or in this thread): assume foo is a non-empty materialized view T1: BEGIN; T1: LOAD MATERIALIZED VIEW foo; T2: SELECT * FROM foo; T1: COMMIT; T2 sees an empty table As far as I know you are the first to notice this behavior. Thanks for pointing it out. I agree with Robert that we have to be careful about scope creep, but this one looks to me like it should be considered a bug. IMO, LOAD for a populated MV should move it from one state which reflects the captured state of a previous point in time to a captured state which is later, with no invalid or inconsistent states visible in between. I will take a look at the issue; I don't know whether it's something small I can address in this CF or whether it will need to be in the next CF, but I will fix it. 19. LMV doesn't show a row count. It wouldn't be hard to add, it just seemed a little out of place to do that, when CLUSTER, etc., don't. This sounds like a useful feature, but your point about CLUSTER and friends still stands. Other possible arguments against providing a count are: (1) For a populated MV, the LOAD might be replacing the contents with fewer rows than were there before. (2) Once we have incremental updates of the MV, this count is only one of the ways to update the view -- and the others won't show counts. Showing it here might be considered inconsistent. I don't feel strongly about it, and I don't think it's a big change either way; just explaining what got me off the fence when I had to pick one behavior or the other to post the WIP patch. I'll get back when I manage to get a better grasp of the code. Thanks. 
Keep in mind that the current behavior of behaving like a regular view when the contents are invalid is not what I had in mind; that was an accidental effect of commenting out the body of the ExecCheckRelationsValid() function right before posting the patch, because I noticed a regression. When I noticed the current behavior, it struck me that someone might prefer it to the intended behavior of showing an error like this:

ereport(ERROR,
        (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
         errmsg("materialized view \"%s\" has not been populated",
                get_rel_name(rte->relid)),
         errhint("Use the LOAD MATERIALIZED VIEW command.")));

I mention it in case someone wants to argue for silently behaving as a regular view when the MV is not populated. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
David Fetter wrote: On Mon, Nov 26, 2012 at 09:46:33AM -0500, Peter Eisentraut wrote: So, the way I understand it, in Oracle terms, this feature is a snapshot, not a materialized view. Maybe that's what it should be called then. Snapshot is one of the options for refreshing an Oracle materialized view. There are others, which we'll eventually add if past is any prologue :) That's the way I understand it, too. I hate to add to the bike-shedding, but we should probably add REFRESH SNAPSHOT as an optional piece of the grammar, with more REFRESH options to come. I would prefer to leave the syntax for refreshing MVs to a later patch. I'm kind of assuming that SNAPSHOT would be the default if no other options are specified, but I would prefer not to get into too much speculation about add-on patches for fear of derailing this initial effort. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
On Mon, Nov 26, 2012 at 04:34:36PM -0500, Kevin Grittner wrote: David Fetter wrote: On Mon, Nov 26, 2012 at 09:46:33AM -0500, Peter Eisentraut wrote: So, the way I understand it, in Oracle terms, this feature is a snapshot, not a materialized view. Maybe that's what it should be called then. Snapshot is one of the options for refreshing an Oracle materialized view. There are others, which we'll eventually add if past is any prologue :) That's the way I understand it, too. Great :) I hate to add to the bike-shedding, but we should probably add REFRESH SNAPSHOT as an optional piece of the grammar, with more REFRESH options to come. I would prefer to leave the syntax for refreshing MVs to a later patch. I'm kind of assuming that SNAPSHOT would be the default if no other options are specified, but I would prefer not to get into too much speculation about add-on patches for fear of derailing this initial effort. You're right. I withdraw my suggestion until such time as this patch (or descendent) is committed and actual working code implementing other refresh strategies is written. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Andres Freund and...@2ndquadrant.com writes: On 2012-10-31 11:41:37 +0530, Amit Kapila wrote: There seems to be a problem in behavior of Create Index Concurrently and Hot Update in HEAD code . At pgcon.it I had a chance to discuss with Simon how to fix this bug. Please check the attached patches - and their commit messages - for the fix and some regression tests. I looked at this a bit. I am very unhappy with the proposed kluge to open indexes NoLock in some places. Even if that's safe today, which I don't believe in the least, any future change in this area could break it. I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an additional step that somehow marks the pg_index row in a new way that makes RelationGetIndexList ignore it, and then wait out existing transactions before taking the final step of dropping the index. The Right Way to do this would likely have been to add another bool column, but we don't have that option anymore in 9.2. Maybe we could abuse indcheckxmin somehow, ie use a state that can't otherwise occur (in combination with the values of indisvalid/indisready) to denote an index being dropped. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Suggestion for --truncate-tables to pg_restore
On 11/26/2012 12:06:56 PM, Robert Haas wrote: On Wed, Nov 21, 2012 at 12:53 AM, Josh Kupershmidt schmi...@gmail.com wrote: TBH, I didn't find the example above particularly compelling for demonstrating the need for this feature. If you've just got one table with dependent views which needs to be restored, it's pretty easy to manually TRUNCATE and have pg_restore --data-only reload the table. (And easy enough to combine the truncate and restore into a single transaction in case anything goes wrong, if need be.) But I'm willing to grant that this proposed feature is potentially as useful as existing restore-jiggering options like --disable-triggers. And I guess I could see that if you're really stuck having to perform a --data-only restore of many tables, this feature could come in handy. I think I would come down on the other side of this. We've never really been able to get --clean work properly in all scenarios, and it seems likely that a similar fate will befall this option. Where I would like to go with this is to first introduce, as a new patch, an ALTER TABLE option to disable a constraint. Something like ALTER TABLE foo UNVALIDATE CONSTRAINT constraintname; This would mark the constraint NOT VALID, as if the constraint were created with the NOT VALID option. After a constraint is UNVALIDATEd the existing ALTER TABLE foo VALIDATE CONSTRAINT constraintname; feature would turn the constraint on and check the data. With UNVALIDATE CONSTRAINT, pg_restore could first turn off all the constraints concerning tables to be restored, truncate the tables, restore the data, turn the constraints back on and re-validate the constraints. No need to worry about ordering based on a FK referential dependency graph or loops in such a graph (due to DEFERRABLE INITIALLY DEFERRED). This approach would allow the content of a table or tables to be restored regardless of dependent objects or FK references and preserve FK referential integrity. Right? 
I need some guidance here from someone who knows more than I do. There would likely need to be a pg_restore option like --disable-constraints to invoke this functionality, but that can be worked out later. Likewise, I see an update and a delete trigger in pg_trigger associated with the referenced table in REFERENCES constraints, but no trigger for truncate. So making a constraint NOT VALID may not be as easy as it seems. I don't know what the problems are with --clean but I would like to know if this appears to be a promising approach. If so I can pursue it, although I make no promises. (I sent in the --disable-triggers patch because it seemed easy and I'm not sure where a larger project fits into my life.) Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein P.S. An outstanding question regarding --truncate-tables is whether it should drop indexes before truncate and re-create them after restore. Sounds like it should. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
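To illustrate why ordering truncates by the FK dependency graph is fragile — and why disabling constraints sidesteps the problem entirely — here is a sketch of child-first truncate ordering with cycle detection (the graph representation is hypothetical; cycles are exactly the DEFERRABLE/self-reference case mentioned above):

```python
def truncate_order(fk_deps):
    """Given {table: set of tables it references}, return an order in
    which tables can be truncated child-first without violating FK
    constraints, or None if the FK graph has a cycle -- the case that
    makes disabling constraints attractive."""
    order, done, visiting = [], set(), set()

    def visit(t):
        if t in done:
            return True
        if t in visiting:
            return False  # cycle detected
        visiting.add(t)
        # Every table that references t must be truncated before t.
        for other, refs in fk_deps.items():
            if t in refs and not visit(other):
                return False
        visiting.discard(t)
        done.add(t)
        order.append(t)
        return True

    for t in fk_deps:
        if not visit(t):
            return None
    return order
```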
Re: [HACKERS] Removing PD_ALL_VISIBLE
On Mon, 2012-11-26 at 16:10 -0500, Tom Lane wrote: There's still the issue of whether the IOS code is safe in machines with weak memory ordering. I think that it probably is safe, but the argument for it in the current code comment is wrong; most likely, a correct argument has to depend on read/write barriers associated with taking snapshots. I think what you ought to do is work through that, fix the existing comment, and then see whether the same argument works for what you want to do. As a part of the patch, I did change the comment, and here's what I came up with: * Note on Memory Ordering Effects: visibilitymap_test does not lock * the visibility map buffer, and therefore the result we read here * could be slightly stale. However, it can't be stale enough to * matter. * * We need to detect clearing a VM bit due to an insert right away, * because the tuple is present in the index page but not visible. The * reading of the TID by this scan (using a shared lock on the index * buffer) is serialized with the insert of the TID into the index * (using an exclusive lock on the index buffer). Because the VM bit is * cleared before updating the index, and locking/unlocking of the * index page acts as a full memory barrier, we are sure to see the * cleared bit if we see a recently-inserted TID. * * Deletes do not update the index page (only VACUUM will clear out the * TID), so the clearing of the VM bit by a delete is not serialized * with this test below, and we may see a value that is significantly * stale. However, we don't care about the delete right away, because * the tuple is still visible until the deleting transaction commits or * the statement ends (if it's our transaction). In either case, the * lock on the VM buffer will have been released (acting as a write * barrier) after clearing the bit. 
And for us to have a snapshot that * includes the deleting transaction (making the tuple invisible), we * must have acquired ProcArrayLock after that time, acting as a read * barrier. * * It's worth going through this complexity to avoid needing to lock * the VM buffer, which could cause significant contention. And I updated the comment in visibilitymap.c as well (reformatted for this email): To test a bit in the visibility map, most callers should have a pin on the VM buffer, and at least a shared lock on the data buffer. Any process that clears the VM bit must have an exclusive lock on the data buffer, so that will serialize access to the appropriate bit. Because lock acquisition and release are full memory barriers, there is no danger of seeing the state of the bit before it was last cleared. Callers that don't have the data buffer yet, such as an index only scan or a VACUUM that is skipping pages, must handle the concurrency as appropriate. Regards, Jeff Davis
Re: [HACKERS] Further pg_upgrade analysis for many tables
On Wed, Nov 14, 2012 at 10:08:15AM -0500, Bruce Momjian wrote: I agree that parallel restore for schemas is a hard problem. But I didn't mean parallelism within the restore, I meant that we could start both postmasters and pipe the output from dump directly to restore. This way the times for dumping and restoring can overlap. Wow, that is a very creative idea. The current code doesn't do that, but this has the potential of doubling pg_upgrade's speed, without adding a lot of complexity. Here are the challenges of this approach: * I would need to log the output of pg_dumpall as it is passed to psql so users can debug problems * pg_upgrade never runs the old and new clusters at the same time for fear that it will run out of resources, e.g. shared memory, or if they are using the same port number. We can make this optional and force different port numbers. Let me work up a prototype in the next few days and see how it performs. Thanks for the great idea. I have developed the attached proof-of-concept patch to test this idea. Unfortunately, I got poor results:

  pg_upgrade      dump   restore   dmp|res       git   dmp/res
           1      0.12      0.07      0.13     11.16     13.03
        1000      3.80      2.83      5.46     18.78     20.27
        2000      5.39      5.65     13.99     26.78     28.54
        4000     16.08     12.40     28.34     41.90     44.03
        8000     32.77     25.70     57.97     78.61     80.09
       16000     57.67     63.42    134.43    158.49    165.78
       32000    131.84    176.27    302.85    380.11    389.48
       64000    270.37    708.30   1004.39   1085.39   1094.70

The last two columns show the patch didn't help at all, and the third column shows it is just executing the pg_dump, then the restore, not in parallel, i.e. column 1 + column 2 ~= column 3. Testing pg_dump for 4k tables (16 seconds) shows the first row is not output by pg_dump until 15 seconds, meaning there can't be any parallelism with a pipe. (Test script attached.) Does anyone know how to get pg_dump to send some output earlier? In summary, it doesn't seem pg_dump makes any attempt to output its data early. pg_dump.c has some details: /* * And finally we can do the actual output.
 *
 * Note: for non-plain-text output formats, the output file is written
 * inside CloseArchive().  This is, um, bizarre; but not worth changing
 * right now.
 */
if (plainText)
    RestoreArchive(fout);
CloseArchive(fout);

FYI, log_min_duration_statement shows queries taking 11.2 seconds, even without the network overhead --- not sure how that can be optimized. I will now test using PRIMARY KEY and custom dump format with pg_restore --jobs to see if I can get parallelism that way. A further parallelism would be to allow multiple databases to be dumped/restored at the same time. I will test for that once this is done. -- Bruce Momjian  br...@momjian.us  http://momjian.us  EnterpriseDB  http://enterprisedb.com + It's impossible for everything to be true. +

diff --git a/contrib/pg_upgrade/dump.c b/contrib/pg_upgrade/dump.c
new file mode 100644
index 577ccac..dbdf9a5
*** a/contrib/pg_upgrade/dump.c
--- b/contrib/pg_upgrade/dump.c
*** generate_old_dump(void)
*** 24,30 ****
       * restores the frozenid's for databases and relations.
       */
      exec_prog(UTILITY_LOG_FILE, NULL, true,
!               "\"%s/pg_dumpall\" %s --schema-only --binary-upgrade %s -f %s",
                new_cluster.bindir, cluster_conn_opts(old_cluster),
                log_opts.verbose ? "--verbose" : "",
                ALL_DUMP_FILE);
--- 24,30 ----
       * restores the frozenid's for databases and relations.
       */
      exec_prog(UTILITY_LOG_FILE, NULL, true,
!               "\"%s/pg_dumpall\" %s --schema-only --globals-only --binary-upgrade %s -f %s",
                new_cluster.bindir, cluster_conn_opts(old_cluster),
                log_opts.verbose ? "--verbose" : "",
                ALL_DUMP_FILE);
*** generate_old_dump(void)
*** 47,63 ****
  void
  split_old_dump(void)
  {
!     FILE       *all_dump,
!                *globals_dump,
!                *db_dump;
!     FILE       *current_output;
      char        line[LINE_ALLOC];
      bool        start_of_line = true;
      char        create_role_str[MAX_STRING];
      char        create_role_str_quote[MAX_STRING];
      char        filename[MAXPGPATH];
-     bool        suppressed_username = false;
-
      /*
       * Open all files in binary mode to avoid line end translation on Windows,
--- 47,58 ----
  void
  split_old_dump(void)
  {
!     FILE       *all_dump,
!                *globals_dump;
      char        line[LINE_ALLOC];
      bool        start_of_line = true;
      char        create_role_str[MAX_STRING];
      char        create_role_str_quote[MAX_STRING];
      char        filename[MAXPGPATH];
      /*
       * Open all files in binary mode to avoid line end translation on Windows,
*** split_old_dump(void)
*** 70,80 ****
      snprintf(filename,
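Bruce's overlap idea above (start both clusters and pipe the dump straight into the restore) can be sketched with two cooperating processes. This is only an illustration of the piping pattern, not pg_upgrade's actual invocation; the tiny producer and consumer below stand in for pg_dumpall and psql:

```python
import subprocess
import sys

# Stand-in for "pg_dumpall | psql": the producer's stdout feeds the
# consumer's stdin directly, so the two phases can overlap in time.
producer = subprocess.Popen(
    [sys.executable, "-c", "print('CREATE TABLE t (i int);')"],
    stdout=subprocess.PIPE)
consumer = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"],
    stdin=producer.stdout, stdout=subprocess.PIPE)
producer.stdout.close()  # so the consumer sees EOF when the producer exits
restored, _ = consumer.communicate()
print(restored.decode().strip())
```

As the timings above show, the pattern only wins if the producer emits output early; pg_dump's sort-then-emit behavior means the consumer sits idle until the dump is nearly done.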
Re: [HACKERS] New statistics for WAL buffer dirty writes
Satoshi Nagayasu escribió: I attached the latest one, which splits the reset_time for bgwriter and walwriter, and provides new system view, called pg_stat_walwriter, to show the dirty write counter and the reset time. Thanks. I gave this a look and I have a couple of comments: 1. The counter is called dirty_write. I imagine that this comes directly from the name of the nearby DTrace probe, TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_START. That probe comes from email 494c1565.3060...@sun.com committed in 4ee79fd20d9a. But there wasn't much discussion about the name back then. Maybe that was fine at the time because it was pretty much an internal matter, being so deep in the code. But we're now going to expose it to regular users, so we'd better choose a very good name because we're going to be stuck with it for a very long time. And the name dirty doesn't ring with me too well; what matters here is not that we're writing a buffer that is dirty, but the fact that we're writing while holding the WalInsertLock, so the name should convey the fact that this is a locked write or something like that. Or at least that's how I see the issue. Note the documentation wording:

+      <entry><structfield>dirty_writes</structfield></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of dirty writes, which means flushing wal buffers
+      because of its full.</entry>

2. Should we put bgwriter and walwriter data in separate structs? I don't think this is necessary, but maybe someone opines differently? 3.

+/*
+ * WalWriter global statistics counter.
+ * Despite its name, this counter is actually used not only in walwriter,
+ * but also in each backend process to sum up xlog dirty writes.
+ * Those processes would increment this counter in each XLogWrite call,
+ * then send it to the stat collector process.
+ */
+PgStat_MsgWalWriter WalWriterStats;

Maybe we should use a different name for the struct, to avoid having to excuse ourselves for the name being wrong ...
-- Álvaro Herrera                http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Suggestion for --truncate-tables to pg_restore
On Mon, Nov 26, 2012 at 4:51 PM, Karl O. Pinc k...@meme.com wrote: Where I would like to go with this is to first introduce, as a new patch, an ALTER TABLE option to disable a constraint. Something like ALTER TABLE foo UNVALIDATE CONSTRAINT constraintname; This doesn't really make sense, because constraints that are not validated are still enforced for new rows. This thus wouldn't save anything performance-wise. We should perhaps have two more states: not enforced but blindly assumed true, and not enforced and not assumed true either. But currently, we don't. I don't know what the problems are with --clean but I would like to know if this appears a promising approach. If so I can pursue it, although I make no promises. (I sent in the --disable-triggers patch because it seemed easy and I'm not sure where a larger project fits into my life.) I'm not really sure what the issues were any more; but I think they may have had to do with dependencies between different objects messing things up, which I think is likely to be a problem for this proposal as well. P.S. An outstanding question regarding --truncate-tables is whether it should drop indexes before truncate and re-create them after restore. Sounds like it should. Well, that would improve performance, but it also makes the behavior of the object significantly different from what one might expect from the name. One of the problems here is that there seem to be a number of slightly-different things that one might want to do, and it's not exactly clear what all of them are, or whether a reasonable number of options can cater to all of them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
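Robert's four-state space (enforced or not, assumed valid or not) can be written out as a toy model. Only the first two states exist in PostgreSQL today; the other two, and all of the names below, are hypothetical:

```python
from collections import namedtuple

# Hypothetical model of constraint states.  VALIDATED and NOT_VALID
# correspond to what PostgreSQL has; DISABLED_TRUST and DISABLED are
# the two missing states Robert describes.
ConstraintState = namedtuple("ConstraintState", "enforced assumed_valid")

VALIDATED      = ConstraintState(enforced=True,  assumed_valid=True)   # normal
NOT_VALID      = ConstraintState(enforced=True,  assumed_valid=False)  # ADD ... NOT VALID
DISABLED_TRUST = ConstraintState(enforced=False, assumed_valid=True)   # missing today
DISABLED       = ConstraintState(enforced=False, assumed_valid=False)  # missing today

def must_check_new_row(state):
    # A new row is checked whenever the constraint is enforced, which
    # is why NOT VALID saves nothing on inserts or restores.
    return state.enforced
```

The model makes the thread's point directly: only a truly not-enforced state would speed up a bulk reload.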
Re: [HACKERS] Removing PD_ALL_VISIBLE
On Mon, Nov 26, 2012 at 3:03 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Nov 26, 2012 at 3:29 PM, Jeff Davis pg...@j-davis.com wrote: Your intuition here is better than mine, but I am still missing something here. If we keep the buffer pinned, then there will be very few pin/unpin cycles here, so I don't see where the contention would come from (any more than there is contention pinning the root of an index). Based on previous measurements, I think there *is* contention pinning the root of an index. Currently, I believe it's largely overwhelmed by contention from other sources, such as the buffer manager lwlocks and the very-evil ProcArrayLock. However, I believe that as we fix those problems, this will start to percolate up towards the top of the heap. Yup -- it (buffer pin contention on high traffic buffers) has been caught in the wild -- just maintaining the pin count was enough to do it in at least one documented case. Pathological workloads demonstrate contention today and there's no good reason to assume it's limited to index root nodes -- I'm strongly suspicious that buffer spinlock issues are behind some other malfeasance we've seen recently. merlin
Re: [HACKERS] Failing SSL connection due to weird interaction with openssl
Lars Kanis wrote: While investigating a ruby-pg issue [1], we noticed that a libpq SSL connection can fail if the running application uses OpenSSL for other work, too. The root cause is the thread-local error queue of OpenSSL, which is used to transmit textual error messages to the application after a failed crypto operation. In case the application leaves errors on the queue, the communication to the PostgreSQL server can fail with a message left over from a previous failed OpenSSL operation, in particular when using non-blocking operations on the socket. This issue with openssl is quite old now - see [3]. I gather that this is supposed to be back-patched to all supported branches. [3] http://www.educatedguesswork.org/movabletype/archives/2005/03/curse_you_opens.html This link is dead. Here's one that works: http://www.educatedguesswork.org/2005/03/curse_you_opens.html -- Álvaro Herrera                http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Further pg_upgrade analysis for many tables
Bruce Momjian br...@momjian.us writes: Testing pg_dump for 4k tables (16 seconds) shows the first row is not output by pg_dump until 15 seconds, meaning there can't be any parallelism with a pipe. (Test script attached.) Does anyone know how to get pg_dump to send some output earlier? You can't. By the time it knows what order to emit the objects in, it's done all the preliminary work you're griping about. (In a dump with data, there would be a meaningful amount of computation remaining, but not in a schema-only dump.) I will now test using PRIMARY KEY and custom dump format with pg_restore --jobs to see if I can get parallelism that way. This seems likely to be a waste of effort for the same reason: you only get meaningful parallelism when there's a substantial data component to be restored. regards, tom lane
Re: [HACKERS] Materialized views WIP patch
On Mon, 2012-11-26 at 09:46 -0500, Peter Eisentraut wrote: On 11/14/12 9:28 PM, Kevin Grittner wrote: 17. Since the data viewed in an MV is not up-to-date with the latest committed transaction, So, the way I understand it, in Oracle terms, this feature is a snapshot, not a materialized view. Maybe that's what it should be called then. OK, I take everything back and claim the opposite. In current Oracle, SNAPSHOT is an obsolete alias for MATERIALIZED VIEW. Materialized views have the option of REFRESH ON DEMAND and REFRESH ON COMMIT, with the former being the default. So it seems that the syntax of what you are proposing is in line with Oracle. I'm not fond of overloading LOAD as the refresh command. Maybe you could go the Oracle route here as well and use a stored procedure. That would also allow things like SELECT pg_refresh_mv(oid) FROM ... more easily.
Re: [HACKERS] Suggestion for --truncate-tables to pg_restore
On Mon, Nov 26, 2012 at 3:42 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Nov 26, 2012 at 4:51 PM, Karl O. Pinc k...@meme.com wrote: P.S. An outstanding question regarding --truncate-tables is whether it should drop indexes before truncate and re-create them after restore. Sounds like it should. Well, that would improve performance, but it also makes the behavior of the object significantly different from what one might expect from the name. One of the problems here is that there seem to be a number of slightly-different things that one might want to do, and it's not exactly clear what all of them are, or whether a reasonable number of options can cater to all of them. Another problem: attempting to drop a unique constraint or primary key (if we're counting these as indexes to be dropped and recreated, which they should be if the goal is reasonable restore performance) which is referenced by another table's foreign key will cause: ERROR: cannot drop constraint xxx on table yyy because other objects depend on it and as discussed upthread, it would be impolite for pg_restore to presume it should monkey with dropping+recreating other tables' constraints to work around this problem, not to mention impossible when pg_restore is not connected to the target database. It is a common administrative task to selectively restore some existing tables' contents from a backup, and IIRC was the impetus for this patch. Instead of adding a bunch of options to pg_restore, perhaps a separate tool specific to this task would be the way to go. It could handle the minutiae of truncating, dropping and recreating constraints and indexes of the target tables, and dealing with FKs sensibly, without worrying about conflicts with existing pg_restore options and behavior. Josh
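The statement ordering such a tool would have to compute can be sketched as follows. The table, constraint, and index names are invented for illustration, and the dependency handling Josh describes (FKs referencing the truncated table) is reduced to a flat list:

```python
def restore_plan(table, referencing_fks, indexes):
    """Order of operations for refreshing one table's contents from a
    backup.  Referencing FKs must be dropped first, or the DROP/TRUNCATE
    fails with "cannot drop constraint ... because other objects depend
    on it" as described in the thread above."""
    plan = []
    for child, fk in referencing_fks:          # FKs pointing at `table`
        plan.append(f"ALTER TABLE {child} DROP CONSTRAINT {fk};")
    for idx in indexes:                        # plain indexes on `table`
        plan.append(f"DROP INDEX {idx};")
    plan.append(f"TRUNCATE {table};")
    plan.append(f"-- load {table}'s data from the backup here")
    for idx in indexes:                        # rebuild after the load
        plan.append(f"-- recreate index {idx}")
    for child, fk in referencing_fks:          # revalidate last
        plan.append(f"-- recreate FK {fk} on {child}")
    return plan

plan = restore_plan("person",
                    [("account", "account_person_id_fkey")],
                    ["person_name_idx"])
```

A real tool would also have to handle primary keys (droppable only via ALTER TABLE ... DROP CONSTRAINT) and decide whether recreated FKs should be validated against the restored data.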
Re: [HACKERS] Failing SSL connection due to weird interaction with openssl
Alvaro Herrera alvhe...@2ndquadrant.com writes: I gather that this is supposed to be back-patched to all supported branches. FWIW, I don't like that patch any better than Robert does. It seems as likely to do harm as good. If there are places where libpq itself is leaving entries on the error stack, we should fix them -- retail. But it's not our business to clobber global state because there might be bugs in some other part of an application. regards, tom lane
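The failure mode under discussion (libpq reporting an error that some other component left on OpenSSL's thread-local queue) can be modeled abstractly. The toy queue below is purely illustrative and uses no real OpenSSL API:

```python
# Toy model of OpenSSL's thread-local error queue: a shared list that
# any component in the process may push onto and forget to drain.
error_queue = []

def careless_component():
    # Some unrelated crypto operation fails and leaves its error behind.
    error_queue.append("bad decrypt")

def report_ssl_error(operation_failed):
    # Misbehaving pattern: after an ambiguous result, report whatever is
    # on the queue -- even though it may belong to someone else entirely.
    if operation_failed and error_queue:
        return error_queue[-1]
    return "no error recorded"

careless_component()
stale = report_ssl_error(operation_failed=True)  # attributes a foreign error
```

Tom's "fix them retail" position corresponds to each component draining only the entries it pushed, rather than one component clearing the whole shared queue.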
Re: [HACKERS] Suggestion for --truncate-tables to pg_restore
On 11/26/2012 08:45:08 PM, Josh Kupershmidt wrote: On Mon, Nov 26, 2012 at 3:42 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Nov 26, 2012 at 4:51 PM, Karl O. Pinc k...@meme.com wrote: P.S. An outstanding question regards --truncate-tables is whether it should drop indexes before truncate and re-create them after restore. Sounds like it should. Well, that would improve performance, but it also makes the behavior of object significantly different from what one might expect from the name. One of the problems here is that there seem to be a number of slightly-different things that one might want to do, and it's not exactly clear what all of them are, or whether a reasonable number of options can cater to all of them. Another problem: attempting to drop a unique constraint or primary key (if we're counting these as indexes to be dropped and recreated, which they should be if the goal is reasonable restore performance) which is referenced by another table's foreign key will cause: ERROR: cannot drop constraint xxx on table yyy because other objects depend on it and as discussed upthread, it would be impolite for pg_restore to presume it should monkey with dropping+recreating other tables' constraints to work around this problem, not to mention impossible when pg_restore is not connected to the target database. I'm thinking impossible because it's impossible to know what the existing FKs are without a db connection. Impossible is a problem. You may have another reason why it's impossible. It is a common administrative task to selectively restore some existing tables' contents from a backup, and IIRC was the impetus for this patch. Yes. (And aside from listing tables individually it'd be nice to restore tables per schema.) It's also a bit surprising that restoring table content is so hard/unsupported, given a db of more than minimal complexity. Instead of adding a bunch of options to pg_restore, perhaps a separate tool specific to this task would be the way to go. 
It could handle the minutiae of truncating, dropping and recreating constraints and indexes of the target tables, and dealing with FKs sensibly, without worrying about conflicts with existing pg_restore options and behavior. Per above, the tool would then either require a db connection or at least a dump which contains the system catalogs. I'm afraid I don't have a clear picture of what such a tool would look like, if it does not look a lot like pg_restore. I would like to have such a tool. I'm not certain how much I'd be able to contribute toward making one. Meanwhile it sounds like the --truncate-tables patch is looking less and less desirable. I'm ready for rejection, but will soldier on in the interest of not wasting other people's work on this, if given direction to move forward. Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
I wrote: I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an additional step that somehow marks the pg_index row in a new way that makes RelationGetIndexList ignore it, and then wait out existing transactions before taking the final step of dropping the index. The Right Way to do this would likely have been to add another bool column, but we don't have that option anymore in 9.2. Maybe we could abuse indcheckxmin somehow, ie use a state that can't otherwise occur (in combination with the values of indisvalid/indisready) to denote an index being dropped. I looked into this some more. There are currently three possible states of the indisvalid/indisready flags:

indisvalid = false, indisready = false
    initial state during CREATE INDEX CONCURRENTLY; this should result in
    sessions honoring the index for HOT-safety decisions, but not using it
    for searches or insertions

indisvalid = false, indisready = true
    sessions should now insert into the index, but not use it for searches

indisvalid = true, indisready = true
    normal, fully valid index

Either state of indcheckxmin is valid with all three of these combinations, so the specific kluge I was contemplating above doesn't work. But there is no valid reason for an index to be in this state:

indisvalid = true, indisready = false

I suggest that to fix this for 9.2, we could abuse these flags by defining that combination as meaning ignore this index completely, and having DROP INDEX CONCURRENTLY set this state during its final wait before removing the index. Obviously, an additional flag column would be a much cleaner fix, and I think we should add one in HEAD. But it's too late to do that in 9.2. regards, tom lane
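Tom's state table, with the proposed reuse of the otherwise-impossible combination, can be summarized as a small lookup. The descriptions paraphrase the email, and the helper names are invented here, not PostgreSQL's:

```python
# Map (indisvalid, indisready) to how sessions should treat the index.
# The (True, False) entry is the combination Tom proposes to reuse for
# DROP INDEX CONCURRENTLY in 9.2.
INDEX_STATES = {
    (False, False): "honor for HOT-safety only (initial CREATE INDEX CONCURRENTLY)",
    (False, True):  "insert into, but do not use for searches",
    (True,  True):  "fully valid: search and insert",
    (True,  False): "ignore completely (being dropped concurrently)",
}

def usable_for_search(indisvalid, indisready):
    # Only a fully valid index may serve queries.
    return indisvalid and indisready

def must_maintain_on_insert(indisvalid, indisready):
    # Insertions maintain any index marked ready, valid or not.
    return indisready
```

Written this way, it is easy to see why a fourth boolean column (or a single state field, as proposed downthread) would be cleaner than overloading the flag pair.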
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund and...@2ndquadrant.com writes: On 2012-10-05 19:56:40 -0400, Tom Lane wrote: I think this could possibly be fixed by using nontransactional update-in-place when we're trying to change indisvalid and/or indisready, but I've not really thought through the details. I couldn't really think of any realistic method to fix this other than update in place. I thought about it for a while and I think it should work, but I have to say it makes me slightly uneasy. I looked through the code a bit, and I think we should be able to make this work, but note there are also places that update indcheckxmin using heap_update, and that's just as dangerous as changing the other two flags via heap_update, if you don't have exclusive lock on the table. This is going to need some careful thought, because we currently expect that such an update will set the pg_index row's xmin to the current XID, which is something an in-place update can *not* do. I think this is a non-problem during construction of a new index, since the xmin ought to be the current XID already anyway, but it's less clear what to do in REINDEX. In the short term there may be no problem since REINDEX takes exclusive lock on the parent table anyway (and hence could safely do a transactional update) but if we ever want REINDEX CONCURRENTLY we'll need a better answer. regards, tom lane
Re: [HACKERS] Suggestion for --truncate-tables to pg_restore
On 11/26/2012 09:30:48 PM, Karl O. Pinc wrote: On 11/26/2012 08:45:08 PM, Josh Kupershmidt wrote: It is a common administrative task to selectively restore some existing tables' contents from a backup, and IIRC was the impetus for this patch. Yes. (And aside from listing tables individually it'd be nice to restore tables per schema.) As long as I'm daydreaming it'd be nice to be able to restore a table, data and schema, and have available the various combinations of: new table name, different owner, different schema, different db. Without having to edit a file by hand. Of course I've not done the brain work involved in figuring out just what this would mean in terms of related objects like triggers, constraints, indexes and so forth. But then who doesn't want a pony? :-) Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein
Re: [HACKERS] [WIP] pg_ping utility
On Mon, 2012-11-26 at 13:14 -0500, Bruce Momjian wrote: I would normally agree with this analysis, but pg_ctl -w certainly needs this ping functionality, so it kind of makes sense that others might need it too. Sure, PQping is useful for this very specific use case of seeing whether the server has finished starting up. If someone came up with a plausible use case for a startup script that couldn't use pg_ctl but wanted ping functionality available in a shell script, then pg_ping could be provided. But that would also determine what options to provide. For example, we might not need repeated ping in that case. But I think people see PQping and will see pg_ping as a monitoring facility, and I think that's a mistake.
Re: [HACKERS] Transform mapped XPath expressions into XML containing relational data
On Mon, 2012-11-26 at 09:26 -0800, Thangalin wrote: Is it possible to build an XML (or JSON) document using a map of database columns to XPath expressions? For example: rootpeople person person

person.first_name - name/first
person.last_name  - name/last
person.age        - [@age]
account.person_id = person.person_id
account           - person/account
account.number    - [@id]

I would go about this by using table_to_xml or query_to_xml and then converting the resulting XML document to your particular format using XSLT.
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On Tue, Nov 27, 2012 at 12:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an additional step that somehow marks the pg_index row in a new way that makes RelationGetIndexList ignore it, and then wait out existing transactions before taking the final step of dropping the index. The Right Way to do this would likely have been to add another bool column, but we don't have that option anymore in 9.2. Maybe we could abuse indcheckxmin somehow, ie use a state that can't otherwise occur (in combination with the values of indisvalid/indisready) to denote an index being dropped. I looked into this some more. There are currently three possible states of the indisvalid/indisready flags: indisvalid = false, indisready = false initial state during CREATE INDEX CONCURRENTLY; this should result in sessions honoring the index for HOT-safety decisions, but not using it for searches or insertions indisvalid = false, indisready = true sessions should now insert into the index, but not use it for searches indisvalid = true, indisready = true normal, fully valid index Either state of indcheckxmin is valid with all three of these combinations, so the specific kluge I was contemplating above doesn't work. But there is no valid reason for an index to be in this state: indisvalid = true, indisready = false I suggest that to fix this for 9.2, we could abuse these flags by defining that combination as meaning ignore this index completely, and having DROP INDEX CONCURRENTLY set this state during its final wait before removing the index. Obviously, an additional flag column would be a much cleaner fix, and I think we should add one in HEAD. But it's too late to do that in 9.2. For 9.2 as you say the best choice is to ignore in RelationGetIndexList the indexes that are valid and not ready when fetching the index list of a relation. For the fix in master, just thinking but... 
Having 3 boolean flags to manage the state of an index might easily lead the developer into confusion. I was thinking about the possibility of using instead a list of states that are defined with a character. We already know 3 possible states, which are:

- indisvalid = false, indisready = false, let's say INDEX_IS_INITIAL 'i'
- indisvalid = false, indisready = true, with INDEX_IS_READY 'r'
- indisvalid = true, indisready = true, with INDEX_IS_VALID 'v'

In case an index needs to be ignored as you mention with the combination indisvalid = true and indisready = false, it could be possible to add a new state like INDEX_IS_IGNORE 'g'. This would avoid the addition of a new column in pg_index and control the status of an index easily. This is not that much backward-compatible though... -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] Materialized views WIP patch
On 27 November 2012, Peter Eisentraut wrote: On Mon, 2012-11-26 at 09:46 -0500, Peter Eisentraut wrote: On 11/14/12 9:28 PM, Kevin Grittner wrote: 17. Since the data viewed in an MV is not up-to-date with the latest committed transaction, So, the way I understand it, in Oracle terms, this feature is a snapshot, not a materialized view. Maybe that's what it should be called then. OK, I take everything back and claim the opposite. In current Oracle, SNAPSHOT is an obsolete alias for MATERIALIZED VIEW. Materialized views have the option of REFRESH ON DEMAND and REFRESH ON COMMIT, with the former being the default. So it seems that the syntax of what you are proposing is in line with Oracle. I'm not fond of overloading LOAD as the refresh command. Maybe you could go the Oracle route here as well and use a stored procedure. That would also allow things like SELECT pg_refresh_mv(oid) FROM ... more easily. +1 to this. I can see a use case where you might want to refresh all MVs that are X number of days/hours old, rather than having to execute statements for each one. Something like pg_refresh_mv() within a query would allow this. Pretty exciting work Kevin, I understand what Robert said about feature creep etc and agree 100%, but I'm really looking forward to when we can *one day* have the planner make use of an eager MV to optimise a query! Regards David Rowley
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Either state of indcheckxmin is valid with all three of these combinations, so the specific kluge I was contemplating above doesn't work. But there is no valid reason for an index to be in this state: indisvalid = true, indisready = false I suggest that to fix this for 9.2, we could abuse these flags by defining that combination as meaning ignore this index completely, and having DROP INDEX CONCURRENTLY set this state during its final wait before removing the index. Yeah, this looks much better, given our inability to add a new catalog column in 9.2. Can we cheat a little though and use a value other than 0 and 1 for indisvalid or indisready to tell the server to interpret it differently? I would assume not, but can't see a reason unless these values are converted to other types and back to boolean. Andres complained about the fact that many callers of RelationGetIndexList are probably not ready to process invalid or not-yet-ready indexes and suggested that those changes should be backpatched to even older releases. But IMO we should do that with a test case that demonstrates that there is indeed a bug. Also, we should teach RelationGetIndexList to take a flags argument and filter out indexes that the caller is not interested in, instead of having every caller do the checks separately. Thanks, Pavan
[HACKERS] ilist.h fails cpluspluscheck
In file included from ./src/include/utils/catcache.h:25:0,
                 from /tmp/cpluspluscheck.bt8VZr/test.cpp:3:
src/include/lib/ilist.h: In function ‘dlist_node* dlist_head_node(dlist_head*)’:
src/include/lib/ilist.h:470:39: error: invalid conversion from ‘void*’ to ‘dlist_node*’ [-fpermissive]
src/include/lib/ilist.h: In function ‘dlist_node* dlist_tail_node(dlist_head*)’:
src/include/lib/ilist.h:487:39: error: invalid conversion from ‘void*’ to ‘dlist_node*’ [-fpermissive]
In file included from ./src/include/utils/catcache.h:25:0,
                 from /tmp/cpluspluscheck.bt8VZr/test.cpp:3:
src/include/lib/ilist.h: In function ‘slist_node* slist_head_node(slist_head*)’:
src/include/lib/ilist.h:680:39: error: invalid conversion from ‘void*’ to ‘slist_node*’ [-fpermissive]
In file included from /tmp/cpluspluscheck.bt8VZr/test.cpp:3:0:
./src/include/lib/ilist.h: In function ‘dlist_node* dlist_head_node(dlist_head*)’:
./src/include/lib/ilist.h:470:39: error: invalid conversion from ‘void*’ to ‘dlist_node*’ [-fpermissive]
./src/include/lib/ilist.h: In function ‘dlist_node* dlist_tail_node(dlist_head*)’:
./src/include/lib/ilist.h:487:39: error: invalid conversion from ‘void*’ to ‘dlist_node*’ [-fpermissive]
In file included from /tmp/cpluspluscheck.bt8VZr/test.cpp:3:0:
./src/include/lib/ilist.h: In function ‘slist_node* slist_head_node(slist_head*)’:
./src/include/lib/ilist.h:680:39: error: invalid conversion from ‘void*’ to ‘slist_node*’ [-fpermissive]

Maybe some #ifndef __cplusplus would help.
Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Pavan Deolasee pavan.deola...@gmail.com writes:
> On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Either state of indcheckxmin is valid with all three of these combinations, so the specific kluge I was contemplating above doesn't work. But there is no valid reason for an index to be in this state: indisvalid = true, indisready = false
>
> Yeah, this looks much better, given our inability to add a new catalog column in 9.2. Can we cheat a little though and use a value other than 0 and 1 for indisvalid or indisready to tell the server to interpret it differently?

No, not unless you'd like "select * from pg_index" to misbehave. Personally, I like being able to look at catalogs.

> Andres complained about the fact that many callers of RelationGetIndexList are probably not ready to process invalid or not-yet-ready indexes and suggested that those changes should be backpatched to even older releases. But IMO we should do that with a test case that demonstrates that there is indeed a bug. Also, we should teach RelationGetIndexList to take a flags argument and filter out indexes that the caller is not interested in, instead of every caller doing the checks separately.

We can't really change the API of RelationGetIndexList in the back branches, as it's just about certain there is third-party code calling it. I'm dubious about the advantages of such prefiltering anyway, as:

(1) That would mean that every change to indisvalid/indisready would require forcing a relcache flush on the parent table. (Either that, or RelationGetIndexList does pg_index lookups internally, which would mostly defeat the point of caching the index list at all.)

(2) The big picture here is that it's silly to optimize for any case other than fully valid indexes, because anything else can be expected to be a low-probability transient state. So generally we might as well have RelationGetIndexList return all the indexes and let callers filter out any that happen to be inappropriate for their usage.
regards, tom lane
Re: [HACKERS] ilist.h fails cpluspluscheck
Peter Eisentraut pete...@gmx.net writes:
> In file included from ./src/include/utils/catcache.h:25:0,
>                  from /tmp/cpluspluscheck.bt8VZr/test.cpp:3:
> src/include/lib/ilist.h: In function ‘dlist_node* dlist_head_node(dlist_head*)’:
> src/include/lib/ilist.h:470:39: error: invalid conversion from ‘void*’ to ‘dlist_node*’ [-fpermissive]
> Maybe some #ifndef __cplusplus would help.

Or maybe we need to recommend use of -fpermissive? If C++ thinks casting void * to something else is illegitimate, it's basically not going to cope with most things we might try to inline in Postgres. And I don't think that saying "you don't get to call these fundamental support functions from C++" is likely to fly, so just hiding the functions won't help much.

regards, tom lane