Re: [HACKERS] Convert pltcl from strings to objects
On 2/25/16 9:30 AM, Alvaro Herrera wrote: Jim Nasby wrote: Here we have another case. prodesc is a global thing, and it is shared between different operations. The problem is that there is no particular owner, and we have to wait until the last operation that deals with it finishes. It looks like a perfect job for reference counting. I've just tried to wrap my head around what's going on with prodesc and failed... specifically, I don't understand this claim in the comment: * Add the proc description block to the hashtable. Note we do not * attempt to free any previously existing prodesc block. !!This is * annoying, but necessary since there could be active calls using * the old prodesc.!! What else could be referencing it? I realize it's stored in pltcl_proc_htab, but AFAICT that's backend-local, so I don't understand what else could be referencing it. Try to open a cursor that uses the function and fetch a few tuples from it; then change the function and fetch more rows from the cursor. I suppose the open cursor could contain a reference to the function's prodesc. Refcounting the prodesc would let it live until the cursor's closed, then free it. Hadn't thought about cursors; I suspect you're right about that. I wonder if other PLs handle that correctly. I'm also not sure how the reference would get decremented... via ResourceOwner somehow? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
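The refcounting lifecycle being proposed above can be sketched abstractly. The following Python sketch is purely illustrative: `ProDesc`, the hashtable dict, and the cursor "pin" are stand-ins for pltcl's actual C structures, not real code from the tree.

```python
class ProDesc:
    """Hypothetical stand-in for pltcl's per-function prodesc block."""
    def __init__(self, name):
        self.name = name
        self.refcount = 0   # number of active calls/cursors using this block
        self.freed = False

def acquire(desc):
    """An active call or open cursor pins the prodesc it was planned with."""
    desc.refcount += 1
    return desc

def release(desc, htab):
    """Drop one reference; free the block only when nothing uses it anymore
    AND the hashtable no longer points at this version (i.e. the function
    was redefined while the reference was held)."""
    desc.refcount -= 1
    if desc.refcount == 0 and htab.get(desc.name) is not desc:
        desc.freed = True

# Simulate the cursor scenario from the discussion: a cursor pins the old
# prodesc, the function is redefined mid-fetch, then the cursor finishes
# and only then can the orphaned block be freed.
htab = {}
old = htab["f"] = ProDesc("f")
cursor_pin = acquire(old)        # open cursor references the old version
new = htab["f"] = ProDesc("f")   # CREATE OR REPLACE installs a new prodesc
release(cursor_pin, htab)        # cursor closes -> old block can be freed
```

The open question in the thread (how the decrement happens, e.g. via ResourceOwner) is exactly the `release` step above.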
Re: [HACKERS] Sanity checking for ./configure options?
On 2/26/16 9:34 AM, Ivan Kartyshov wrote: The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: tested, failed Spec compliant: tested, failed Documentation: tested, failed Tested, I think it's rather important to make cleanup work on that project. Did you mean to mark all those items as tested, failed? On another note, the other use case for allowing 1-1024 is if you run with listen_addresses=''. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sanity checking for ./configure options?
On 2/26/16 9:29 PM, Peter Eisentraut wrote: To make this really robust, you might need to do pattern matching on the value. Yeah, and I don't see any reasonable way to do that... we don't require sed or the like, do we? I'll look at the other things you mentioned. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Improve error handling in pltcl
Per discussion in [1], this patch improves error reporting in pltcl. pltcl_error_objects.patch applies on top of the pltcl_objects_2.patch referenced in [2]. pltcl_error_master.patch applies against current master. [1] http://www.postgresql.org/message-id/20160223150401.2173d...@wagner.wagner.home [2] http://www.postgresql.org/message-id/56cce7d2.9090...@bluetreble.com -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/pltcl.sgml b/doc/src/sgml/pltcl.sgml index d2175d5..d5c576d 100644 --- a/doc/src/sgml/pltcl.sgml +++ b/doc/src/sgml/pltcl.sgml @@ -775,6 +775,127 @@ CREATE EVENT TRIGGER tcl_a_snitch ON ddl_command_start EXECUTE PROCEDURE tclsnit + +Error Handling in PL/Tcl + + + error handling + in PL/Tcl + + + + All Tcl errors that are allowed to propagate back to the top level of the + interpreter, that is, errors not caught within the stored procedure + using the Tcl catch command will raise a database + error. + + + Tcl code within or called from the stored procedure can choose to + raise a database error by invoking the elog + command provided by PL/Tcl or by generating an error using the Tcl + error command and not catching it with Tcl's + catch command. + + + Database errors that occur from the PL/Tcl stored procedure's + use of spi_exec, spi_prepare, + and spi_execp are also catchable by Tcl's + catch command. + + + Tcl provides an errorCode variable that can + represent additional information about the error in a form that + is easy for programs to interpret. The contents are in Tcl list + format and the first word identifies the subsystem or + library responsible for the error and beyond that the contents are left + to the individual code or library. 
For example if Tcl's + open command is asked to open a file that doesn't + exist, errorCode + might contain POSIX ENOENT {no such file or directory} + where the third element may vary by locale but the first and second + will not. + + When spi_exec, spi_prepare + or spi_execp cause a database error to be raised, + that database error propagates back to Tcl as a Tcl error. + In this case errorCode is set to a list + where the first element is POSTGRES followed by a + copious decoding of the Postgres error structure. Since fields in the + structure may or may not be present depending on the nature of the + error, how the function was invoked, etc., PL/Tcl has adopted the + convention that subsequent elements of the errorCode + list are key-value pairs where the first value is the name of the + field and the second is its value. + + Fields that may be present include message, + detail, detail_log, + hint, domain, + context_domain, context, + schema, table, + column, datatype, + constraint, cursor_position, + internalquery, internal_position, + filename, lineno and + funcname. + + You might find it useful to load the results into an array. Code + for doing that might look like + +if {[lindex $errorCode 0] == "POSTGRES"} { +array set errorRow [lrange $errorCode 1 end] +} + + + + In the example below we cause an error by attempting to + SELECT from a table that doesn't exist. + +select tcl_eval('spi_exec "select * from foo;"'); + + + +ERROR: relation "foo" does not exist + + + + + Now we examine the error code. (The double-colons explicitly + reference errorCode as a global variable.)
+ +select tcl_eval('join $::errorCode "\n"'); + + + + tcl_eval +--- + POSTGRES + + message + + relation "foo" does not exist+ + domain + + postgres-9.6 + + context_domain + + postgres-9.6 + + cursorpos+ + 0+ + internalquery+ + select * from foo; + + internalpos + + 15 + + filename + + parse_relation.c + + lineno + + 1159 + + funcname + + parserOpenTable +(1 row) + + + + + Modules and the unknown Command diff --git a/src/pl/tcl/expected/pltcl_setup.out b/src/pl/tcl/expected/pltcl_setup.out index 4183c14..0a9f9f4 100644 --- a/src/pl/tcl/expected/pltcl_setup.out +++ b/src/pl/tcl/expected/pltcl_setup.out @@ -542,3 +542,44 @@ NOTICE: tclsnitch: ddl_command_start DROP TABLE NOTICE: tclsnitch: ddl
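The key-value errorCode convention described in the patch above is easy to consume programmatically; `array set errorRow [lrange $errorCode 1 end]` in the Tcl example does exactly this. As an illustrative sketch in Python rather than Tcl (with field values borrowed from the example output in the patch), the decoding amounts to:

```python
# Hypothetical errorCode list, shaped the way the patch says PL/Tcl sets it
# for the "relation does not exist" example.
error_code = [
    "POSTGRES",
    "message", 'relation "foo" does not exist',
    "filename", "parse_relation.c",
    "lineno", "1159",
    "funcname", "parserOpenTable",
]

def decode_error_code(ec):
    """Turn the key-value tail of a POSTGRES errorCode list into a dict.
    Returns None for errors from other subsystems (e.g. POSIX)."""
    if not ec or ec[0] != "POSTGRES":
        return None
    tail = ec[1:]
    # Pair up alternating keys and values, as the convention specifies.
    return dict(zip(tail[0::2], tail[1::2]))

row = decode_error_code(error_code)
```

Because fields may or may not be present depending on the error, callers should look fields up with a default rather than assume every key exists.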
Re: [HACKERS] Convert pltcl from strings to objects
On 2/29/16 9:57 AM, Tom Lane wrote: plpgsql already has a similar mechanism (see PLpgSQL_function.use_count) which you could probably copy. But I'd advise that this is a separate matter to be addressed in a separate patch; it has little to do with the nominal subject matter of this patch. Ahh, thanks for pointing that out. Completely agree on it being a separate patch. Flight Aware is a big pltcl user as well as a contributor to the TCL community, so there are several more patches in the works. This would be one of them. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improve error handling in pltcl
On 2/28/16 5:50 PM, Jim Nasby wrote: Per discussion in [1], this patch improves error reporting in pltcl. I forgot to mention that this work is sponsored by Flight Aware (http://flightaware.com). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dealing with extension dependencies that aren't quite 'e'
On 2/29/16 7:27 PM, Abhijit Menon-Sen wrote: 1. This adds the 'x'/DEPENDENCY_AUTO_EXTENSION type. 2. This adds an 'ALTER FUNCTION … ADD DEPENDENT FUNCTION …' command. I split up the two because we may want the new dependency type without going to the trouble of adding a new command. Maybe extension authors should just insert an 'x' row into pg_depend directly? I don't see why this would be limited to just functions. I could certainly see an extension that creates ease-of-use views that depend on the extension, or tables that have triggers that call extension functions. Am I missing something? I was inclined to implement it using ALTER FUNCTION, but AlterFunction() is focused on altering the pg_proc entry for a function, so the new code didn't fit. Ultimately, ExecAlterExtensionContentsStmt() was the closest match, so that's where I did it. Maybe the better way to handle this would be through ALTER EXTENSION? Given the audience for this, I think it'd probably be OK to just provide a function that does this, instead of DDL. I'd be concerned about asking users to do raw inserts though. pg_depend isn't the easiest thing to grok so I suspect there'd be a lot of problems with that, resulting in more raw DML to try and fix things, resulting in pg_depend getting completely screwed up... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dealing with extension dependencies that aren't quite 'e'
On 2/29/16 10:33 PM, Abhijit Menon-Sen wrote: >Given the audience for this, I think it'd probably be OK to just >provide a function that does this, instead of DDL. That seems like a promising idea. Can you suggest some possible usage? pg_extension_dependency( regextension, any ) where "any" would be all the other reg* types. That should be a lot less work to code up than messing with the grammar. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Upper planner pathification
On 2/28/16 4:02 PM, Andres Freund wrote: So, where to go from here? I'm acutely aware that we're hard up against >the final 9.6 commitfest, and that we discourage major patches arriving >so late in a devel cycle. But I simply couldn't get this done any faster. >I don't really want to hold it over for the 9.7 devel cycle. It's been >enough trouble maintaining this patch in the face of conflicting commits >over the last year or so (it's probably still got bugs related to parallel >query...), and there definitely are conflicting patches in the upcoming >'fest. And the lack of this infrastructure is blocking progress on FDWs >and some other things. > >So I'd really like to get this into 9.6. I'm happy to put it into the >March commitfest if someone will volunteer to review it. Hard. This is likely to cause/trigger a number of bugs, and we don't have much time to let this mature. It's a change that we're unlikely to be able to back out if we discover that it wasn't the right thing to integrate shortly before the release. On the other hand, this is a major architectural step forward; one that unblocks a number of nice features. There's also an argument to be made that integrating this now is beneficial, because it'll cause less churn for patches being developed while 9.6 is stabilizing. Perhaps the best way to handle this would be to commit it to a branch sooner rather than later. If things work out, that branch can become the official beta. If not, it can become the basis for 9.7. If nothing else it means that Tom isn't the only one stuck trying to maintain this. Even if the branch is nothing but a means of generating a patch for 9.7, having it in place makes it a lot easier for other developers that need to code against it. While I'm promoting heresy... I imagine that this patch doesn't require a catversion bump. Perhaps it would be worth doing a short-cycle major release just to get this in. 
That might sound insane but since one of the biggest obstacles to upgrading remains dealing with the on-disk format, I don't think users would freak out about it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PROPOSAL: Fast temporary tables
On 3/1/16 10:05 AM, Atri Sharma wrote: Fair point, that means inventing a whole new OID generation structure.. Generation is just the tip of the iceberg. You still need the equivalent to foreign keys (ie: pg_depend). While you would never have a permanent object depend on a temp object, the reverse certainly needs to be supported. If I were attempting to solve this at a SQL level, I'd be thinking about using table inheritance such that the permanent objects are stored in a permanent parent. New backends would create UNLOGGED children off of that parent. There would be a pid column that was always NULL in the parent, but populated in children. That means children could use their own local form of an OID. When a backend terminates you'd just truncate all its tables. Actually translating that into relcache and everything else would be a serious amount of work. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Publish autovacuum informations
On 3/1/16 8:37 AM, Julien Rouhaud wrote: > >We understood (IMHO is an interesting idea) but as Michael said hooks is >for a general purpose. So can you demonstrate other use cases for this >new hooks? > I can think of several uses. First, since the hook will always be called, an extension will see all the activity a worker is doing, whereas exposing the private structure will always be some kind of sampling. I think that's pretty key. If you wanted to create an extension that logs vacuums (which would be great, since the current state of the art is logs + pgBadger), you'd want to gather your data about what the vacuum did as the vacuum was ending. I can certainly see cases where you don't care about that and just want what's in shared memory, but that would only be useful for monitoring what's happening real-time, not for knowing what the final results are. BTW, I think as much of this as possible should also work for regular vacuums. You can have other information that wouldn't be available just by exposing the private structure. For instance knowing a VACUUM isn't performed by the worker (either because another worker is already working on it or because it isn't needed anymore). IIRC there was a discussion about concurrency issues in this case. We can also know if the maintenance was cancelled because the lock wasn't obtained fast enough. Finally, as long as the hooks aren't used, they don't have any overhead. I agree that all this is for monitoring purposes. I'm not sure what fancy things Michael had in mind with exposing the private structure. Michael, was it something like having the ability to change some of these data through an extension? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improve error handling in pltcl
On 2/29/16 10:01 PM, Tom Lane wrote: Jim Nasby writes: On 2/28/16 5:50 PM, Jim Nasby wrote: Per discussion in [1], this patch improves error reporting in pltcl. I forgot to mention that this work is sponsored by Flight Aware (http://flightaware.com). Huh ... I use that site. There's PG and pltcl code behind it? Cool! Heh, I didn't realize you were a TCL fan. They've been heavy PG users from the start. Eventually PG had trouble keeping up with the in-flight tracking so they created Speed Tables [1]. And Karl (one of the founders) is a well known TCL contributor[2]. When it comes to sheer geek factor though, I think the multilateration[3] stuff they're doing with their ADS-B network is a really cool data application. It's basically a form of "reverse GPS" for tracking aircraft. [1] https://github.com/flightaware/speedtables [2] http://wiki.tcl.tk/83 [3] http://flightaware.com/adsb/mlat/ -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Publish autovacuum informations
On 3/1/16 3:02 PM, Julien Rouhaud wrote: You mean for database wide vacuum? I mean manual vacuum. Some hooks and stats would apply only to autovac obviously (and it'd be nice to get visibility into the scheduling decisions both daemons are making). But as much as possible things should be done in vacuum.c/lazyvacuum.c so it works for manual vacuums as well. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql - DECLARE - cannot to use %TYPE or %ROWTYPE for composite types
On 3/2/16 3:52 PM, Pavel Stehule wrote: Right, and it's arguably dubious that that doesn't already work. Unfortunately, these % things are just random plpgsql parser hacks, not real types. Maybe this should be done in the main PostgreSQL parser with parameter hooks, if we wanted this feature to be available outside plpgsql as well. I am not a fan of propagating this feature outside PL/pgSQL - it would be a possible new dependency between database objects, and the cost is higher than the benefits. I fail to see how it'd be a dependency. I'd expect it to look up the type when you run the command, just like plpgsql does. I think it'd be useful to have. That said, I think that should be a completely separate patch and discussion. Let's at least get it into plpgsql first. As for the array of element/element of array feature; I agree it would be nice, but we're pretty late in the game for that, and I don't see why that couldn't be added later. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freeze avoidance of very large table.
On 3/2/16 4:21 PM, Peter Geoghegan wrote: I think you should commit this. The chances of anyone other than you and Masahiko recalling that you developed this tool in 3 years is essentially nil. I think that the cost of committing a developer-level debugging tool like this is very low. Modules like pg_freespacemap currently already have no chance of being of use to ordinary users. All you need to do is restrict the functions to throw an error when called by non-superusers, out of caution. It's a problem that modules like pg_stat_statements and pg_freespacemap are currently lumped together in the documentation, but we all know that. +1. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Convert pltcl from strings to objects
On 3/2/16 12:32 PM, Tom Lane wrote: Jim Nasby writes: [ pltcl_objects_2.patch ] I've pushed this with some minor fixes, as well as the followup work mentioned in this thread. Awesome, thanks! I've asked Karl's opinion on increasing the minimum TCL version, but I suspect that won't be an issue. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Convert pltcl from strings to objects
On 3/1/16 5:06 PM, Tom Lane wrote: If we don't do that, I'm at least going to put in a similar #error for Tcl 8.0; but I really think we ought to just say 8.4 is the minimum. Just confirmed that should be completely reasonable. I'll take a look at it in a few days if you don't beat me to it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Publish autovacuum informations
On 3/2/16 10:48 AM, Julien Rouhaud wrote: Good point, I don't see a lot of information available through these hooks that native system statistics couldn't offer. To have the same amount of information, I think we'd need a pg_stat_autovacuum view that gives a real-time insight into the workers, and also to add some aggregated counters to PgStat_StatTabEntry. I wonder if adding counters to PgStat_StatTabEntry would be accepted though. I would also really like to see a means of logging (auto)vacuum activity in the database itself. We figured out how to do that with pg_stat_statements, which was a lot harder... it seems kinda silly not to offer that for vacuum. Hooks plus shared memory data should allow for that (the only tricky bit is the hook would need to start and then commit a transaction, but that doesn't seem onerous). I think the shared memory structures should be done as well. Having that real-time info is also valuable. I don't see too much point in adding stuff to the stats system for this. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freeze avoidance of very large table.
On 3/2/16 5:41 PM, Tom Lane wrote: Jim Nasby writes: On 3/2/16 4:21 PM, Peter Geoghegan wrote: I think you should commit this. The chances of anyone other than you and Masahiko recalling that you developed this tool in 3 years is essentially nil. I think that the cost of committing a developer-level debugging tool like this is very low. Modules like pg_freespacemap currently already have no chance of being of use to ordinary users. All you need to do is restrict the functions to throw an error when called by non-superusers, out of caution. It's a problem that modules like pg_stat_statements and pg_freespacemap are currently lumped together in the documentation, but we all know that. +1. Would it make any sense to stick it under src/test/modules/ instead of contrib/ ? That would help make it clear that it's a debugging tool and not something we expect end users to use. I haven't looked at it in detail; is there something inherently dangerous about it? When I'm forced to wear a DBA hat, I'd really love to be able to find out what VM status for a large table is. If it's in contrib they'll know the tool is there; if it's under src then there's about 0 chance of that. I'd think SU-only and any appropriate warnings would be enough heads-up for DBAs to be careful with it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb array-style subscription
On 3/2/16 6:24 PM, Tom Lane wrote: If the patch were proposing a similar amount of new infrastructure to support some datatype-extensible concept of subscripting, I'd be much happier about it. +1 I believe there's been some handwaving in the past about extensible approaches to subscripting, though I haven't got time to troll the archives for it right now. I'd be able to make use of that in my ndarray data type. It would also be nice to be able to add things like matrix types, sparse arrays, and variable size arrays (ie: list of lists), and subscripting is how you'd want to interface with all of those. Presumably the point type is handled specially today, so that should be taken care of too. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.
On 3/10/16 8:36 PM, Robert Haas wrote: 1. We make it true only for heavyweight lock waits, and false for other kinds of waits. That's pretty strange. 2. We make it true for all kinds of waits that we now know how to report. That still breaks compatibility. I would absolutely vote for 2 here. You could even argue that it's a bug fix, since those were waits we technically should have been indicating. The only way I can see #2 breaking anything is if you're using waiting=true to determine whether you look at pg_locks and your code will blow up if you get no rows back, but that seems like a pretty limited use case to me (Hello, LEFT JOIN). Dropping the column entirely though would break tons of things. Another random thought... changes like this would probably be easier to handle if we provided backwards compatibility extensions that created views that mimicked the catalog for a specific Postgres version. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] amcheck (B-Tree integrity checking tool)
On 3/11/16 3:31 PM, Peter Geoghegan wrote: Can we come up with names that more clearly identify the difference >between those two functions? I mean, _parent_ does not make it >particularly obvious that the second function acquires an exclusive lock >and performs more thorough checks. Dunno about that. Its defining characteristic is that it checks child pages against their parent IMV. Things are not often defined in terms of their locking requirements. First, thanks for your work on this. I've wanted it in the past. I agree the name isn't very clear. Perhaps _recurse? I also agree that the module name isn't very clear. If this is meant to be the start of a generic consistency checker, let's call it that. Otherwise, it should be marked as being specific to btrees, because presumably we might eventually want similar tools for GIN, etc. (FWIW I'd vote for a general consistency checker). I know the vacuum race condition would be very rare, but I don't think it can be ignored. The last thing you want out of a consistency checker is false negatives/positives. I do think it would be reasonable to just wholesale block against concurrent vacuums, but I don't think there's any reasonable way to do that. I would prefer the ability to do something other than raising an error when corruption is found, so that you could find all corruption in an index. Obviously it could log to a different level. Another option would be SRFs that return info about all the corruption found, but that's probably overkill. It'd be nice if you had the option to obey vacuum_cost_delay when running this, but that's clearly just a nice-to-have (or maybe just obey it all the time, since it defaults to 0). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! 
http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
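The "find all corruption instead of stopping at the first error" mode suggested above can be sketched abstractly. In this illustrative Python sketch, `check_index`, the page representation, and the single ordering invariant are all invented for the example; amcheck's real invariant checks live in C and cover much more.

```python
def check_index(pages, on_corruption="error"):
    """Scan hypothetical B-Tree pages (lists of keys) and either raise on
    the first violation, or collect every violation and return the list."""
    findings = []
    for blkno, page in enumerate(pages):
        # Example invariant: items on a page must be in non-decreasing
        # key order.
        for a, b in zip(page, page[1:]):
            if a > b:
                msg = f"block {blkno}: item out of order ({a} > {b})"
                if on_corruption == "error":
                    raise ValueError(msg)
                findings.append(msg)
    return findings

# Two corrupt pages; "collect" mode reports both instead of aborting.
findings = check_index([[1, 2, 3], [5, 4], [7, 9, 8]], on_corruption="collect")
```

The SRF idea mentioned above would essentially return `findings` as a result set instead of a Python list.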
Re: [HACKERS] Relation extension scalability
On 3/11/16 5:14 PM, Petr Jelinek wrote: I don't really understand this part about concurrent DDL. If there were concurrent DDL going on, presumably other backends would be blocked on the relation lock, not the relation extension lock - and it doesn't seem likely that you'd often have a huge pile-up of inserters waiting on concurrent DDL. But I guess it could happen. Yeah, I was thinking about the latter part and as I said it's a very rare case, but I did see something similar a couple of times in the wild. It's not an objection against committing this patch though, in fact I think it can be committed as is. FWIW, this is definitely a real possibility in any shop that has very high downtime costs and high transaction rates. I also think some kind of clamp is a good idea. It's not that uncommon to run max_connections significantly higher than 100, so the extension could be way larger than 16MB. In those cases this patch could actually make things far worse as everyone backs up waiting on the OS to extend many MB when all you actually needed were a couple dozen more pages. BTW, how was *20 arrived at? ISTM that if you have a lot of concurrent demand for extension that means you're running lots of small DML operations, not really big ones. I'd think that would make *1 more appropriate. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
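The arithmetic behind the 16MB concern above is simple to check. This sketch uses the *20 multiplier and the standard 8kB page size from the discussion; the clamp value itself is a made-up example, since the thread doesn't settle on one.

```python
PAGE_SIZE = 8192  # standard PostgreSQL block size in bytes

def extension_bytes(waiters, multiplier=20, clamp_pages=None):
    """How many bytes a single relation extension would add if it extends
    by (waiters * multiplier) pages, optionally clamped."""
    pages = waiters * multiplier
    if clamp_pages is not None:
        pages = min(pages, clamp_pages)
    return pages * PAGE_SIZE

# 100 waiting backends * 20 pages * 8kB ~= 16MB, the figure cited above.
unclamped = extension_bytes(100)
# With a hypothetical 512-page clamp, even 1000 waiters cap out at 4MB.
clamped = extension_bytes(1000, clamp_pages=512)
```

This makes the failure mode concrete: without a clamp, the extension size grows linearly with the number of waiters, so high max_connections settings directly translate into multi-megabyte synchronous extensions.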
Re: [HACKERS] amcheck (B-Tree integrity checking tool)
On 3/11/16 6:17 PM, Peter Geoghegan wrote: Not sure about the cost delay thing. Delays are disabled by default for manually issued VACUUM, so I have doubts that that's useful. Right, but you still have the option to enable them if you don't want to swamp your IO system. That's why CIC obeys it too. If I were running a consistency check on a production system I'd certainly want the option to throttle it. Without that option, I don't see running this on production systems as being an option. If that's not a goal then fine, but if it is a goal I think it needs to be there. Isn't it just a few extra lines of code to support it? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is there a way around function search_path killing SQL function inlining? - and backup / restore issue
On 3/10/16 3:29 PM, Regina Obe wrote: Take for example: I have tiger geocoder, which relies on fuzzystrmatch. I have no idea where someone installs fuzzystrmatch, so I can't schema-qualify those calls. I use that dependent function to build an index on tables. This is something I've thought about as well, and I think the real problem is that search_path just isn't the right way to handle this. I think there needs to be some way to definitively reference something that's part of an extension; a method that doesn't depend on whatever schema the extension happens to be installed in. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
On 3/10/16 7:48 AM, Robert Haas wrote: I think the problem is that you can't show the name of a non-global SQL object (such as a relation) unless the object is in the current database. Many of the views in the first group are database-local views, while things like pg_locks span all databases. We can show the datid/relid always, but if we have a relname column it will have to be NULL unless the datid is our database. I would prefer that if the object is in another database we at least display the OID. That way, if you're logging this info you can go back later and figure out what was going on. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improve error handling in pltcl
On 3/3/16 8:51 AM, Pavel Stehule wrote: Hi, I am testing the behavior, and some results look strange Thanks for the review! postgres=# \sf foo CREATE OR REPLACE FUNCTION public.foo() RETURNS void LANGUAGE plpgsql AS $function$ begin raise exception sqlstate 'ZZ666' using message='hello, world', detail='hello, my world', hint = 'dont afraid'; end $function$ postgres=# select tcl_eval('spi_exec "select foo();"'); ERROR: 38000: hello, world CONTEXT: hello, world <<<<==??? the message was in context. Probably it is out of scope of this patch, but it isn't consistent with other PLs while executing "spi_exec "select foo();"" ("eval" body line 1) invoked from within "eval $1" (procedure "__PLTcl_proc_16864" line 3) invoked from within "__PLTcl_proc_16864 {spi_exec "select foo();"}" in PL/Tcl function "tcl_eval" LOCATION: throw_tcl_error, pltcl.c:1217 Time: 1.178 ms Both problems actually exist in HEAD. The issue is this line in throw_tcl_error: econtext = utf_u2e(Tcl_GetVar(interp, "errorInfo", TCL_GLOBAL_ONLY)); Offhand I don't see any great way to improve that behavior, and in any case it seems out of scope for this patch. As a workaround I'm just forcing psql error VERBOSITY to terse for now. postgres=# select tcl_eval('join $::errorCode "\n"'); tcl_eval ═ POSTGRES ↵ message↵ hello, world ↵ detail ↵ hello, my world↵ hint ↵ dont afraid↵ domain ↵ plpgsql-9.6↵ context_domain ↵ postgres-9.6 ↵ context↵ PL/pgSQL function foo() line 3 at RAISE↵ SQL statement "select foo();" ↵ cursor_position↵ 0 ↵ filename ↵ pl_exec.c ↵ lineno ↵ 3165 ↵ funcname ↵ exec_stmt_raise (1 row) I miss a SQLSTATE. Great catch. Fixed. Why is a list object used instead of a dictionary? Tcl supports them: https://www.tcl.tk/man/tcl8.5/tutorial/Tcl23a.html Because errorCode unfortunately is an array and not a dict. It doesn't really seem worth messing with it in the eval since this is just a sanity check... New patch attached. It also removes some other unstable output from the regression test. 
-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/pltcl.sgml b/doc/src/sgml/pltcl.sgml index d2175d5..d5c576d 100644 --- a/doc/src/sgml/pltcl.sgml +++ b/doc/src/sgml/pltcl.sgml @@ -775,6 +775,127 @@ CREATE EVENT TRIGGER tcl_a_snitch ON ddl_command_start EXECUTE PROCEDURE tclsnit + +Error Handling in PL/Tcl + + + error handling + in PL/Tcl + + + + All Tcl errors that are allowed to propagate back to the top level of the + interpreter, that is, errors not caught within the stored procedure + using the Tcl catch command will raise a database + error. + + + Tcl code within or called from the stored procedure can choose to + raise a database error by invoking the elog + command provided by PL/Tcl or by generating an error using the Tcl + error command and not catching it with Tcl's + catch command. + + + Database errors that occur from the PL/Tcl stored procedure's + use of spi_exec, spi_prepare, + and spi_execp are also catchable by Tcl's + catch command. + + + Tcl provides an errorCode variable that can + represent additional information about the error in a form that + is easy for programs to interpret. The contents are in Tcl list + format and the first word identifies the subsystem or + library responsible for the error and beyond that the contents are left + to the individual code or library. For example if Tcl's + open command is asked to open a file that doesn't + exist, errorCode + might contain POSIX ENOENT {no such file or directory} + where the third element may vary by locale but the first and second + will not. + + + When spi_exec, spi_prepare + or spi_execp cause a database error to be raised, + that database error propagates back to Tcl as a Tcl err
Re: [HACKERS] Sanity checking for ./configure options?
On 2/26/16 9:29 PM, Peter Eisentraut wrote: Your code and comments suggest that you can specify the port to configure by setting PGPORT, but that is not the case. test == is not portable (bashism). Error messages should have consistent capitalization. Indentation in configure is two spaces. As the comment states, it doesn't catch things like --with-pgport=1a in configure, but the compile error you get with that isn't too hard to figure out, so I think it's OK. Passing a non-integer as argument will produce an error message like (depending on shell) ./configure: line 3107: test: 11a: integer expression expected but will not actually abort configure. It would work more robustly if you did something like this elif test "$default_port" -ge "1" -a "$default_port" -le "65535"; then : else AC_MSG_ERROR([port must be between 1 and 65535]) fi but that still leaks the shell's error message. There is also the risk of someone specifying a number with a leading zero, which C would interpret as octal but the shell would not. All issues should now be addressed. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/configure b/configure index b3f3abe..e7bddba 100755 --- a/configure +++ b/configure @@ -3099,6 +3099,16 @@ cat >>confdefs.h <<_ACEOF _ACEOF +# It's worth testing for this because it creates a very confusing error +if test "$default_port" = ""; then + as_fn_error $? "invalid empty string supplied with --with-pgport" "$LINENO" 5 +elif test ! `echo $default_port | sed -e 's/[0-9]//g'` = ''; then + as_fn_error $? "invalid port specification; must be a number" "$LINENO" 5 +elif test ! `echo $default_port | sed -e 's/^0//g'` = $default_port; then + as_fn_error $? "illegal leading 0 specified with --with-pgport" "$LINENO" 5 +elif test "$default_port" -lt "1" -o "$default_port" -gt "65535"; then + as_fn_error $? 
"port must be between 1 and 65535" "$LINENO" 5 +fi # # '-rpath'-like feature can be disabled diff --git a/configure.in b/configure.in index 0bd90d7..db6e2a0 100644 --- a/configure.in +++ b/configure.in @@ -164,6 +164,16 @@ but it's convenient if your clients have the right default compiled in. AC_DEFINE_UNQUOTED(DEF_PGPORT_STR, "${default_port}", [Define to the default TCP port number as a string constant.]) AC_SUBST(default_port) +# It's worth testing for this because it creates a very confusing error +if test "$default_port" = ""; then + AC_MSG_ERROR([invalid empty string supplied with --with-pgport]) +elif test ! `echo $default_port | sed -e 's/[[0-9]]//g'` = ''; then + AC_MSG_ERROR([invalid port specification; must be a number]) +elif test ! `echo $default_port | sed -e 's/^0//g'` = $default_port; then + AC_MSG_ERROR([illegal leading 0 specified with --with-pgport]) +elif test "$default_port" -lt "1" -o "$default_port" -gt "65535"; then + AC_MSG_ERROR([port must be between 1 and 65535]) +fi # # '-rpath'-like feature can be disabled -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
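For what it's worth, the checks in the patch above can be exercised outside of configure with a plain shell sketch; the check_port wrapper below is hypothetical, added only to make the logic easy to test in isolation:

```shell
#!/bin/sh
# Standalone sketch of the --with-pgport validation from the patch above.
# check_port prints "ok" for a valid port, or the relevant error message.
check_port() {
  default_port="$1"
  if test "$default_port" = ""; then
    echo "invalid empty string supplied with --with-pgport"
  elif test ! "`echo $default_port | sed -e 's/[0-9]//g'`" = ""; then
    echo "invalid port specification; must be a number"
  elif test ! "`echo $default_port | sed -e 's/^0//g'`" = "$default_port"; then
    echo "illegal leading 0 specified with --with-pgport"
  elif test "$default_port" -lt 1 -o "$default_port" -gt 65535; then
    echo "port must be between 1 and 65535"
  else
    echo "ok"
  fi
}

check_port 5432    # valid
check_port 1a      # rejected: not a number
check_port 0543    # rejected: leading zero (octal in C, decimal in shell)
check_port 70000   # rejected: out of range
```

Note that the ordering matters: the all-digits check must run before the numeric range comparison, otherwise `test -lt` would emit the shell's own "integer expression expected" noise that Peter complained about.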
Re: [HACKERS] psql metaqueries with \gexec
On 2/22/16 1:01 PM, Corey Huinker wrote: In the mean time, update patch attached. Really attached this time. I'm getting a warning from this patch: common.c:947:8: warning: variable 'success' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] if (pset.gexec_flag) ^~~ common.c:995:9: note: uninitialized use occurs here return success; ^~~ common.c:947:4: note: remove the 'if' if its condition is always false if (pset.gexec_flag) ^~~~ common.c:937:15: note: initialize the variable 'success' to silence this warning bool success; ^ = '\0' 1 warning generated. (note that I'm using CC='ccache clang -Qunused-arguments -fcolor-diagnostics') for (r = 0; r < nrows; r++) { for (c = 0; c < ncolumns; c++) { etc... Normally we don't use gratuitous {'s, and I don't think it's helping anything in this case. But I'll let whoever commits this decide. diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 5f27120..0f87f29 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -1280,8 +1280,8 @@ psql_completion(const char *text, int start, int end) "\\dm", "\\dn", "\\do", "\\dO", "\\dp", "\\drds", "\\ds", "\\dS", "\\dt", "\\dT", "\\dv", "\\du", "\\dx", "\\dy", "\\e", "\\echo", "\\ef", "\\encoding", "\\ev", - "\\f", "\\g", "\\gset", "\\h", "\\help", "\\H", "\\i", "\\ir", "\\l", - "\\lo_import", "\\lo_export", "\\lo_list", "\\lo_unlink", + "\\f", "\\g", "\\gexec", "\\gset", "\\h", "\\help", "\\H", "\\i", "\\ir", + "\\l", "\\lo_import", "\\lo_export", "\\lo_list", "\\lo_unlink", FWIW, it's generally better to leave that kind of re-wrapping to the next pgindent run. I added tests for ON_ERROR_STOP. New patch attached. The patch still needs to document this feature in the psql docs (and maybe the manpage? not sure how that's generated...) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! 
http://BlueTreble.com diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 9750a5b..5ca769f 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -849,6 +849,13 @@ exec_command(const char *cmd, status = PSQL_CMD_SEND; } + /* \gexec -- send query and treat every result cell as a query to be executed */ + else if (strcmp(cmd, "gexec") == 0) + { + pset.gexec_flag = true; + status = PSQL_CMD_SEND; + } + /* \gset [prefix] -- send query and store result into variables */ else if (strcmp(cmd, "gset") == 0) { diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index 2cb2e9b..54b7790 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -710,6 +710,46 @@ StoreQueryTuple(const PGresult *result) return success; } +/* + * ExecQueryTuples: assuming query result is OK, execute every query + * result as its own statement + * + * Returns true if successful, false otherwise. + */ +static bool +ExecQueryTuples(const PGresult *result) +{ + bool success = true; + int nrows = PQntuples(result); + int ncolumns = PQnfields(result); + int r, c; + + for (r = 0; r < nrows; r++) + { + for (c = 0; c < ncolumns; c++) + { + if (! PQgetisnull(result, r, c)) + { + if ( ! SendQuery(PQgetvalue(result, r, c)) ) + { + if (pset.on_error_stop) + { + return false; + } + else + { + success = false; +
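To make the control flow concrete, here is a rough Python model of the ExecQueryTuples loop in the patch above, including the up-front initialization of the success flag that silences the compiler warning; the names here are illustrative, not part of psql:

```python
def exec_query_tuples(result, send_query, on_error_stop=False):
    """Execute every non-NULL result cell as its own query, row-major.

    result: list of rows, each row a list of cells (None models SQL NULL).
    send_query: callable(query) -> bool, True on success.
    Returns False if any executed cell failed; with on_error_stop set,
    stops at the first failure (mirroring psql's ON_ERROR_STOP).
    """
    success = True  # initialized up front, unlike the warned-about C code
    for row in result:
        for cell in row:
            if cell is None:
                continue  # NULL cells are skipped, not executed
            if not send_query(cell):
                if on_error_stop:
                    return False
                success = False
    return success

# Example: one failing cell, NULLs skipped, execution continues by default.
executed = []
def fake_send(q):
    executed.append(q)
    return q != "bad"

ok = exec_query_tuples([["q1", None], ["bad", "q2"]], fake_send)
# ok is False; executed == ["q1", "bad", "q2"]
```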
Re: [HACKERS] psql metaqueries with \gexec
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: not tested Still needs documentation. The new status of this patch is: Waiting on Author -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] amcheck (B-Tree integrity checking tool)
On 3/11/16 6:45 PM, Peter Geoghegan wrote: I'll add that if people like the interface you propose. (Overloading the VACUUM cost delay functions to cause a delay for amcheck functions, too). I thought that had already been overloaded by CIC, but I'm not finding reference to it... ANALYZE does use it though, so the ship has already sorta sailed. I'm actually a bit surprised cost delay isn't used anywhere else. As more background operations are added I suspect users will want it at some point. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 3/13/16 12:48 AM, Pavel Stehule wrote: crosstabview is really a visualization tool. **But right now, there is no other tool available from the terminal.** So this can be a significant help to all people who want to use this functionality. Not just the terminal either. Offhand I'm not aware of *any* fairly simple tool that provides crosstab. There's a bunch of complicated/expensive BI tools that do, but unless you've gone through the trouble of getting one of those set up you're currently pretty stuck. Ultimately I'd really like some way to remove/reduce the restriction of result set definitions needing to be determined at plan time. That would open the door for server-side crosstab/pivot as well as a host of other things (such as dynamically turning a hstore/json/xml field into a recordset). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Relation extension scalability
On 3/11/16 9:57 PM, Petr Jelinek wrote: I also think some kind of clamp is a good idea. It's not that uncommon to run max_connections significantly higher than 100, so the extension could be way larger than 16MB. In those cases this patch could actually make things far worse as everyone backs up waiting on the OS to extend many MB when all you actually needed were a couple dozen more pages. I agree, we can have some max limit on the number of extra pages. What do others think? Well, that's what I meant with clamping originally. I don't know what a good value is, though. Well, 16MB is 2K pages, which is what you'd get if 100 connections were all blocked and we're doing 20 pages per waiter. That seems like a really extreme scenario, so maybe 4MB is a good compromise. That's unlikely to be hit in most cases, and unlikely to put a ton of stress on IO, even with magnetic media (assuming the whole 4MB is queued to write in one shot...). 4MB would still reduce the number of locks by 500x. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] checkpointer continuous flushing - V18
On 3/13/16 6:30 PM, Peter Geoghegan wrote: On Sat, Mar 12, 2016 at 5:21 PM, Jeff Janes wrote: Would the wiki be a good place for such tips? Not as formal as the documentation, and more centralized (and editable) than a collection of blog posts. That general direction makes sense, but I'm not sure if the Wiki is something that this will work for. I fear that it could become something like the TODO list page: a page that contains theoretically accurate information, but isn't very helpful. The TODO list needs to be heavily pruned, but that seems like something that will never happen. A centralized location for performance tips will probably only work well if there are still high standards that are actively enforced. There still needs to be tight editorial control. I think there's ways to significantly restrict who can edit a page, so this could probably still be done via the wiki. IMO we should also be encouraging users to test various tips and provide feedback, so maybe a wiki page with a big fat request at the top asking users to submit any feedback about the page to -performance. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Publish autovacuum informations
On 3/3/16 3:54 AM, Kyotaro HORIGUCHI wrote: I wonder why there haven't been discussions so far on what kind of information we want from this feature. For example, I'd be happy to see the time of the last autovacuum trial and the cause if it has been skipped, for every table. Such information would (maybe) naturally be shown in pg_stat_*_tables. = =# select relid, last_completed_autovacuum, last_completed_autovac_status, last_autovacuum_trial, last_autovac_trial_status from pg_stat_user_tables; -[ RECORD 1 ]-+-- relid | 16390 last_completed_autovacuum | 2016-03-01 01:25:00.349074+09 last_completed_autovac_status | Completed in 4 seconds. Scanned 434 pages, skipped 23 pages last_autovacuum_trial | 2016-03-03 17:33:04.004322+09 last_autovac_trial_status | Canceled by PID 2355. Processed 144/553 pages. -[ RECORD 2 ]--+-- ... last_autovacuum_trial | 2016-03-03 07:25:00.349074+09 last_autovac_trial_status | Completed in 4 seconds. Scanned 434 pages, skipped 23 pages -[ RECORD 3 ]--+-- ... last_autovacuum_trial | 2016-03-03 17:59:12.324454+09 last_autovac_trial_status | Processing by PID 42334, 564 / 32526 pages done. -[ RECORD 4 ]--+-- ... last_autovacuum_trial | 2016-03-03 17:59:12.324454+09 last_autovac_trial_status | Skipped by dead-tuple threshold. = I kinda like where you're going here, but I certainly don't think the stats system is the way to do it. Stats bloat is already a problem on bigger systems. More important, I don't think having just the last result is very useful. If you've got a vacuum problem, you want to see history, especially history of the vacuum runs themselves. The good news is that vacuum is a very low-frequency operation, so it has none of the concerns that the generic stats system does. I think it would be reasonable to provide event triggers that fire on every launcher loop, after a worker has built its "TODO list", and after every (auto)vacuum. 
-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql - DECLARE - cannot to use %TYPE or %ROWTYPE for composite types
On 3/3/16 4:51 AM, Pavel Stehule wrote: CREATE TABLE a(a int); CREATE TABLE b(a a.a%TYPE) And people would expect a living relation between table a and table b: when I ALTER a.a, then b.a should be changed too. What if I drop a.a, or drop a? So this is the reason why I don't want this feature on the SQL side. I don't buy that. plpgsql doesn't work that way, so why would this? *especially* with the %TYPE decorator. Now, if the syntax was CREATE TABLE b(a a.a) then I would expect b.a to be a foreign key reference to a. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Minor typos in optimizer/README
Studying the pathification commit, I noticed a few typos in $SUBJECT. Patch attached. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README index 7ecf8c8..9529346 100644 --- a/src/backend/optimizer/README +++ b/src/backend/optimizer/README @@ -900,8 +900,8 @@ above, plus a relids set, which allows there to be more than one upperrel of the same kind. We use NULL for the relids if there's no need for more than one upperrel of the same kind. Currently, in fact, the relids set is vestigial because it's always NULL, but that's expected to change in -future. For example, in planning set operations, we might need the relids -to denote which subset of the leaf SELECTs has been combined in a +the future. For example, in planning set operations, we might need the +relids to denote which subset of the leaf SELECTs has been combined in a particular group of Paths that are competing with each other. The result of subquery_planner() is always returned as a set of Paths @@ -971,5 +971,5 @@ One of the keys to making parallel query effective is to run as much of the query in parallel as possible. Therefore, we expect it to generally be desirable to postpone the Gather stage until as near to the top of the plan as possible. Expanding the range of cases in which more work can be -pushed below the Gather (and costly them accurately) is likely to keep us +pushed below the Gather (and costing them accurately) is likely to keep us busy for a long time to come. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql - DECLARE - cannot to use %TYPE or %ROWTYPE for composite types
On 3/15/16 7:17 PM, Tom Lane wrote: In short, I think we should reject this implementation and instead try to implement the type operators we want in the core grammar's Typename production, from which plpgsql will pick it up automatically. +1. Something else that's been discussed is allowing [] referencing to be more modular. Offhand I don't see how that would impact this new type referencing stuff, but maybe someone else sees an issue. BTW, it might also be useful to allow {} to work as a reference method. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] fd.c doesn't remove files on a crash-restart
On 3/16/16 2:16 PM, Tom Lane wrote: Robert Haas writes: On Wed, Mar 16, 2016 at 2:05 PM, Tom Lane wrote: Possible compromise: remove files only in non-Assert builds? That sorta seems like tying two things together that aren't obviously related. I think building with --enable-cassert is supposed to enable debugging cross-checks, not change behavior. Well, it's supposed to enable debugging, and I would classify not destroying evidence as being debugging support. Another option: keep stuff around for a single restart. I don't think this would be that hard by having a file that's a list of files to remove on the next restart. On restart, remove everything in that file (and the file itself). If there's anything left, create a new file that's the list of what's left. The other nice thing about having this list is it would tell the DBA exactly what files were left after the crash vs what's new. Actually, I guess another option would be to have a separate directory to move all these files into. On restart, nuke the directory if it exists, then move stuff in there if necessary. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
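A minimal sketch of the "remove on the next restart" bookkeeping suggested above; the file name and layout here are hypothetical, purely to illustrate the idea:

```python
import os

PENDING_LIST = "pending_deletes.list"  # hypothetical bookkeeping file

def record_pending(paths):
    """Remember temp files that should be removed on the *next* restart."""
    with open(PENDING_LIST, "a") as f:
        for p in paths:
            f.write(p + "\n")

def process_pending():
    """At restart: delete everything previously listed, then forget the list.

    Leftovers from the *current* crash would be recorded afterwards via
    record_pending(), so each crash's files survive exactly one restart --
    long enough for a DBA to inspect them. Returns the paths it removed.
    """
    if not os.path.exists(PENDING_LIST):
        return []
    with open(PENDING_LIST) as f:
        listed = [line.strip() for line in f if line.strip()]
    for p in listed:
        if os.path.exists(p):
            os.remove(p)
    os.remove(PENDING_LIST)
    return listed
```

The list file doubles as the record Jim mentions: comparing it against the directory contents after a crash shows exactly which files are old survivors versus newly orphaned.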
Re: [HACKERS] typmod is always -1
On 3/17/16 7:40 PM, Tom Lane wrote: Chapman Flack writes: It seems that a typmod can only be used to restrict the set of possible values of the unmodified type (as clearly seen in the language "length conversion cast"), since certainly a typmod allowing { string | length < N } is doing nothing but enforcing a subset of { string }. Each element of the subset is still a valid element of the whole set (naturally, boring) *and has to be represented the same way* (interesting): the representation mustn't do clever things that you would need to know the typmod in order to interpret, because most uses of a value are without access to the typmod. You do need to be able to interpret values of the type without having separate access to the typmod, but I don't think it follows that it's as restrictive as you say. One easy way around that is to store the typmod in the value. Practical uses might include compressing the data in different ways depending on typmod. I'm drawing a blank on other compelling examples, though I'm sure there are some. Have you looked at PostGIS? I'm pretty sure some of their types make use of typmod in nontrivial ways. If you want a non-trivial use of typmod, take a look at the (work in progress) variant type I created[1]. It allows you to pass names of "registered variants" in via typmod. The idea behind that is to restrict what types you can actually store in a particular variant field (though you can also disallow a registered variant from being used in a table definition). I did run into some cases where Postgres ignored typmod, so I special-case the default typmod (-1) to a registered variant that's disabled. [1] https://github.com/BlueTreble/variant/blob/master/doc/variant.md#variant-modifier -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! 
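As a toy illustration of Tom's "store the typmod in the value" point above (this is not how any built-in type works; it just shows the stored form can be made self-describing):

```python
import struct

def pack_with_typmod(typmod: int, payload: bytes) -> bytes:
    """Prefix the stored representation with its typmod (4-byte big-endian
    int), so a reader needs no separate access to the column's typmod to
    interpret the bytes -- e.g. to pick a decompression scheme."""
    return struct.pack("!i", typmod) + payload

def unpack_with_typmod(stored: bytes):
    """Split a stored value back into (typmod, payload)."""
    (typmod,) = struct.unpack("!i", stored[:4])
    return typmod, stored[4:]

stored = pack_with_typmod(42, b"hello")
typmod, payload = unpack_with_typmod(stored)
# typmod == 42, payload == b"hello"; the default typmod -1 round-trips too
```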
http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] oldest xmin is far in the past
On 3/19/16 11:32 AM, Tomas Vondra wrote: Hi, On 03/19/2016 06:29 AM, John Snow wrote: There is no long transaction, nor any prepared transaction. Can you show us pg_stat_activity? Particularly the xmin values for backends attached to the two databases mentioned in the log (1 and 12451). FWIW the second OID is a bit weird - the first OID assigned to normal objects is defined as 16384, so I wonder how you managed to create a database with such an OID? On my 9.4, template1 has oid 1. BTW, John mentioned Slony; if this is on one of the replicas then it's certainly understandable that all the tables have ages that are almost identical. That happens because the initial COPY of each table takes place in a single transaction, and the only other activity that's generating XIDs is the normal replay process. Depending on your settings, I'd expect that you're only generating a couple XIDs/minute, so even if it took 10 days to do the initial copy you'd still only have a span of ~30k transactions. That means autovac will suddenly want to freeze the whole database in one shot. It's a good idea to run a manual vacuum freeze after the initial copy is done to prevent this. To answer one of your other questions, it looks like all the ages are ~500M XIDs, which means you've got another ~1B to go before this becomes a serious concern. * freeze_min_age * vacuum_freeze_min_age * autovacuum_freeze_max_age (we already know this one) What values are set for those? Better yet, can you just run this query? SELECT name, setting, unit, source FROM pg_settings WHERE name ~ 'freeze|vacuum' OR source !~ 'default|override' ; -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Upper planner pathification
On 3/17/16 9:01 AM, Robert Haas wrote: I think that there are an awful lot of cases where extension authors haven't been able to quite do what they want to do without core changes because they couldn't get control in quite the right place; or they could do it but they had to cut-and-paste a lot of code. FWIW, I've certainly run into this at least once, maybe twice. The case I can think of offhand is doing function resolution with variant. I don't remember the details anymore, but my recollection is that to get what I needed I would have needed to copy huge swaths of the rewrite code. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improve error handling in pltcl
On 3/17/16 5:46 PM, Tom Lane wrote: Pavel Stehule writes: I'll mark this patch as ready for commiters. I started to look at this patch. It seems to me that the format of the errorCode output is not terribly well designed. ... Maybe there's another way. I've not used Tcl in anger since around the turn of the century, so it's entirely likely that I'm missing something. But the proposed doc addition isn't showing me any really easy way to work with this data format, and I think that that's either a design fail or a docs fail, not something I should just accept as the best we can do. I asked Karl about this (since he's active in the TCL community and works with TCL every day), and his response was essentially: Tcl is all about flat lists of key value pairs. array set myArray $list sucks a flat list of key-value pairs into an array and vice versa set list [array get myArray] creates one. This is normal Tcl stuff. Getting the errorCode into an array is as easy as array set errorData [lrange $errorCode 1 end] Then you can do $errorData(detail), $errorData(message), etc. In fact keyed lists in TclX which are the inspiration for the approach to lists of alternating key-value pairs did it the way he suggested and that’s fallen by the wayside in favor of flat lists. There has been a formal proposal to add a -stride to lsearch to make lsearch efficient at searching the same flat lists of key-value pairs and I expect to see it in Tcl 8.7 or sooner. The doc example also makes me think that more effort should get expended on converting internalquery/internalpos to just be query/cursorpos. It seems unlikely to me that a Tcl function could ever see a case where the latter fields are useful directly. Is there docs or an example on how to handle that? I looked at the plpython stuff and I'm still really unclear on what exactly an internalquery is as opposed to regular context info? PLy_spi_exception_set simply exposes the raw internalquery and internalpos. 
Also, I'm curious as to why you think "domain" or "context_domain" is of any value to expose here. Tcl code is not going to have any access to the NLS infrastructure (if that's even been compiled) to do anything with those values. I'm not really sure what it's hurting to expose that, but I'll remove it. And I believe it may be a security violation for this code to expose "detail_log". The entire point of that field is that it goes to the postmaster log and NOT anywhere where unprivileged clients can see it. Removed. Nitpickier stuff: * Docs example could use work: it should show how to do something useful *in Tcl code*, like maybe checking whether an error had a particular SQLSTATE. The example with dumping the whole list at the psql command line is basically useless, not to mention that it relies on a nonexistent "tcl_eval" function. Will work on an improved example. (And I don't care for the regression test case creating such a function ... isn't that a fine SQL-injection hole?) It would be if it were taking external input, but it's not, and it saves creating 2 separate functions (which you want to do to make sure the global errorCode is being set, and not a local copy). * I believe pltcl_construct_errorCode needs to do E2U encoding conversion for anything that could contain non-ASCII data, which is most of the non-fixed strings. Done. * Useless-looking hunk at pltcl.c:1610 Removed. * I think the unstable data you're griping about is the Tcl function's OID, not the PID. (I wonder whether we should make an effort to hide that in errorInfo. But if so it's a matter for a separate patch.) It's possible that someone would want to know the name of the constructed TCL function (and yeah, I think it's the OID, not the PID). I'll set this patch back to Waiting On Author. I believe it's well within reach of getting committed in this fest, but it needs more work. Interim patch attached (need to work on the docs).
-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/pltcl.sgml b/doc/src/sgml/pltcl.sgml index d2175d5..d5c576d 100644 --- a/doc/src/sgml/pltcl.sgml +++ b/doc/src/sgml/pltcl.sgml @@ -775,6 +775,127 @@ CREATE EVENT TRIGGER tcl_a_snitch ON ddl_command_start EXECUTE PROCEDURE tclsnit + +Error Handling in PL/Tcl + + + error handling + in PL/Tcl + + + + All Tcl errors that are allowed to propagate back to the top level of the + interpreter, that is, errors not caught within the stored procedure + using the Tcl catch command will raise a database + error. + + + Tcl code within or called from the stored procedure can choose to + raise a
Re: [HACKERS] Relax requirement for INTO with SELECT in pl/pgsql
On 3/21/16 5:03 PM, Merlin Moncure wrote: in Oracle, you'd simply do: LogIt('I did something'); It would be *great* if we could support that in plpgsql. I'm not sure what Oracle does for SELECT statements without INTO/BULK UPDATE. I'm not really inclined to care -- I'm really curious to see an argument where usage of PERFORM actually helps in some meaningful way. Notably, SELECT without INTO is accepted syntax, but fails only after running the query. I think that's pretty much stupid, but it's fair to say I'm not inventing syntax, only disabling the error. I don't think it buys much at all. While we're on the subject, it'd be great if variable := SELECT ... worked too. I'm not sure that what other databases do is relevant. They use other procedure languages than pl/pgsql (the biggest players are pl/psm and t-sql) which have a different set of rules in terms of passing variables in and out of queries. +1 -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index
On 3/21/16 11:57 AM, Teodor Sigaev wrote: A and B are points of intersection of lines. So, box PBCA is a bounding box for points contained in the 3rd quadrant (see labeling above). For example, the point labeled X is not a descendant of the child node with centroid C, because it must be in the branch of the 1st quadrant of the parent node. So, each node (except the root) will have a limitation in its quadrant. To transfer that limitation, the traversalValue is used. Isn't this basically the same thing that the cube contrib module does? (Which has the added benefit of kNN-capable operators.) If that's true then ISTM it'd be better to work on pulling cube's features into box? If it's not true, I'm still wondering if there's enough commonality here that we should pull cube into core... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index
On 3/21/16 7:41 PM, Stas Kelvich wrote: While people tend to use machine learning and regression models more and more, it is interesting to have some general n-dim indexing with kNN, but I think it is a different problem and should be solved in a different way. I think one of the issues here is it's not very clear to someone without a good amount of ML knowledge how these things relate. I hear 'box' and 'cube' and think it's just a 2D vs 3D issue, and intarray isn't even on the radar. Maybe what's needed are actual vector and matrix types? In any case, if you've got a good reason why box and cube should stay separate then further discussion should happen in another thread. BTW, if you haven't seen it, take a look at http://madlib.apache.org/ -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On 3/22/16 9:36 AM, Amit Kapila wrote: > > Note that we are doing it only when a transaction has less than or equal to > > 64 subtransactions. > > So? > They should fall on one page, unless they are heavily interleaved as pointed out by you. I think whether subtransactions are present or not, this patch won't help for bigger transactions. FWIW, the use case that comes to mind here is the "upsert" example in the docs. AFAIK that's going to create a subtransaction every time it's called, regardless of whether it performs actual DML. I've used that in places that would probably have moderately high concurrency, and I suspect I'm not alone in that. That said, it wouldn't surprise me if plpgsql overhead swamps any effect this patch has, so perhaps it's a moot point. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improve error handling in pltcl
On 3/20/16 8:42 PM, Jim Nasby wrote: The doc example also makes me think that more effort should get expended on converting internalquery/internalpos to just be query/cursorpos. It seems unlikely to me that a Tcl function could ever see a case where the latter fields are useful directly. Is there docs or an example on how to handle that? I looked at the plpython stuff and I'm still really unclear on what exactly an internalquery is as opposed to regular context info? PLy_spi_exception_set simply exposes the raw internalquery and internalpos. Anyone any pointers on this? I'm not sure I can finish the docs without knowing what we want to do here. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Show dropped users' backends in pg_stat_activity
On 3/22/16 10:35 PM, Kyotaro HORIGUCHI wrote: Even if we maintained some interlock for a backend's login role identity, I hardly think it would be practical to e.g. lock during transient SET ROLE or security-definer-function-call operations. So it's not like we can let the permissions system assume that a role OID being inquired about always matches a live entry in pg_authid. Even if blocking DROPs is not perfect for all cases, unconditionally allowing a role to be dropped still doesn't seem like proper behavior, especially for replication roles. And session logins seem to me to have enough reason to be treated differently from masquerading as another role using SET ROLE or a security-definer function. There's probably a way this could be handled, since DROP ROLE is presumably a very uncommon operation. Perhaps something as simple as keeping a single OID in shared memory for the role about to be dropped. That would serialize role drops, but I doubt that matters. The attached patch blocks DROP ROLE for roles that own active sessions, and on the other hand prevents a session from being activated if the login role is concurrently dropped. I think this is fine for now, but... what happens if you drop a role that's in use on a streaming replica? Does replay stall or do we just ignore it? There should probably be some doc changes to go with the patch too, no? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Relax requirement for INTO with SELECT in pl/pgsql
On 3/22/16 8:37 AM, Merlin Moncure wrote: I am afraid of useless and forgotten calls of functions. But the risk is the same as with PERFORM - so this is valid from one half. The PERFORM statement holds special semantics, and it is interesting. I see your point here, but the cost of doing that far outweighs the risks. And I don't think the arbitrary standard of forcing the user to identify whether the query should return data is a good way of identifying dead code. Not to mention that there are tons of other ways to introduce unintended inefficiencies. Off the top of my head, declaring variables that are never referenced and have no assignment is a big one. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Upper planner pathification
On 3/22/16 7:28 AM, Michael Paquier wrote: On Mon, Mar 21, 2016 at 7:55 AM, Jim Nasby wrote: On 3/17/16 9:01 AM, Robert Haas wrote: I think that there are an awful lot of cases where extension authors haven't been able to quite do what they want to do without core changes because they couldn't get control in quite the right place; or they could do it but they had to cut-and-paste a lot of code. FWIW, I've certainly run into this at least once, maybe twice. The case I can think of offhand is doing function resolution with variant. I don't remember the details anymore, but my recollection is that to get what I needed I would have needed to copy huge swaths of the rewrite code. Amen, I have been doing that a couple of days ago with some elog stuff. Any ideas on ways to address this? Adding more hooks in random places every time we stumble across something doesn't seem like a good method. One thing I've wondered about is making it easier to find specific constructs in a parsed query so that you can make specific modifications. I recall looking at that once and finding a roadblock (maybe a bunch of functions were static?) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
On 3/24/16 10:21 AM, Alexander Korotkov wrote: 1) It's a great feature many users dream about. Doesn't matter if it starts eating their data... 2) Patch is not very big. 3) Patch doesn't introduce significant infrastructural changes. It just changes some well-isolated places. It doesn't really matter how big the patch is, it's a question of "What did the patch fail to consider?". With something as complicated as the btree code, there's ample opportunity for missing things. (And FWIW, I'd argue that a 51kB patch is certainly not small, and a patch that is doing things in critical sections isn't terribly isolated.) I do think this will be a great addition, but it's just too late to be adding this to 9.6. (BTW, I'm getting bounces from a.lebe...@postgrespro.ru, as well as postmaster@. I emailed i...@postgrespro.ru about this but never heard back.) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Alter or rename enum value
On 3/24/16 2:00 PM, Matthias Kurz wrote: ALTER TYPE bogon DROP VALUE 'cat'; -- not implemented in 9.5 but should work in future ROLLBACK; Dropping a value is significantly harder because that value could be in use. I'm certain there's a really good reason adding new values isn't allowed inside of a transaction. It's probably documented in the code. To answer your question about "what goes into a release", there's really no process for that. What goes into a release is what someone was interested enough in to get community approval for the idea, write the patch, and shepherd the patch through the review process. So if you want these features added, you need to either: do it yourself, convince someone else to do it for free, or pay someone to do it for you. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] avg,first,last,median in one query
On 3/24/16 9:00 AM, Konstantin Knizhnik wrote: But unfortunately it is not possible to calculate the median in such a way, because percentile_disc is not compatible with OVER: I don't know if you could use cume_dist()[1] to do this, but even if you can't it probably wouldn't be hard to modify it to do what you need. [1] http://www.postgresql.org/docs/9.5/static/functions-window.html -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
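The cume_dist() suggestion works because the discrete median (what percentile_disc(0.5) returns) is simply the smallest value whose cumulative distribution reaches 0.5. A standalone Python sketch of that idea, outside the database entirely (the function name is mine):

```python
# Discrete median via the cumulative-distribution definition:
# cume_dist of a value = (rows with value <= it) / total rows,
# and the median is the first value where that reaches 0.5.
def median_via_cume_dist(values):
    ordered = sorted(values)
    n = len(ordered)
    for i, v in enumerate(ordered, start=1):
        if i / n >= 0.5:        # cume_dist for the i-th ordered value
            return v

print(median_via_cume_dist([3, 1, 4, 1, 5]))  # 3
```

Note this matches percentile_disc semantics: for an even number of rows it returns the lower of the two middle values rather than interpolating.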
Re: [HACKERS] Alter or rename enum value
On 3/24/16 10:27 PM, Tom Lane wrote: It's conceivable that we could do something like adding an "isdead" column to pg_enum and making enum_in reject new values that're marked isdead. But I can't see that we'd ever be able to support true removal of an enum value at reasonable cost. And I'm not really sure where the use-case argument is for working hard on it. I wonder if we could handle this by allowing foreign keys on enum columns back to pg_enum. Presumably that means we'd have to treat pg_enum as a regular table and not a catalog table. Due to locking concerns I don't think we'd want to put the FKs in place by default either. I've certainly heard people avoiding ENUMs because of their limitations, so it'd be nice if there was a way to lift them. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Alter or rename enum value
On 3/25/16 2:22 PM, Gavin Flower wrote: I've certainly heard people avoiding ENUMs because of their limitations, so it'd be nice if there was a way to lift them. Well, I use Enums extensively in Java. However, I totally avoid using ENUMs in pg, due to their inflexibility! Possibly related to this, for a long time I've also wanted a way to better integrate FKs, probably via some kind of a pseudotype or maybe a special operator. The idea being that instead of manually specifying joins, you could treat a FK field in a table as a pointer and do things like: CREATE TABLE invoice(customer int NOT NULL REFERENCES(customer)); SELECT invoice.*, customer->first_name, customer->last_name, ... FROM invoice; If we had that capability, there would be less need for ENUMs. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improve error handling in pltcl
On 3/25/16 3:11 PM, Tom Lane wrote: Jim Nasby writes: the data, we're making it unnecessarily hard. All we need is one more field in there, and you can simplify that to Ahh, nice. I think actually it's a simple point: there won't ever be a case where cursorpos is set here, because that's only used for top-level SQL syntax errors. Anything we are catching would be an internal-query error, so we might as well not confuse PL/Tcl users with the distinction but just report internalquery/internalpos as the statement and cursor position. PLy_spi_exception_set simply exposes the raw internalquery and internalpos. Right, because that's all that could be useful. Ahh, ok, finally I get it. It would be nice if the comments for ErrorData were clearer... it strikes me that this is not coding style we want to encourage. We should borrow the infrastructure plpgsql has for converting SQLSTATEs into condition names, so that that can be more like Yeah, Karl and I were just talking about that as we were finishing up the docs changes (ironically, as you were committing this...). I ended up with a more realistic example that also demonstrates that you can refer to errorCode in a separate function if desired. That patch is attached for posterity. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/pltcl.sgml b/doc/src/sgml/pltcl.sgml index d2175d5..4cf32df 100644 --- a/doc/src/sgml/pltcl.sgml +++ b/doc/src/sgml/pltcl.sgml @@ -775,6 +775,169 @@ CREATE EVENT TRIGGER tcl_a_snitch ON ddl_command_start EXECUTE PROCEDURE tclsnit + +Error Handling in PL/Tcl + + + error handling + in PL/Tcl + + + + All Tcl errors that occur within a stored procedure and are not caught + using Tcl's catch or try + functions will raise a database error. 
+ + + Tcl code can raise a database error by invoking the + elog command provided by PL/Tcl or by generating an + error using the Tcl error command and not catching it + with Tcl's catch command. + + + Database errors that occur from the PL/Tcl stored procedure's + use of spi_exec, spi_prepare, + and spi_execp are also catchable by Tcl's + catch command. + + + Tcl provides an errorCode variable that can represent + additional information about the error in a form that is easy for programs + to interpret. The contents are in Tcl list format. The first word + identifies the subsystem or library responsible for the error. The + remaining contents are up to the individual code or library. For example, + if Tcl's open command is asked to open a file that + doesn't exist, errorCode might contain POSIX + ENOENT {no such file or directory} where the third element may + vary by locale but the first and second will not. + + + When spi_exec, spi_prepare + or spi_execp cause a database error to be raised, + that database error propagates back to Tcl as a Tcl error. In this case + errorCode is set to a list where the first element is + POSTGRES followed by details of the Postgres error. + Since fields in the structure may or may not be present depending on the + nature of the error, how the function was invoked, etc., PL/Tcl has adopted + the convention that subsequent elements of the + errorCode list are key-value pairs where the first + value is the name of the field and the second is its value. + + + Fields that may be present include SQLSTATE, + message, + detail, + hint, + context, + schema, + table, + column, + datatype, + constraint, + cursor_position, + internalquery, + internal_position, + filename, + lineno and + funcname. + + + You might find it useful to load the results into an array. 
Code + for doing that might look like + +if {[lindex $errorCode 0] == "POSTGRES"} { +array set errorRow [lrange $errorCode 1 end] +} + + + + This example shows how to trap a specific SQL error. + +CREATE TABLE account(user_name varchar(1) NOT NULL PRIMARY KEY); +CREATE OR REPLACE FUNCTION public.create_user(user_name text) + RETURNS void LANGUAGE pltcl AS $function$ +set prep [ spi_prepare "INSERT INTO account(user_name) VALUES(\$1)" [ list text ] ] +if [ catch { +spi_execp $prep [ list $1 ] +} msg ] { +if {[lindex $::errorCode 0] == "POSTGRES"} { +array set errorData [lrange $::errorCode 1 end] +if { $errorData(SQLSTATE) == "23505" && $errorData(constraint) == "account_pkey" } { +return -code error "user '$1' already exists" +} +} +throw $::errorCode $msg +} +$function$; +
[HACKERS] SQL access to access method details
While working on a tool to capture catalog/stats info and looking at cross version compatibility, I noticed that the pg_am changes removed SQL access to a bunch of AM info. [1] indicates that's part of the purpose of the patch; are we sure no tools are using this info? Unlike previous catalog compatibility breaks, this one completely removes that information, so if someone was using it they're now completely hosed. [1] http://www.postgresql.org/message-id/55fec1ab.8010...@2ndquadrant.com -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] backup tools ought to ensure created backups are durable
On 3/28/16 11:03 AM, Magnus Hagander wrote: That should work yeah. And given that we already use that check in other places, it seems it should be perfectly safe. And as long as we only do a WARNING and not abort if the fsync fails, we should be OK if people intentionally store their backups on an fs that doesn't speak fsync (if that exists), in which case I don't really think we even need a switch to turn it off. I'd even go so far as spitting out a warning any time we can't fsync (maybe that's what you're suggesting?) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
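The behavior being discussed here, attempt the fsync but warn rather than abort when it fails, is easy to sketch outside of the backup tools themselves. A minimal Python illustration (the helper name is mine, not from any patch):

```python
# Sketch of "fsync if possible, WARNING otherwise": try to flush a file
# to stable storage, and degrade to a warning when the filesystem
# doesn't support fsync instead of aborting the backup.
import os
import sys
import tempfile

def fsync_or_warn(path):
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
        return True
    except OSError as e:
        # Warn and continue rather than failing the whole backup.
        print(f"WARNING: could not fsync {path}: {e}", file=sys.stderr)
        return False

# Demonstrate on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"backup data")
    name = f.name
print(fsync_or_warn(name))
```

With this shape, a filesystem that cannot fsync produces a warning every time, which is essentially the "spit out a warning any time we can't fsync" behavior suggested above.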
Re: [HACKERS] Alter or rename enum value
On 3/28/16 4:42 AM, Emre Hasegeli wrote: Now, we are using a function to replace an enum type on all tables with another one, but we are not at all happy with this solution. It requires all objects which were using the enum to be dropped and recreated, and it rewrites the tables, so it greatly increases the migration time and effort. FWIW, there are ways to avoid some of that pain by having a trigger maintain the new column on INSERT/UPDATE and then slowly touching all the old rows where the new column is NULL. Obviously would be much better if we could just do this with ENUMs... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Weird behavior during CREATE EXTENSION
This behavior had me quite baffled...

~@decina.local/29760# create extension "trunklet-format" CASCADE;
NOTICE: installing required extension "trunklet"
NOTICE: installing required extension "variant"
CREATE EXTENSION
~@decina.local/29760# create extension "pgxntool-test";
ERROR: syntax error at or near "-"
LINE 1: create extension "pgxntool-test";
^
~@decina.local/29760# select * from pg_available_extensions where name ~'-';
      name       | default_version | installed_version |                     comment
-----------------+-----------------+-------------------+-------------------------------------------------
 pgxntool-test   | 0.1.0           |                   |
 trunklet-format | 0.1.1           | 0.1.1             | A format()-based template language for trunklet
(2 rows)

Eventually, I realized the problem was the first line of the extension file itself:

CREATE FUNCTION pgxntool-test(

Wrapping that in double quotes fixed the issue. (The reason that still doesn't line up with the ^ above is because the ^ is accounting for "LINE 1: ".) This makes debugging extensions quite tedious. Part of the explanation is in the comment for execute_sql_string():

/*
 * Execute given SQL string.
 *
 * filename is used only to report errors.
 *
 * Note: it's tempting to just use SPI to execute the string, but that does
 * not work very well. The really serious problem is that SPI will parse,
 * analyze, and plan the whole string before executing any of it; of course
 * this fails if there are any plannable statements referring to objects
 * created earlier in the script. A lesser annoyance is that SPI insists
 * on printing the whole string as errcontext in case of any error, and that
 * could be very long.
 */

I can think of 4 ways to fix this:
1) Have psql parse the script into separate commands for us.
2) Pull enough of psql's parsing into the backend code to be able to do #1.
3) Add *file* line numbers to the output of pg_parse_query().
4) Have ereport spit out the context you'd normally get from SPI if it sees that it was called from somewhere underneath execute_sql_string().
My preference would actually be #1, because that would make it easy for any tool that wanted to to get access to that. Jon Erdman actually looked into a similar alternative in the past and it was only a few lines of code. Basically, when the "parse file" option is chosen don't even attempt to connect to a database, just parse things, execute \ commands and print the results instead of sending them via libpq. That wouldn't work directly here because we want to split commands apart, but it wouldn't be hard to have psql spit out a special command separator line and then look for that. psql would have to ignore \quit in this mode though, but I think that's fine. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
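Part of why statement splitting keeps coming back to psql is that it is genuinely subtle. Even a toy splitter that only respects single-quoted strings, sketched in Python below, has to track quoting state; the real job (dollar quoting, comments, escaped quotes) is harder still, which is exactly the code one wants to reuse rather than rewrite:

```python
# Naive SQL script splitter: splits on semicolons outside single quotes.
# Deliberately incomplete (no dollar quoting, no comments, no E'' escapes) --
# it only illustrates why reusing psql's lexer is attractive.
def split_statements(script):
    stmts, buf, in_quote = [], [], False
    for ch in script:
        if ch == "'":
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            stmts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    tail = "".join(buf).strip()
    if tail:
        stmts.append(tail)
    return stmts

print(split_statements("SELECT 'a;b'; CREATE TABLE t(x int);"))
```

Note the semicolon inside the quoted literal is correctly left alone; every construct the toy version ignores is another case psql already handles.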
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On 1/12/16 11:25 AM, Catalin Iacob wrote: >The differentiation between Error and SPIError is wrong, because there isn't >any difference in reality. They're similar but not really the same thing. raise Error and plpy.error are both ways to call ereport(ERROR, ...) while SPIError is raised when coming back after calling into Postgres to execute SQL that itself raises an error. Now indeed, that executed SQL raised an error itself via another ereport(ERROR, ...) somewhere but that's a different thing. Why should they be different? An error is an error. You either want to trap a specific type of error or you don't. Having two completely different ways to do the same thing is just confusing. IMHO the Error and Fatal classes should never have been committed, especially since they're undocumented. It's not the responsibility of this patch to remove them, but it certainly shouldn't dig the hole deeper. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
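The unification being argued for can be illustrated with plain Python classes. This is a standalone sketch, the class names mirror plpython's but this is not its actual hierarchy; the point is just that if SPIError were a subclass of a common Error, code that doesn't care about the distinction could trap both with one except clause:

```python
# Hypothetical unified hierarchy (illustrative only, not plpython's code):
class Error(Exception):
    """Raised by user code to report an error (like plpy.error)."""

class SPIError(Error):
    """Raised when SQL executed through SPI itself fails."""
    def __init__(self, msg, sqlstate=None):
        super().__init__(msg)
        self.sqlstate = sqlstate

# A caller who just wants "an error is an error" traps the base class;
# a caller who cares can still catch SPIError specifically.
try:
    raise SPIError("duplicate key", sqlstate="23505")
except Error as e:
    caught = e

print(type(caught).__name__, caught.sqlstate)
```

With two unrelated classes, by contrast, every caller needs a two-headed except clause even when the handling is identical, which is the confusion being objected to.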
Re: [HACKERS] Weird behavior during CREATE EXTENSION
On 1/12/16 5:00 PM, Tom Lane wrote: There's certainly room to improve error reporting for extension scripts, but I think you are heading into a dead end in the name of saving a little time coding. I'd suggest looking into an errcontext callback, instead. Also, there's some technology in CREATE FUNCTION that deals with the fact that we may be calling the parser on a string different from the original user command, which might be worth borrowing here --- at least part of the confusion is that it's evidently reporting a cursor position relative to the extension script as though it applied to the CREATE EXTENSION. Are you talking about plpgsql_compile_error_callback()? It looks like it does its magic by relying on the plpgsql parser to keep track of where it's at. ISTM part of the goal here should be to show what the actual command was that failed (ie: the command in the extension file). I'm guessing the way to do that would be to have pg_parse_query() keep the original statement in the parse nodes? I guess if this was easy it would already have been fixed... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Interesting read on SCM upending software and hardware architecture
On 1/18/16 2:47 PM, Peter Geoghegan wrote: On Mon, Jan 18, 2016 at 12:31 PM, Robert Haas wrote: People keep predicting the death of spinning media, but I think it's not happening anywhere near as fast as people think. Yes, I'm writing this on a laptop with an SSD, and my personal laptop also has an SSD, but their immediate predecessors did not, and these are fairly expensive laptops. And most customers I talk to are still using spinning disks. Meanwhile, main memory is getting so large that even pretty significant databases can be entirely RAM-cached. So I tend to think that this is a lot less exciting than people who are not me seem to think. I tend to agree that the case for SSDs as a revolutionary technology has been significantly overstated. This recent article makes some interesting points: http://www.zdnet.com/article/what-we-learned-about-ssds-in-2015/ I think it's much more true that main memory scaling (in particular, main memory capacity) has had a huge impact, but that trend appears to now be stalling. My original article doesn't talk about SSDs; it's talking about non-volatile memory architectures (quoted extract below). Fusion IO is an example of this, and if NVDIMMs become available we'll see even faster non-volatile performance. To me, the most interesting point the article makes is that systems now need much better support for multiple classes of NV storage. I agree with your point that spinning rust is here to stay for a long time, simply because it's cheap as heck. So systems need to become much better at moving data between different layers of NV storage so that you're getting the biggest bang for the buck. That will remain critical as long as SCMs remain 25x more expensive than rust. Quote from article: Flash-based storage devices are not new: SAS and SATA SSDs have been available for at least the past decade, and have brought flash memory into computers in the same form factor as spinning disks. 
SCMs reflect a maturing of these flash devices into a new, first-class I/O device: SCMs move flash off the slow SAS and SATA buses historically used by disks, and onto the significantly faster PCIe bus used by more performance-sensitive devices such as network interfaces and GPUs. Further, emerging SCMs, such as non-volatile DIMMs (NVDIMMs), interface with the CPU as if they were DRAM and offer even higher levels of performance for non-volatile storage. Today's PCIe-based SCMs represent an astounding three-order-of-magnitude performance change relative to spinning disks (~100K I/O operations per second versus ~100). For computer scientists, it is rare that the performance assumptions that we make about an underlying hardware component change by 1,000x or more. This change is punctuated by the fact that the performance and capacity of non-volatile memories continue to outstrip CPUs in year-on-year performance improvements, closing and potentially even inverting the I/O gap. The performance of SCMs means that systems must no longer "hide" them via caching and data reduction in order to achieve high throughput. Unfortunately, however, this increased performance comes at a high price: SCMs cost 25x as much as traditional spinning disks ($1.50/GB versus $0.06/GB), with enterprise-class PCIe flash devices costing between three and five thousand dollars each. This means that the cost of the non-volatile storage can easily outweigh that of the CPUs, DRAM, and the rest of the server system that they are installed in. The implication of this shift is significant: non-volatile memory is in the process of replacing the CPU as the economic center of the datacenter. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! 
http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Rethinking TRANSFORM FOR TYPE ...
I'm using the TRANSFORM feature to implement a new data type for python (ndarrays from numpy). I'm constantly getting tripped up by forgetting to add TRANSFORM FOR TYPE. Worse, the requirement to explicitly state the transform means I can't use a polymorphic type. In the case of adding a new transform for an existing type, current behavior makes sense; you'll break all existing functions using the type if you just swap the representation out under them. Further, if you are pulling in some new extension that uses the same language and type, that function will be expecting the old representation, not the new one. For the case of creating a new data type, I think explicitly requiring the TRANSFORM clause makes no sense. It's a bunch of extra work for users that adds no benefit. A simple way to fix this would be to allow marking a transform as DEFAULT. If a transform is marked as DEFAULT then it would automatically get used. Perhaps a better way would be allowing for multiple transforms for each language and type. That way users aren't stuck with a single preconceived notion of how to represent a type. The immediate use I see for that is it would allow a transform to be created in something other than C, as long as the language you want to use can handle a raw C string. That desire might sound silly to a lot of -hackers, but given the amount of pain I went through figuring out how to properly marshal an ndarray back and forth in C, I sure as hell wish I could have done it in python! Since plpythonu understands bytea, I don't see any reason I couldn't have. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
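To make the proposal concrete, here's a minimal Python sketch of the lookup rule being suggested (an explicit TRANSFORM clause wins, otherwise a DEFAULT-marked transform is used). Everything here — the `TRANSFORMS` table, `resolve_transform`, and the transform names — is hypothetical illustration, not an actual catalog structure:

```python
# Hypothetical sketch of the proposed resolution rule. An explicit
# TRANSFORM FOR TYPE clause wins; otherwise fall back to whichever
# transform is marked DEFAULT for the (language, type) pair; otherwise
# apply no transform (current behavior). All names are illustrative.

# (language, type) -> list of (transform_name, is_default)
TRANSFORMS = {
    ("plpythonu", "ndarray"): [("ndarray_plain", False),
                               ("ndarray_numpy", True)],
}

def resolve_transform(language, typename, explicit=None):
    candidates = TRANSFORMS.get((language, typename), [])
    if explicit is not None:
        # Explicit request: must match an existing transform.
        for name, _ in candidates:
            if name == explicit:
                return name
        raise LookupError("no such transform: %s" % explicit)
    # No explicit clause: use the DEFAULT one if any.
    for name, is_default in candidates:
        if is_default:
            return name
    return None  # no transform applied

assert resolve_transform("plpythonu", "ndarray") == "ndarray_numpy"
assert resolve_transform("plpythonu", "ndarray", "ndarray_plain") == "ndarray_plain"
```

Allowing multiple transforms per (language, type) pair falls out naturally here: old extensions keep naming their transform explicitly, while new code gets the DEFAULT one for free.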
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On 1/21/16 4:57 PM, Pavel Stehule wrote: It is not correct - outside PL/Python you get an Error (a PostgreSQL error has no classes), and the raising class (Error or SPIError) isn't important. Inside PL/Python you will get SPIError or its successors (based on SQLSTATE). Right. The closest thing we have to error classes is SQLSTATE. If someone found a clever way to set up an exception inheritance tree[1] on that then maybe different exceptions would make sense. Short of that, I don't see it.

[1] There's a hierarchy to the SQL state codes, based on the first 2 characters. So if there was...

class connection_exception(spi_exception):
    __init__ str = 'Connection Exception'

class connection_does_not_exist(connection_exception):
    __init__ str = 'Connection Does Not Exist'

... to map to the small set of errors below, maybe that would make sense. Obviously that would need to be auto-generated. It seems more trouble than it's worth though.

Section: Class 08 - Connection Exception
08000    E    ERRCODE_CONNECTION_EXCEPTION    connection_exception
08003    E    ERRCODE_CONNECTION_DOES_NOT_EXIST    connection_does_not_exist
08006    E    ERRCODE_CONNECTION_FAILURE    connection_failure
08001    E    ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION    sqlclient_unable_to_establish_sqlconnection
08004    E    ERRCODE_SQLSERVER_REJECTED_ESTABLISHMENT_OF_SQLCONNECTION    sqlserver_rejected_establishment_of_sqlconnection
08007    E    ERRCODE_TRANSACTION_RESOLUTION_UNKNOWN    transaction_resolution_unknown
08P01    E    ERRCODE_PROTOCOL_VIOLATION    protocol_violation

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
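For what it's worth, generating such a tree mechanically doesn't look hard. Here's a rough Python sketch using just the Class 08 entries above; `SPIError` and `build_hierarchy` are stand-ins for what would really be generated from errcodes.txt, not plpython's actual implementation:

```python
# Hypothetical sketch: build an exception tree keyed on the 2-character
# SQLSTATE class prefix. The code list below mirrors the Class 08
# extract in the email; in reality this would be generated from
# errcodes.txt at build time.

class SPIError(Exception):
    sqlstate = None

CLASS_08 = [
    ("08000", "connection_exception"),
    ("08003", "connection_does_not_exist"),
    ("08006", "connection_failure"),
    ("08001", "sqlclient_unable_to_establish_sqlconnection"),
    ("08004", "sqlserver_rejected_establishment_of_sqlconnection"),
    ("08007", "transaction_resolution_unknown"),
    ("08P01", "protocol_violation"),
]

def build_hierarchy(codes):
    """One class per SQLSTATE; the 'xx000' code of each 2-character
    class becomes the parent of its siblings."""
    classes = {}
    # First pass: create the class-level parents (codes ending in '000').
    for state, name in codes:
        if state.endswith("000"):
            classes[state] = type(name, (SPIError,), {"sqlstate": state})
    # Second pass: everything else inherits from its class parent.
    for state, name in codes:
        if not state.endswith("000"):
            parent = classes.get(state[:2] + "000", SPIError)
            classes[state] = type(name, (parent,), {"sqlstate": state})
    return classes

exc = build_hierarchy(CLASS_08)
# An except clause on the class parent then catches all of Class 08:
assert issubclass(exc["08006"], exc["08000"])
```

The payoff would be that user code could catch `connection_exception` and get every 08xxx error at once, instead of matching SQLSTATE strings by hand.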
Re: [HACKERS] Releasing in September
On 1/21/16 2:29 AM, Amit Kapila wrote: I also think there should be some way to give credit to CFM, if it is difficult to do anything related to money, then we can enforce that if CFM submits any patches for next CF, then those should be prioritised. Personally, I don't see why we have our scarcest resource doing what is essentially a project management task, especially when at least one commercial company has offered to donate paid staff time. I don't think the last CF of 9.6 is the time to experiment, but I think we should try using a PM for the first CF of 9.7. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Releasing in September
On 1/20/16 11:40 AM, Tom Lane wrote: Yeah. It's certainly unfair if someone's patch doesn't get reviewed, but there are only 24 hours in a day, and we have a limited pool of reviewer and committer manpower. I think we just have to say that sometimes life is unfair. I think that's a great way to ensure we shrink the pool of reviewers when someone works on a patch and then it goes nowhere. I find it rather difficult to get feedback on ideas before I spend the time to code something, it's got to be even worse for someone the community doesn't know. So if we're going to do this, I think there must be a mechanism for a patch idea/design to be approved. I think we also need to be careful about -hackers being the only place feature desirability is measured. There's an entire world of users out there that aren't even on -general. If some feature doesn't really interest -hackers but there's 50 users that want it and someone willing to work on it, ISTM we should make efforts to get it committed. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Releasing in September
On 1/20/16 11:49 PM, Tom Lane wrote: Michael Paquier writes: On Thu, Jan 21, 2016 at 2:30 PM, Peter Geoghegan wrote: What benefit does porting sqlsmith for inclusion in core have? I can only think of costs, including those that you mentioned. We have automatic buildfarm coverage on many platforms. Perhaps we could live without that with a buildfarm module though. I do not think we should necessarily try to include every testing tool in the core distribution. What is important is that they be readily available: easy to find, easy to use, documented, portable. "Same license as the PG core code" is not on that list. An immediately relevant example is that the buildfarm server and client code aren't in the core distribution, and AFAIR no one has suggested that they need to be. Right. What I think would be far more useful is making it easier to explicitly test things (better tools + design for test), and something akin to buildfarm that will run automated testing on submitted patches. Put another way: it's stupid that we even ask reviewers to waste time running make check. That can be automated. Ideally reviewers shouldn't be doing any testing, because the tests that are part of the patch should answer every question they would have, but I don't see that happening until we have a separate automation-only target that we don't care how long it takes to run. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] count_nulls(VARIADIC "any")
On 1/21/16 1:48 PM, Pavel Stehule wrote: the form of the regression tests is not a particularly significant issue. Jim's design is a little more transparent, Marko's is maybe a little more practical. Both make sense in my opinion, and neither has a significant advantage over the other. Is there any possible agreement on how these tests should be designed? Simple patch, simple regression tests, so there is no reason for a long wait. I don't really see how individual tests are more practical (you can still cut and paste a table...), but since there's no strong consensus either way I'd say it's up to you as author. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Releasing in September
On 1/20/16 4:29 PM, Bruce Momjian wrote: On Wed, Jan 20, 2016 at 09:12:07AM -0800, Joshua Drake wrote: I just don't buy the Ubuntu release model for our database. Ubuntu is trying to balance hot features vs stability, while we are really focused on stability, similar to Debian. I understand but I think we are missing out on an opportunity here. Notice that the shorter release cycle for STS will actually make some things easier. Including: * Increased test base (just like Fedora/Ubuntu) * Increased early adopter testing (that is what early adopting is really about for us anyway) * Decreased concerns about upgrades and ability to extend upgrade status. I can see LTS working for plugin change, but not server binary changes. s/LTS/STS/? In any case, I think JD is onto something here. As someone that focuses more on user experience than "deep core" code, I already find yearly releases to be quite inconvenient. It's hard to find the motivation to make a minor improvement in something (especially knowing how hard it will be to get the patch approved) knowing that it won't see the light of day for a year, and realistically I won't be able to use it with any clients that are in production for 2-3 years. Given the high level of extensibility that we have, maybe it would be good to logically segregate stuff into things that are deeply embedded in the "core" code (ie: on-disk format) from things that are much easier to change when necessary (like add-on functions or PLs). Things like new JSON operators could be released much more rapidly that way. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Releasing in September
On 1/22/16 12:14 PM, Andres Freund wrote: On 2016-01-22 08:40:28 -0600, Jim Nasby wrote: Ideally reviewers shouldn't be doing any testing, because the tests that are part of the patch should answer every question they would have, but I don't see that happening until we have a separate automation-only target that we don't care how long it takes to run. I think that's completely wrong. Yes, more tests are good, and we need a place for longer running tests. But assuming that every patch author will create a testsuite that covers every angle is just about akin to assuming every submitter will deliver perfect, bug free code. And we know how well that turns out. I think actively trying to break a feature, and postgres in general, is one of the most important tasks of reviewers and testers. And with that I don't mean trying to run "make check". Look e.g. at the tests Jeff Janes has performed, what the recent plug tests of Tomas Vondra brought to light, or at what the full page write checker tool of Heikki's showed. IIRC Jeff's tests are scripted and obviously the page write checker is as well. I don't recall the exact methodology Tomas was using but I suspect it could also be scripted if you had a way to pull the plug via software (via a power management unit or maybe kill -9 of a VM). All of that is stuff that can and should be automated. Presumably it won't ever be part of the Makefile tests, but that's fine. Heck, the test scripts could stay in completely separate repos. Where the code lives isn't the issue; it's getting stuff like this automated so humans can go back to doing things that can't be automated. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL Re-Writes
On 1/31/16 3:26 PM, Jan Wieck wrote: On 01/27/2016 08:30 AM, Amit Kapila wrote: operation. Now the reason the OS couldn't find the corresponding block in memory is that, while closing the WAL file, we use POSIX_FADV_DONTNEED if wal_level is less than 'archive', which leads to this problem. So with this experiment, the conclusion is that though we can avoid re-writes of WAL data by doing exact writes, it could lead to a significant reduction in TPS. POSIX_FADV_DONTNEED isn't the only way those blocks would vanish from OS buffers. If I am not mistaken we recycle WAL segments in a round robin fashion. In a properly configured system, where the reason for a checkpoint is usually "time" rather than "xlog", a recycled WAL file being written to had been closed and not touched for about a complete checkpoint_timeout or longer. You must have a really big amount of spare RAM in the machine to still find those blocks in memory. Basically we are talking about the active portion of your database, shared buffers, the sum of all process local memory and the complete pg_xlog directory content fitting into RAM. But that's only going to matter when the segment is newly recycled. My impression from Amit's email is that the OS was repeatedly reading even in the same segment? Either way, I would think it wouldn't be hard to work around this by spewing out a bunch of zeros to the OS in advance of where we actually need to write, preventing the need for reading back from disk. Amit, did you do performance testing with archiving enabled and a no-op archive_command? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
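For anyone wanting to poke at this outside of PostgreSQL, here's a small Python sketch of the cache-eviction call in question. The temp file is just a stand-in for a WAL segment, and `os.posix_fadvise` is only available on platforms that support it (e.g. Linux):

```python
# Sketch of the behavior under discussion: POSIX_FADV_DONTNEED asks the
# kernel to drop a file's cached pages, so the next access to that range
# has to come from disk. The file here is a stand-in for a WAL segment.
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"\0" * 8192)   # pretend this is WAL data
    os.fsync(fd)                 # pages are now clean and cached
    if hasattr(os, "posix_fadvise"):
        # This is the hint PostgreSQL issues on WAL file close when
        # wal_level < 'archive': evict the segment from the OS cache.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    # A later partial (sub-page) write to this file now forces the
    # kernel to read the page back from disk first -- the
    # read-modify-write cost that showed up in Amit's experiment.
finally:
    os.close(fd)
    os.unlink(path)
```

The "spew zeros ahead of the write position" workaround mentioned above would amount to keeping those pages resident (and fully overwritten) so the kernel never needs the read-back.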
Re: [HACKERS] Way to check whether a particular block is on the shared_buffer?
On 1/31/16 7:38 PM, Kouhei Kaigai wrote:
> I'm investigating an SSD-to-GPU direct feature on top of
> the custom-scan interface. It intends to load a bunch of data
> blocks on NVMe-SSD to GPU RAM using P2P DMA, prior to the data
> loading onto CPU/RAM, to preprocess the data to be filtered out.
> It only makes sense if the target blocks are not loaded to the
> CPU/RAM yet, because the SSD device is essentially slower than RAM.
> So, I'd like to have a reliable way to check the latest status of
> the shared buffer, to know whether a particular block is already
> loaded or not.

That completely ignores the OS cache though... wouldn't that be a major issue? To answer your direct question, I'm no expert, but I haven't seen any functions that do exactly what you want. You'd have to pull relevant bits from ReadBuffer_*. Or maybe a better method would just be to call BufTableLookup() without any locks and if you get a result > -1 just call the relevant ReadBuffer function. Sometimes you'll end up calling ReadBuffer even though the buffer isn't in shared buffers, but I would think that would be a rare occurrence. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] statistics for shared catalogs not updated when autovacuum is off
On 1/30/16 5:05 PM, Peter Eisentraut wrote: When autovacuum is off, the statistics in pg_stat_sys_tables for shared catalogs (e.g., pg_authid, pg_database) never update. So seq_scan doesn't update when you read the table, last_analyze doesn't update when you run analyze, etc. But when you shut down the server and restart it with autovacuum on, the What about with autovacuum still off? updated statistics magically appear right away. So seq_scan is updated with the number of reads you did before the shutdown, last_analyze updates with the time of the analyze you did before the shutdown, etc. So the data is saved, just not propagated to the view properly. I can reproduce this back to 9.3, but not 9.2. Any ideas? ISTR that there's some code in the autovac launcher that ensures certain stats have been loaded from the file into memory; I'm guessing that the functions implementing the shared catalog views need something similar. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Trigger.sgml
On 1/28/16 8:02 PM, Tatsuo Ishii wrote: I am working as a volunteer to translate docs to Japanese. I have been having a hard time parsing the following sentence in doc/src/sgml/trigger.sgml. The possibility of surprising outcomes should be considered when there are both BEFORE INSERT and BEFORE UPDATE row-level triggers that both affect a row being inserted/updated (this can still be problematic if the modifications are more or less equivalent if they're not also idempotent). Especially I don't understand this part: (this can still be problematic if the modifications are more or less equivalent if they're not also idempotent). It would be great if someone could enlighten me. I believe the idea here is that thanks to UPSERT you can now get very strange behavior if you have BEFORE triggers that aren't idempotent. IE:

CREATE TABLE test( a int PRIMARY KEY );
BEFORE INSERT: a = a - 1
BEFORE UPDATE: a = a + 1

INSERT (1) -- Results in 0
INSERT (2) -- Results in 1

Now if you try to UPSERT (1), the before insert will give you a=0, which conflicts. So then you end up with an UPDATE, which gives you a=1 again. Things are even worse when you try to UPSERT (2), because the insert conflicts but then you try to update a row that doesn't exist (a=2). Obviously this is a ridiculous example, but hopefully it shows the problem. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
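To make the control flow easier to follow, here's a toy Python simulation of the example. It models only one plausible way the conflict plays out (the re-triggered a=1 colliding with an already-existing row), not real PostgreSQL trigger semantics:

```python
# Toy model of the example above: a table with BEFORE INSERT (a = a - 1)
# and BEFORE UPDATE (a = a + 1) triggers, and an upsert that falls back
# to UPDATE on a primary-key conflict. This is a simulation of the
# control flow only, not actual PostgreSQL behavior.

table = set()  # values of primary key column "a"

def before_insert(a):
    return a - 1

def before_update(a):
    return a + 1

def upsert(a):
    new = before_insert(a)
    if new not in table:
        table.add(new)
        return ("inserted", new)
    # Conflict: retry as an UPDATE of the conflicting row, which fires
    # the (non-idempotent) BEFORE UPDATE trigger.
    updated = before_update(new)
    table.discard(new)
    if updated in table:
        raise ValueError("duplicate key value %d" % updated)
    table.add(updated)
    return ("updated", updated)

print(upsert(1))        # ('inserted', 0)
print(upsert(2))        # ('inserted', 1)
try:
    print(upsert(1))    # conflicts on 0; the "update" collides with 1
except ValueError as e:
    print("surprise:", e)
```

Because the two triggers undo each other without being idempotent, the retried UPDATE lands right back on a key that's already taken, which is exactly the "surprising outcome" the docs are warning about.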
[HACKERS] Add links to commit fests to patch summary page
It would be nice if the patch summary page (ie, [1]) had links to the relevant entry in that CF. The specific need I see is if you look up a patch in the current CF and it's been moved to the next CF you have to manually go to that CF and search for the patch. [1] https://commitfest.postgresql.org/9/353/ -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, failed
Spec compliant: not tested
Documentation: not tested

There are currently no docs or unit tests. I suspect this patch is still WIP?

create-am.5.patch: General notes: Needs catversion bump. IndexQualInfo and GenericCosts have been moved to src/include/utils/selfuncs.h. METHOD becomes an unreserved keyword.

generic-xlog.5.patch: generic_xlog.c: At least needs a bunch of explanatory comments, if not info in a README. Since I don't really understand what the design here is my review is just cursory. static memoryMove() - seems like an overly-generic function name to me... writeCopyFlagment(), writeMoveFlagment(): s/Fl/Fr/?

bloom-control.5: README:
+ CREATE INDEX bloomidx ON tbloom(i1,i2,i3)
+WITH (length=5, col1=2, col2=2, col3=4);
+
+ Here, we create bloom index with signature length 80 bits and attributes
+ i1, i2 mapped to 2 bits, attribute i3 - to 4 bits.
It's not clear to me where 80 bits is coming from...

bloom.h:
+ #define BITBYTE (8)
ISTR seeing this defined somewhere in the Postgres headers; dunno if it's worth using that definition instead.

Testing: I ran the SELECT INTO from the README, as well as CREATE INDEX bloomidx. I then ran

insert into tbloom select
  (generate_series(1,1000)*random())::int as i1,
  (generate_series(1,1000)*random())::int as i2,
  (generate_series(1,1000)*random())::int as i3,
  (generate_series(1,1000)*random())::int as i4,
  (generate_series(1,1000)*random())::int as i5,
  (generate_series(1,1000)*random())::int as i6,
  (generate_series(1,1000)*random())::int as i7,
  (generate_series(1,1000)*random())::int as i8,
  (generate_series(1,1000)*random())::int as i9,
  (generate_series(1,1000)*random())::int as i10,
  (generate_series(1,1000)*random())::int as i11,
  (generate_series(1,1000)*random())::int as i12,
  (generate_series(1,1000)*random())::int as i13
from generate_series(1,999);

and kill -9'd the backend. 
After restart I did VACUUM VERBOSE without issue. I ran the INSERT INTO, kill -9 and VACUUM VERBOSE sequence again. This time I got an assert:

TRAP: FailedAssertion("!(((bool) (((const void*)((ItemPointer) left) != ((void*)0)) && (((ItemPointer) left)->ip_posid != 0", File: "vacuumlazy.c", Line: 1823)

That line is

    lblk = ItemPointerGetBlockNumber((ItemPointer) left);

which does

    AssertMacro(ItemPointerIsValid(pointer)), \

which is the assert that's failing. I've squirreled this install away for now, in case you can't repro this failure. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freeze avoidance of very large table.
On 2/1/16 4:59 PM, Alvaro Herrera wrote: Masahiko Sawada wrote: Attached updated version patch. Please review it. In pg_upgrade, the "page convert" functionality is there to abstract rewrites of pages being copied; your patch is circumventing it and AFAICS it makes the interface more complicated for no good reason. I think the real way to do that is to write your rewriteVisibilityMap as a pageConverter routine. That should reduce some duplication there. IIRC this is about the third problem that's been found with pg_upgrade in this patch. That concerns me given the potential for disaster if freeze bits are set incorrectly. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising the checkpoint_timeout limit
On 2/1/16 6:13 PM, Andres Freund wrote: I'm not sure what'd actually be a good upper limit. I'd be inclined to even go to as high as a week or so. A lot of our settings have upper/lower limits that aren't a good idea in general. The only reason I can see for the 1 hour limit is to try and prevent footguns. I think that's a valid goal, but there should be a way to over-ride it. And if we don't want that kind of protection then I'd say just yank the upper limit. I'm also wondering if it'd not make sense to raise the default timeout to 15min or so. The upper ceiling for that really is recovery time, and that has really shrunk rather drastically due to faster cpus and architectural improvements in postgres (bgwriter, separate checkpointer/bgwriter, restartpoints, ...). It would be interesting if someone had a large-ish 9.4 or 9.5 install that they could test recovery timing on. My suspicion is that as long as FPWs are on that you'd generally end up limited by how fast you could read WAL unless you exceeded the FS cache. (I'm assuming a BBU and that the FS and controller will do a nice job of ordering writes optimally so that you'll get performance similar to reads when it's time to fsync.) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] statistics for shared catalogs not updated when autovacuum is off
On 2/1/16 7:20 PM, Peter Eisentraut wrote: That's probably right. Even with autovacuum on, the statistics for shared catalogs do not appear as updated right away. That is, if you run VACUUM and then look at pg_stat_sys_tables right away, you will see the stats for shared catalogs to be slightly out of date until the minutely autovacuum check causes them to update. So the problem exists in general, but the autovacuum launcher papers over it every minute. I suspect the issue is in backend_read_statsfile(). Presumably the if just needs a call to AutoVacuumingActive() added:

/*
 * Autovacuum launcher wants stats about all databases, but a shallow read
 * is sufficient.
 */
if (IsAutoVacuumLauncherProcess())
    pgStatDBHash = pgstat_read_statsfiles(InvalidOid, false, false);
else
    pgStatDBHash = pgstat_read_statsfiles(MyDatabaseId, false, true);

The interesting thing is that we always start the launcher one time, to protect against wraparound, but apparently that path doesn't call anything that calls backend_read_statsfile() (which is static). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add links to commit fests to patch summary page
On 2/1/16 6:15 PM, Alvaro Herrera wrote: Jim Nasby wrote: It would be nice if the patch summary page (ie, [1]) had links to the relevant entry in that CF. The specific need I see is if you look up a patch in the current CF and it's been moved to the next CF you have to manually go to that CF and search for the patch. Agreed, I could use that. In the "status" row, each commitfest entry (the "2015-11" text) could be a link to that patch in that commitfest. Yeah, what I was thinking. (You can actually construct the URL easily just by changing the commitfest ID, which is the first number in the URL; for example 2016-01 is /8/). *waits for someone to comment on how surrogate keys are bad* ;P -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add links to commit fests to patch summary page
On 2/2/16 6:35 AM, Alvaro Herrera wrote: what I want is the link to go to *that patch's* page in the other commitfest. That's also what I think Jim wants. +1 -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL Auditing
On 2/2/16 10:34 AM, Joshua D. Drake wrote: Auditing is a pretty security/enterprisey-related thing that could do with the "officially considered up to the PostgreSQL project standard and ready for production" rubber-stamp that tends to go along with most end-user/admin-oriented stuff shipped in the tarball. Which is exactly why I think .Org needs an official "Extensions" project which would completely eliminate these arguments. A project team explicitly for vetting extensions. Yeah, it's disappointing that PGXN doesn't seem to have really taken off. I'm sure a big part of that is the need for even SQL extensions to have server access, but I suspect part of it is because it's a separate project. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL Auditing
On 2/2/16 5:00 AM, Simon Riggs wrote: Since you've written the email here, I'd ask that you join our community and use your knowledge and passion to make things happen. +1. Kudos for speaking up in the first place, but it's clear that right now the biggest thing holding Postgres back is lack of reviewers, followed by lack of developers. If your company put even 10% of what it would pay for Oracle or MSSQL licensing back into the community (either via direct employee involvement or by funding development) then auditing could have moved a lot faster than it did. It would also help if the community better publicized ways that companies could give back. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Integer overflow in timestamp_part()
On 2/2/16 6:39 PM, Tom Lane wrote: I'm inclined to think that a good solution would be to create an artificial restriction to not accept years beyond, say, 10 AD. That would leave us with a lot of daylight to not have to worry about corner-case overflows in timestamp arithmetic. I'm not sure though where we'd need to enforce such a restriction; certainly in timestamp[tz]_in, but where else? Probably some of the casts (I'd think at least timestamp->timestamptz). Maybe timestamp[tz]_recv. Most of the time*pl* functions. :/ -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
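For context on why a cap leaves so much daylight: PostgreSQL stores timestamps as a signed 64-bit count of microseconds from 2000-01-01, so the representable range runs out roughly 292,000 years past 2000. A back-of-envelope sketch in Python (my assumptions, not the thread's: integer-microsecond storage and a 365.25-day year, which is an approximation rather than the engine's actual calendar arithmetic):

```python
# Back-of-envelope: where a signed 64-bit microsecond count runs out.
# Assumptions (mine, not from the thread): timestamps are microseconds
# since 2000-01-01, and a year is approximated as 365.25 days.
US_PER_YEAR = 86_400 * 365.25 * 1_000_000   # microseconds per (Julian) year

max_years = (2**63 - 1) / US_PER_YEAR       # years representable past 2000
max_year = 2000 + int(max_years)
print(max_year)   # roughly 294,000 -- any practical cap sits far below this
```

Any artificial year limit well under that ceiling would leave plenty of headroom for the arithmetic in the *pl* functions.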
Re: [HACKERS] PostgreSQL Auditing
On 2/2/16 7:25 PM, Curtis Ruck wrote: I'm open to testing and evaluating to see if it meets our compliance requirements, but I am nowhere close to being a C developer, or having C developers that could actually provide a meaningful review. One issue along this thread already pops up, concerning the client_min_messages, and how other patches in the pipeline for 9.6 would be required to enable the auditing to meet compliance requirements. There are other ways you can help besides reviewing. Providing real-world use cases helps. Even better is maintaining things on the wiki that assist with moving things forward (use cases, discussion/decision highlights, really anything that helps move the discussion). It just seems after reading the mailing list history, that there is a lack of interest by people with commit authority, even though there is decent interest in it from the community, and honestly, no one really likes auditing, but it's one of those damned if you do (in performance) and damned if you don't (in legal) things. Yeah, no one that's volunteering time (as opposed to being paid to work on PG) is going to pick up something as unsexy and painful to deal with as auditing. Additionally Robert, given your professional status, you are by no means an unbiased contributor in this discussion. Your stance on this matter shows that you don't necessarily want the open source solution to succeed in the commercial/compliance required space. Instead of arguing blankly against inclusion can you at least provide actionable feedback that if met would allow patches of this magnitude in? I'm sorry, but that's just ridiculous, and I completely agree with Robert's initial sentiment: there needs to be a damn good reason for the community to pick one specific implementation of something when there are competing solutions. 
It works just like any other patch does: the community has to come to a *consensus* that not only is the feature desired and well designed, but that the implementation is high quality. I haven't followed the auditing discussions closely, but it seems that there are still questions around how the feature should work. I'm personally fine with fiscally rewarding organizations that assist my customer in succeeding, but it's hard to convince my customer to fund open source, even though they wouldn't be able to do 75% of what they do without it. Based on past experience this is the same problem most open source organizations face, especially when they don't have the marketing engine that the large commercial players have. I really don't understand that, given what most of the alternative solutions cost. If they balk at putting money towards developing Postgres they really need to get a quote for running the same amount of MSSQL (let alone Oracle, which is even more expensive). I do think the community could do a better job of at least encouraging companies to fund development. Unfortunately there's always going to be some amount of friction here though, because of the question of how to allocate funds to the different companies that are involved. Another problem is no commercial company can actually guarantee anything will make it into community Postgres, and it's very difficult to even estimate the amount of effort (read as: what to charge) for getting a feature committed. Commercial development is certainly possible though. 2nd Quadrant was able to raise a good amount of money to fund the development of hot standby. IIRC that was before sites like Kickstarter existed too, so it would probably be even easier to do today. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! 
http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On 2/2/16 4:52 PM, Alvaro Herrera wrote: Robert Haas wrote: The eventual committer is likely to be much happier with this patch if you guys have achieved consensus among yourselves on the best approach. (Disclaimer: The eventual committer won't be me. I'm not a Python guy. But we try to proceed by consensus rather than committer-diktat around here, when we can. Obviously the committer has the final say at some level, but it's better if that power doesn't need to be exercised too often.) Actually I imagine that if there's no agreement between author and first reviewer, there might not *be* a committer in the first place. Perhaps try to get someone else to think about it and make a decision. It is possible that some other committer is able to decide by themselves but I wouldn't count on it. +1. FWIW, I'd think it's better to not break backwards compatibility, but I'm also far from a Python expert. It might well be worth adding a plpython GUC to control the behavior so that there's a migration path forward, or maybe do something like the 'from __future__ import' mechanism that Python uses to ease migration to Python 3. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add links to commit fests to patch summary page
On 2/2/16 9:46 AM, Magnus Hagander wrote: I'm not entirely sure what I'd use that for myself, but that's trivial to implement. Thus, done and published. FWIW, my use case was actually to go from https://commitfest.postgresql.org/8/353/ to https://commitfest.postgresql.org/9/353/, which I needed to do in order to submit the review. So, perhaps another enhancement would be for commenting/reviewing to always work regardless of what commitfest ID is in the URL. Perhaps Edit too. I think the Status dropdown needs to stay specific to the correct CF though. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Idle In Transaction Session Timeout, revived
On 2/3/16 2:30 PM, Robert Haas wrote: On Sun, Jan 31, 2016 at 8:33 AM, Vik Fearing wrote: Attached is a rebased and revised version of my idle_in_transaction_session_timeout patch from last year. This version does not suffer the problems the old one did where it would jump out of SSL code thanks to Andres' patch in commit 4f85fde8eb860f263384fffdca660e16e77c7f76. The basic idea is if a session remains idle in a transaction for longer than the configured time, that connection will be dropped thus releasing the connection slot and any locks that may have been held by the broken client. Added to the March commitfest. +1 for doing something like this. Great idea! Wouldn't it be more sensible to just roll the transaction back and not disconnect? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL Audit Extension
On 2/3/16 10:36 AM, Robert Haas wrote: People who are interested in audit are also understandably leery of downloading code from an untrusted source. Both PGXN and GitHub are the Wild West as far as conservative auditors are concerned. I hate to be rude here, but that's not my problem. You can put it on your corporate web site and let people download it from there. I'm sure that auditors are familiar with the idea of downloading software from for-profit companies. Do they really not use any software from Microsoft or Apple, for example? If the problem is that they will trust the PostgreSQL open source project but not YOUR company, then I respectfully suggest that you need to establish the necessary credibility, not try to piggyback on someone else's. Luckily pgaudit is its own group on GitHub (https://github.com/pgaudit), so it doesn't even have to be controlled by a single company. If others care about auditing I would hope that they'd contribute code there and eventually become a formal member of the pgaudit project. As for PGXN being an untrusted source, that's something that it's in the project's best interest to try and address somehow, perhaps by having formally audited extensions. Amazon already has to do this to some degree before an extension can be allowed in RDS, and so does Heroku, so maybe that would be a starting point. I think a big reason Postgres got to where it is today is because of its superior extensibility, and I think continuing to encourage that with formal support for things like PGXN is important. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Raising the checkpoint_timeout limit
On 2/2/16 10:10 PM, Robert Haas wrote: Now, you could also set such configuration settings in a situation where it will not work out well. But that is true of most configuration settings. Yeah, if we're going to start playing parent then I think the first thing to do is remove the fsync GUC. The AWS team has done testing that shows it to be worthless from a performance standpoint now that we have synchronous commit, and it's an extremely large foot-bazooka to have lying around. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Idle In Transaction Session Timeout, revived
On 2/3/16 4:05 PM, David Steele wrote: On 2/3/16 4:25 PM, Tom Lane wrote: Robert Haas writes: On Wed, Feb 3, 2016 at 3:41 PM, Jim Nasby wrote: Wouldn't it be more sensible to just roll the transaction back and not disconnect? I'm not sure how messy this would be in practice. But if we think that killing the whole session is not desirable but something we're doing for expediency, then it would be worth looking into that approach. I think killing the session is a perfectly sensible thing to do in this case. Everything meaningful that was done in the session will be rolled back - no need to waste resources keeping the connection open. Except you end up losing stuff like every GUC you've set, existing temp tables, etc. For an application that presumably doesn't matter, but for a user connection it would be a PITA. I wouldn't put a bunch of effort into it though. Dropping the connection is certainly better than nothing. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Batch update of indexes
On 1/21/16 11:47 AM, Konstantin Knizhnik wrote: BTW, could you explain, what is the reason to copy data into the pending list and then copy it again while flushing pending list into the index? Why not read this data directly from the table? I feel that I've missed something important here. No, I do not think that inserted data should be placed in pending list and then copied to main table. It should be stored directly in the main table and "pending list" is just some fast, transient index. That sounds similar to what we would need to support referencing OLD and NEW in per-statement triggers: a good way to find everything that was changed in a statement. Or if you will, s/statement/transaction/. Having that is probably a prerequisite for doing incremental refresh of materialized views. My suspicion is that it would be useful to pre-order the new data before trying to apply it to the indexes. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Sanity checking for ./configure options?
I just discovered that ./configure will happily accept '--with-pgport=' (I was actually doing =$PGPORT, and didn't realize $PGPORT was empty). What you end up with is a compile error in guc.c, with no idea why it's broken. Any reason not to have configure or at least make puke if pgport isn't valid? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
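For what it's worth, the check being asked for is small: reject anything that isn't an integer in 1..65535. A sketch of the validation logic in Python (the function name is hypothetical and just illustrates the rule; the real fix would be shell code in configure):

```python
def validate_pgport(value: str) -> int:
    """Hypothetical sketch of the sanity check; the real fix would be
    shell code in configure.  Rejects the empty/garbage values that
    ./configure currently lets through to cause a compile error in guc.c."""
    if not value or not value.isdigit():
        raise ValueError(f"--with-pgport={value!r} is not a number")
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError(f"--with-pgport={port} is out of range 1..65535")
    return port

print(validate_pgport("5432"))   # a valid port passes through unchanged
```

An empty string (the `--with-pgport=$PGPORT` case with $PGPORT unset) fails the first test, which is exactly the failure mode that currently surfaces only at compile time.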
Re: [HACKERS] "using previous checkpoint record at" maybe not the greatest idea?
On 2/4/16 5:09 PM, David G. Johnston wrote: What the 2nd para in the documentation is saying is something different: it is talking about reading all the pg_xlog files (in reverse order), which is not pg_control, and see what checkpoint records are there, then figure out which one to use. Yes, I inferred something that obviously isn't true - that the system doesn't go hunting for a valid checkpoint to begin recovery from. While it does not do so in the case of a corrupted pg_control file I further assumed it never did. That would be because the documentation doesn't make the point of stating that two checkpoint positions exist and that PostgreSQL will try the second one if the first one proves unusable. Given the topic of this thread that omission makes the documentation out-of-date. Maybe it's covered elsewhere but since this section addresses locating a starting point I would expect any such description to be here as well. Yeah, I think we should fix the docs. Especially since I imagine that if you're reading that part of the docs you're probably having a really bad day, and bad info won't help you... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "using previous checkpoint record at" maybe not the greatest idea?
On 2/4/16 3:37 PM, Andres Freund wrote: On 2016-02-03 09:28:24 -0500, Robert Haas wrote: Would we still have some way of forcing the older checkpoint record to be used if somebody wants to try to do that? I think currently the best way to force an arbitrary checkpoint to be used is creating a "custom" backup label. Not that nice. Not sure if we need something nice here, I don't really see a frequent need for this. We could add another option to pg_resetxlog alternatively :/ I guess you'd have to scan through WAL files by hand to find the next oldest checkpoint? I'm guessing that if this is happening in the field there's a decent chance people aren't noticing it, so maybe the best thing for now is to turn off the automatic behavior but still have a relatively easy way to re-enable it. In case this is more common than we think... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On 2/4/16 3:13 AM, Catalin Iacob wrote: Thanks for the overview. Very helpful. I find existing behaviour for 2, 3 and 4 unlike other Python APIs I've seen, surprising and not very useful. If I want to log a tuple I can construct and pass a single tuple, I don't see why plpy.info() needs to construct it for me. And for the documented, single argument call nothing changes. Agreed, that usage is wonky. The question to Bruce (and others) is: is it ok to change to the new behaviour illustrated and change meaning for usages like 2, 3 and 4? If any users have a bunch of code that depends on the old behavior, they're going to be rather irritated if we break it. If we want to deprecate it then I think we need a GUC that allows you to get the old behavior back. If we don't want that, the solution Pavel and I see is introducing a parallel API named plpy.raise_info or plpy.rich_info or something like that with the new behaviour and leave the existing functions unchanged. Another option is some compatibility GUC but I don't think it's worth the trouble and confusion. If we're going to provide an alternative API, I'd just do plpy.raise(LEVEL, ...). At this point, my vote would be: Add a plpython.ereport_mode GUC that has 3 settings: current (deprecated) behavior, allow ONLY 1 argument, new behavior. The reason for the 1 argument option is it makes it much easier to find code that's still using the old behavior. I think it's also worth having plpy.raise(LEVEL, ...) as an alternative. If folks feel that's overkill then I'd vote to leave the existing behavior alone and just add plpy.raise(LEVEL, ...). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
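To make the plpy.raise(LEVEL, ...) idea concrete, here is a rough sketch of the dispatch it implies. This is a hypothetical API, not anything PL/Python currently provides; the Elog class is a stand-in because the real plpy module only exists inside a PL/Python function:

```python
# Hypothetical sketch of a level-first plpy.raise(LEVEL, ...) API.
# 'Elog' is a stand-in object modelling the proposed dispatch; in real
# PL/Python the plpy module is injected by the interpreter.
LEVELS = {"DEBUG", "LOG", "INFO", "NOTICE", "WARNING", "ERROR"}

class Elog:
    def __init__(self):
        self.messages = []   # captured (level, text, detail, hint) tuples

    def raise_(self, level, *args, detail=None, hint=None):
        if level not in LEVELS:
            raise ValueError(f"unknown elevel {level!r}")
        # one argument stays a plain string; several arguments are joined,
        # not silently wrapped in a tuple like the legacy plpy.info(*args)
        text = ", ".join(str(a) for a in args)
        self.messages.append((level, text, detail, hint))

plpy = Elog()
plpy.raise_("INFO", "row count", 42, hint="check the filter")
```

The point of the sketch is just that a single level-first entry point sidesteps the "what do multiple arguments mean" question for each of the per-level functions.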
Re: [HACKERS] Way to check whether a particular block is on the shared_buffer?
On 2/4/16 12:30 AM, Kouhei Kaigai wrote:

>> 2. A feature to suspend i/o write-out towards particular blocks
>> that are registered by other concurrent backends, until they are
>> unregistered (usually, at the end of P2P DMA).
>> ==> to be discussed.

I think there's still a race condition here though:

A: finds buffer not in shared buffers
B: reads buffer in, modifies it, starts writing buffer to OS
A: makes call to block writes, but a write is already in progress; thinks writes are now blocked; reads corrupted block

Much hilarity ensues. Or maybe you were just glossing over that part for brevity.

...

> I tried to design a draft of enhancement to realize the above i/o write-out
> suspend/resume, with less invasive way as possible as we can.
>
> ASSUMPTION: I intend to implement this feature as a part of extension,
> because this i/o suspend/resume checks are pure overhead increment
> for the core features, unless extension which utilizes it.
>
> Three functions shall be added:
>
>   extern int    GetStorageMgrNumbers(void);
>   extern f_smgr GetStorageMgrHandlers(int smgr_which);
>   extern void   SetStorageMgrHandlers(int smgr_which, f_smgr smgr_handlers);
>
> As literal, GetStorageMgrNumbers() returns the number of storage managers
> currently installed. It always returns 1 right now.
> GetStorageMgrHandlers() returns the currently configured f_smgr table for
> the supplied smgr_which. It allows extensions to know the current
> configuration of the storage manager, even if another extension has
> already modified it.
> SetStorageMgrHandlers() assigns the supplied 'smgr_handlers' instead of
> the current one.
> If an extension wants to intermediate 'smgr_write', it will replace
> 'smgr_write' with its own function, then call the original function,
> likely mdwrite, from the alternative function.
>
> In this case, the call chain shall be:
>
>   FlushBuffer, and others...
>     +-- smgrwrite(...)
>           +-- (extension's own function)
>                 +-- mdwrite

ISTR someone (Robert Haas?) 
complaining that this method of hooks is cumbersome to use and can be fragile if multiple hooks are being installed. So maybe we don't want to extend its usage... I'm also not sure whether this is better done with an smgr hook or a hook into shared buffer handling... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
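The fragility concern is generic to this save-and-replace hook pattern: every extension must capture the previous handler and chain through it, and removal order matters. A toy model in Python (not the smgr API itself, just the pattern the Get/SetStorageMgrHandlers proposal follows):

```python
# Toy model of the GetStorageMgrHandlers/SetStorageMgrHandlers pattern:
# each installer saves the previous handler and chains through it.  If
# one extension forgets to chain, or wrappers are removed out of order,
# the whole stack silently breaks -- that is the fragility in question.
handlers = {"smgr_write": lambda buf: f"mdwrite({buf})"}   # default md layer

def install_wrapper(name, make_wrapper):
    prev = handlers[name]                  # analogue of GetStorageMgrHandlers
    handlers[name] = make_wrapper(prev)    # analogue of SetStorageMgrHandlers

install_wrapper("smgr_write",
                lambda prev: (lambda buf: "suspend-check " + prev(buf)))
install_wrapper("smgr_write",
                lambda prev: (lambda buf: "audit " + prev(buf)))

print(handlers["smgr_write"]("blk42"))     # last installed runs first
```

Nothing enforces that a wrapper calls `prev`, and there is no way to remove the middle wrapper without re-stitching the chain by hand, which is why a purpose-built hook point in shared-buffer handling might be the safer design.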
Re: [HACKERS] Batch update of indexes
On 2/4/16 1:37 AM, konstantin knizhnik wrote: >My suspicion is that it would be useful to pre-order the new data before trying to apply it to the indexes. Sorry, but ALTER INDEX is expected to work for all indexes, not only B-Tree, and for them sorting may not be possible... But for B-Tree presorting inserted data should certainly increase performance. I will think about it. I wasn't talking about ALTER INDEX. My theory is that if you're doing a large DML operation it might be more efficient to update an index as a single bulk operation, instead of doing it for each tuple. If you want to do that, then you need an efficient method for finding everything that a DML statement changed. That's the exact same thing we need to support statement-level triggers being able to reference NEW and OLD. It's probably also what we need to support incremental update matviews. If we had such a capability then we could add options to the AM infrastructure to allow indexes to support doing bulk maintenance as well as per-tuple maintenance (or even support only bulk maintenance...) I don't think any of that has anything to do with ALTER INDEX. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
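To illustrate the pre-ordering point with a toy model: if the new keys are applied to an ordered index in sorted order, page accesses become a single left-to-right pass instead of random jumps. A rough sketch in Python (made-up keys and page granularity, nothing to do with actual B-tree internals):

```python
# Toy model of the pre-ordering idea: applying a batch of new keys to an
# ordered index in sorted order turns page access into one left-to-right
# pass.  Keys and page size are invented purely for illustration.
from bisect import bisect_left

existing = list(range(0, 1000, 10))   # pretend these are the index's keys
PAGE = 32                             # pretend keys-per-page

def pages_visited(keys):
    """Count page *switches* while locating each key's insert position."""
    seen, last = 0, None
    for k in keys:
        page = bisect_left(existing, k) // PAGE
        if page != last:
            seen += 1
            last = page
    return seen

batch = [517, 12, 903, 44, 771, 13, 902]
assert pages_visited(sorted(batch)) <= pages_visited(batch)   # 3 vs 7 here
```

With a real index the win would come from touching each leaf page (and its WAL) once per batch instead of once per tuple, which is exactly what a bulk-maintenance AM entry point could exploit.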