Re: [HACKERS] [WIP] collation support revisited (phase 1)
On Mon, Jul 21, 2008 at 03:15:56AM +0200, Radek Strnad wrote: > I was trying to sort out the problem with not creating new catalog for > character sets and I came up following ideas. Correct me if my ideas are > wrong. > > Since collation has to have a defined character set. Not really. AIUI at least glibc and ICU define a collation over all possible characters (i.e. Unicode). When you create a locale you take a subset and use that. Think about it: if you want to sort strings and one of them happens to contain a Chinese character, it can't *fail*. Note strcoll() has no error return for unknown characters. > I'm suggesting to use > already written infrastructure of encodings and to use list of encodings in > chklocale.c. Currently databases are not created with specified character > set but with specified encoding. I think instead of pointing a record in > collation catalog to another record in character set catalog we might use > only the name (string) of the encoding. That's reasonable. From an abstract point of view collations and encodings are orthogonal; it's only when you're using POSIX locales that there are limitations on how you combine them. I think you can assume a collation can handle any characters that can be produced by the encoding. Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
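Martijn's point about strcoll() is easy to demonstrate: the C collation API returns an ordering for any input and has no error path for unknown characters. A minimal sketch in Python, whose locale.strcoll wraps the platform's collation routine, assuming only the always-available C locale:

```python
import locale

# strcoll() has no error return for "unknown" characters: even in the
# minimal C locale, comparing a string that contains a Chinese character
# yields an ordering, never a failure.
locale.setlocale(locale.LC_COLLATE, "C")

r = locale.strcoll("abc", "ab\u4e2d")  # second string ends in a Chinese char
print(type(r) is int)  # we always get an integer ordering back, never an error
```

The same property holds for the C-level strcoll() the backend calls: the only information it can return is a comparison result.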
Re: [HACKERS] Concurrent VACUUM and ANALYZE
Jonah H. Harris wrote: On Mon, Jul 21, 2008 at 10:19 PM, Tom Lane <[EMAIL PROTECTED]> wrote: I don't find this a compelling argument, at least not without proof that the various vacuum-improvement projects already on the radar screen (DSM-driven vacuum, etc) aren't going to fix your problem. Is DSM going to be in 8.4? The last I had heard, DSM+related improvements weren't close to being guaranteed for this release. If it doesn't make it, waiting another year and a half for something easily fixed would be fairly unacceptable. Should I provide a patch in the event that DSM doesn't make it? Can't hurt to submit a patch. Also, could you do something to help mitigate the worst case, such as not updating the stats in pg_class if a vacuum has finished since the current analyze started? Matt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema-qualified statements in pg_dump output
At 8:34 AM +0100 7/11/08, Simon Riggs wrote: On Mon, 2008-07-07 at 15:46 +0200, Bernd Helmle wrote: There's a behavior in pg_dump that annoyed me a little bit, the last few times i had to deal with it: Consider you have to dump a specific namespace only, you are going to use pg_dump -n [-t ]. I found it a common use case to restore this dump into a different schema by simply changing the search_path. With included ownerships this doesn't work, since pg_dump always outputs the necessary DDL as follows: ALTER TABLE bernd.foo OWNER TO bernd; Okay, it isn't too hard to use sed to replace the necessary statements to use the correct schema, but i think it would be much nicer if pg_dump would omit the schema-qualified table name here. I'd like to create a patch for this, if we agree on changing this behavior? The use case you mention is something that would be of value to many people, and I support your efforts to add a new option for this. No useful workarounds exist without flaws: i) editing with sed might well end up editing character data in the table(s) at the same time and you may never even notice. ii) reloading to the same schema (renaming etc) is not acceptable if the target has a production schema of that name already. iii) manually editing a large file is problematic. Tom's posted comments that you need to look at all of the places the schemaname is used to see what we will need/not need to change. It's more than just altering the owner, but that doesn't mean we don't want it or it's impossible. Please pursue this further. I've been looking into this matter, although I'm a noob apropos PostgreSQL hacking. What I thought was a better way was to alter pg_dump to accept a flag -m . It would require the -n option or fail. It would generate a schema dump where all the references to were replaced by . This would allow you to easily make a copy of a schema into a new schema.
My needs are that my production database is the "public" schema, and each year I want to archive "fy2007", "fy2008", etc. schemas which have the final information for those years. So at the end of this year, I want to duplicate the "public" schema into the "fy2008" schema, and continue with "public." I could do the pg_dump of "public", rename "public" to "fy2008" and then restore "public," but this requires being without "public" for a short interval. It would be better for me to simply: pg_dump -c -n public -m fy2008 | psql database And that would give you a completely mechanical way to duplicate a schema, which means I could put it in a script that users could call. From what I've seen, it would mean finding where the schema is currently accessed in the code, then substituting on the -m flag. Having already done this with manually editing the files, it really cries out for a better procedure. Perhaps my solution is excessive compared to the other offered solution, but it would have the benefit that the user would know precisely what he was doing by the flag setting. -Owen
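The sed workaround discussed above, and the flaw Simon lists as (i), can be sketched concretely. The schema names here are just the ones from the example, and a real dump may qualify names in more places than this single pattern catches:

```python
import re

# What the sed-style rewrite does to a pg_dump line (the shell
# equivalent would be roughly: sed 's/public\./fy2008./g').
def rewrite(line: str) -> str:
    return re.sub(r"\bpublic\.", "fy2008.", line)

# The intended effect on DDL:
print(rewrite("ALTER TABLE public.foo OWNER TO bernd;"))
# -> ALTER TABLE fy2008.foo OWNER TO bernd;

# Simon's flaw (i): the rewrite also silently mangles table *data*
# that happens to contain the same string, which is why a real
# pg_dump option beats post-processing the dump with sed.
print(rewrite("INSERT INTO notes VALUES ('see public.foo');"))
# -> INSERT INTO notes VALUES ('see fy2008.foo');
```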
Re: [HACKERS] Concurrent VACUUM and ANALYZE
On Mon, Jul 21, 2008 at 10:19 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Jonah H. Harris" <[EMAIL PROTECTED]> writes: >> The case I'm looking at is a large table which requires a lazy vacuum, >> and a zero vacuum cost delay would cause too much I/O. Yet, this >> table has enough insert/delete activity during a vacuum, that it >> requires a fairly frequent analysis to maintain proper plans. I >> patched as mentioned above and didn't run across any unexpected >> issues; the only one expected was that mentioned by Alvaro. > > I don't find this a compelling argument, at least not without proof that > the various vacuum-improvement projects already on the radar screen > (DSM-driven vacuum, etc) aren't going to fix your problem. Is DSM going to be in 8.4? The last I had heard, DSM+related improvements weren't close to being guaranteed for this release. If it doesn't make it, waiting another year and a half for something easily fixed would be fairly unacceptable. Should I provide a patch in the event that DSM doesn't make it? -Jonah
Re: [HACKERS] Concurrent VACUUM and ANALYZE
"Jonah H. Harris" <[EMAIL PROTECTED]> writes: > The case I'm looking at is a large table which requires a lazy vacuum, > and a zero vacuum cost delay would cause too much I/O. Yet, this > table has enough insert/delete activity during a vacuum, that it > requires a fairly frequent analysis to maintain proper plans. I > patched as mentioned above and didn't run across any unexpected > issues; the only one expected was that mentioned by Alvaro. I don't find this a compelling argument, at least not without proof that the various vacuum-improvement projects already on the radar screen (DSM-driven vacuum, etc) aren't going to fix your problem. regards, tom lane
Re: [HACKERS] Concurrent VACUUM and ANALYZE
On Mon, Jul 21, 2008 at 7:59 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Jonah H. Harris" <[EMAIL PROTECTED]> writes: >> Because we wouldn't want multiple ANALYZEs running on the same table, >> changing the lock back to an AccessShareLock doesn't sound like a >> solution. > > It flat will not work. We used to do it that way, and it didn't > (search for "tuple concurrently updated" in the archives). Agreed. >> However, what are the thoughts around creating another, >> more-specific lock? Perhaps something like ShareUpdateAnalysisLock? > > The general overhead involved in a whole new lock type is high enough > that I would resist taking this path. (It's certainly a lot more than > adding an entry to one enum someplace --- offhand I can name docs and > grammar as important issues. And no you don't get to have a hidden lock > type that no one can see.) Any other suggestions? > Also, as Alvaro points out, it's far from clear that concurrent VACUUM > and ANALYZE is as safe as you think --- they both want to write the same > fields in pg_class. AFAICS, Alvaro didn't say that at all. At worst, if ANALYZE completed after VACUUM, its stats wouldn't be as good as those set by VACUUM. But, as I said in response to Alvaro, that's no different than running ANALYZE immediately following VACUUM. The case I'm looking at is a large table which requires a lazy vacuum, and a zero vacuum cost delay would cause too much I/O. Yet, this table has enough insert/delete activity during a vacuum, that it requires a fairly frequent analysis to maintain proper plans. I patched as mentioned above and didn't run across any unexpected issues; the only one expected was that mentioned by Alvaro. -Jonah
Re: [HACKERS] [patch] plproxy v2
"Marko Kreen" <[EMAIL PROTECTED]> writes: > On 7/21/08, Tom Lane <[EMAIL PROTECTED]> wrote: >> I looked through this a bit, and my principal reaction was "what are >> the security implications? > There are 2 aspects to it: > 1. Function can be created only by superuser. What I'm concerned about is who they can be *called* by. I'd be happier if the default behavior was that there was no public execute privilege for plproxy functions. I think right now that could be enforced by having plproxy's validator procedure replace any null proacl entry with something that explicitly refuses public execute. That's a bit of a hack though. Maybe it'd be worth inventing per-PL default ACLs, instead of having a one-size-fits-all policy? > 2. If cluster connection strings do not have 'user=' key, > ' user=' || current_username() is appended to it. Cool, I missed that. At minimum the documentation has to explain this point and emphasize the security implications. Is it a good idea to allow user= in the cluster strings at all? > Also, plroxy does > _nothing_ with passwords. That means the password for remote > connection must be in postgres user's .pgpass, That seems *exactly* backwards, because putting the password in postgres user's .pgpass is as good as disabling password auth altogether. Consider that it would also hand all the keys to the kingdom over to someone who had access to dblink on the same machine (not even the same cluster, so long as it was run by the same postgres user!). > But I don't think plproxy can and should protect dumb admins who > create remote_exec(sql) function and allow untrusted users to > execute it. We regularly get beat up about any aspect of our security apparatus that isn't "secure by default". This definitely isn't, and from a PR point of view (if nothing else) that doesn't seem a good idea. I repeat that I don't feel comfortable in the least with plproxy's security model. 
regards, tom lane
Re: [HACKERS] Concurrent VACUUM and ANALYZE
On Mon, Jul 21, 2008 at 6:15 PM, Alvaro Herrera <[EMAIL PROTECTED]> wrote: > This means that VACUUM ANALYZE must grab both locks ... is there a > gotcha here? Agreed. > The main problem I see with this idea is that the dead and total tuple > count computed by ANALYZE would be immediately out of date, and if it > happens to finish after VACUUM then it'll overwrite the values the > latter just wrote, which are more correct. Not sure how serious a > problem this is. Agreed, but in the worst case, it's no different than running ANALYZE immediately following a VACUUM. -Jonah
Re: [HACKERS] Concurrent VACUUM and ANALYZE
"Jonah H. Harris" <[EMAIL PROTECTED]> writes: > Because we wouldn't want multiple ANALYZEs running on the same table, > changing the lock back to an AccessShareLock doesn't sound like a > solution. It flat will not work. We used to do it that way, and it didn't (search for "tuple concurrently updated" in the archives). > However, what are the thoughts around creating another, > more-specific lock? Perhaps something like ShareUpdateAnalysisLock? The general overhead involved in a whole new lock type is high enough that I would resist taking this path. (It's certainly a lot more than adding an entry to one enum someplace --- offhand I can name docs and grammar as important issues. And no you don't get to have a hidden lock type that no one can see.) Also, as Alvaro points out, it's far from clear that concurrent VACUUM and ANALYZE is as safe as you think --- they both want to write the same fields in pg_class. regards, tom lane
Re: [HACKERS] Load spikes on 8.1.11
Andrew Sullivan <[EMAIL PROTECTED]> writes: > On Tue, Jul 22, 2008 at 02:41:55AM +0530, Gurjeet Singh wrote: >> I am aware of the heavy locking involved with Slony, which should mean that >> it blocks the application connections; that'd be completely acceptable, >> given all the warnings in the Slony docs. But what I am concerned about and >> trying to hunt down is why backend processes are all consuming all >> of the CPU (!!!) so much so that I am unable to fire up any new process! > Ah, well, then, yes, the spinlock improvements probably will help > you. But you should disabuse yourself of the idea that > processes have no cost. You still have to talk to all those > connections when doing schema changes. Yeah. In fact this is sounding more and more like the known problem with sinval message response causing a "thundering herd" effect: the processes all sit idle until the sinval message queue gets long enough to rouse alarm bells, and then they all get signaled at once and all try to clean the message queue at once, leading to very heavy contention for the SInvalLock. That code's been rewritten in CVS HEAD to try to alleviate the problem, but no existing release has the fix. See thread here for prior report: http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php regards, tom lane
Re: [HACKERS] Concurrent VACUUM and ANALYZE
Jonah H. Harris wrote: > Currently, one cannot perform a concurrent VACUUM and ANALYZE. This > is a significant problem for tables which are not only large and have > designated cost-delays, but which are also heavily inserted into and > deleted from. After performing a quick cursory investigation on this, > it doesn't appear to be difficult to change. Other than the > ShareUpdateExclusiveLock, is there anything technically preventing us > from performing both concurrently? This means that VACUUM ANALYZE must grab both locks ... is there a gotcha here? The main problem I see with this idea is that the dead and total tuple count computed by ANALYZE would be immediately out of date, and if it happens to finish after VACUUM then it'll overwrite the values the latter just wrote, which are more correct. Not sure how serious a problem this is. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Load spikes on 8.1.11
On Tue, Jul 22, 2008 at 02:41:55AM +0530, Gurjeet Singh wrote: > I am aware of the heavy locking involved with Slony, which should mean that > it blocks the application connections; that'd be completely acceptable, > given all the warnings in the Slony docs. But what I am concerned about and > trying to hunt down is why backend processes are all consuming all > of the CPU (!!!) so much so that I am unable to fire up any new process! Ah, well, then, yes, the spinlock improvements probably will help you. But you should disabuse yourself of the idea that processes have no cost. You still have to talk to all those connections when doing schema changes. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/
[HACKERS] Concurrent VACUUM and ANALYZE
Currently, one cannot perform a concurrent VACUUM and ANALYZE. This is a significant problem for tables which are not only large and have designated cost-delays, but which are also heavily inserted into and deleted from. After performing a quick cursory investigation on this, it doesn't appear to be difficult to change. Other than the ShareUpdateExclusiveLock, is there anything technically preventing us from performing both concurrently? Because we wouldn't want multiple ANALYZEs running on the same table, changing the lock back to an AccessShareLock doesn't sound like a solution. However, what are the thoughts around creating another, more-specific lock? Perhaps something like ShareUpdateAnalysisLock? Any other suggestions? -Jonah
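For readers following the lock discussion in this thread: VACUUM and ANALYZE each take ShareUpdateExclusiveLock on the table, and that mode conflicts with itself, which is exactly why the two cannot overlap today. A toy sketch of the relevant slice of the conflict rules (abridged; the real table lives in the backend's lock manager, and only the modes discussed here are modeled):

```python
# Abridged, symmetric slice of PostgreSQL's table-level lock conflicts.
CONFLICTS = {
    "AccessShareLock":          {"AccessExclusiveLock"},
    "ShareUpdateExclusiveLock": {"ShareUpdateExclusiveLock",
                                 "AccessExclusiveLock"},
    "AccessExclusiveLock":      {"AccessShareLock",
                                 "ShareUpdateExclusiveLock",
                                 "AccessExclusiveLock"},
}

def can_run_concurrently(a: str, b: str) -> bool:
    # The table above is kept symmetric, so one direction suffices.
    return b not in CONFLICTS[a]

# VACUUM and ANALYZE both take ShareUpdateExclusiveLock, so one blocks
# the other -- the limitation this thread is about:
print(can_run_concurrently("ShareUpdateExclusiveLock",
                           "ShareUpdateExclusiveLock"))  # -> False

# An ordinary SELECT (AccessShareLock) is unaffected by either:
print(can_run_concurrently("AccessShareLock",
                           "ShareUpdateExclusiveLock"))  # -> True
```

This also shows why the self-conflict is useful: it is what prevents two ANALYZEs (or two VACUUMs) from running on the same table at once.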
Re: [HACKERS] typedefs for indent
Bruce Momjian wrote: Andrew Dunstan wrote: Alvaro Herrera wrote: Andrew Dunstan wrote: OK, I have spent some time generating and filtering typedefs via objdump on various platforms. I filtered them and Bruce's list to eliminate items not actually found in the sources thus: Did this go anywhere? I'm still trying to get a working objdump for OSX. Automating this is difficult because we need to make sure we get all (or pretty close to all) the typedefs we can get on each platform for various build configurations. At this point I would like to get a typedef list into CVS, even if it is not perfect --- it is better than what we have now. Well, you can start with this one: http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=dungbeetle&dt=2008-07-21%20204856&stg=typedefs After I have a number of buildfarm machines producing them, I'll work on a stored proc to consolidate them and make them available, probably via a SOAP call (c.f. http://people.planetpostgresql.org/andrew/index.php?/archives/14-SOAP-server-for-Buildfarm-dashboard.html ) cheers andrew
Re: [HACKERS] Load spikes on 8.1.11
On Tue, Jul 22, 2008 at 1:29 AM, Andrew Sullivan <[EMAIL PROTECTED]> wrote: > On Sat, Jul 19, 2008 at 07:09:46AM +0530, Gurjeet Singh wrote: > > > Will try this option, at least in the next schema upgrade or when setting > up > > Slony. > > As I've already suggested, however, if you try to set up slony on a > loaded database, you're going to see all manner of problems. Slony > takes some heavy-duty locks when it does its setup work. It's > designed that you should have an application outage for this sort of > work. Please see previous discussion on the Slony mailing list. Well, a very low activity period of the application (after 11 PM EST) is chosen as the maintenance window. The application is not down, but has just the connections open, and almost all of them sitting . I am aware of the heavy locking involved with Slony, which should mean that it blocks the application connections; that'd be completely acceptable, given all the warnings in the Slony docs. But what I am concerned about and trying to hunt down is why backend processes are all consuming all of the CPU (!!!) so much so that I am unable to fire up any new process! Another possible cause we are looking at now is the role Xeon hyperthreading can play here. Will keep you all updated. Thanks and best regards, -- [EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com EnterpriseDB http://www.enterprisedb.com Mail sent from my BlackLaptop device
Re: [HACKERS] Load spikes on 8.1.11
On Fri, Jul 18, 2008 at 02:23:43AM -0400, Andrew Sullivan wrote: > On Fri, Jul 18, 2008 at 10:41:36AM +0530, Gurjeet Singh wrote: > > > > Just started INIT cluster Slonik command and that spiked too.. for more than > > 10 minutes now!! > > Are you attempting to do Slony changes (such as install Slony) on an > active database? I strongly encourage you to read the Slony manual. > Slony, frankly, sucks for this use case. The manual says as much, > although in more orotund phrases than that. FWIW, I've had few problems getting londiste up and running on a heavily loaded database. You might need to be a bit careful about when you add very large tables due to the copy overhead, but other than that I haven't had issues. -- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On 21-Jul-08, at 4:28 PM, Andrew Sullivan wrote: On Mon, Jul 21, 2008 at 01:17:39PM -0700, David E. Wheeler wrote: pgFoundry ain't the CPAN, alas. Maybe that's the problem that really needs solving? One of the big Postgres features is its extensibility. I agree that the extensions can sometimes be hard to find, but surely the answer to that is not an infinitely large source tarball? A I'd have to agree with Andrew here. Making it easy to get extensions would solve lots of problems. Dave
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Jul 21, 2008, at 13:28, Andrew Sullivan wrote: Maybe that's the problem that really needs solving? One of the big Postgres features is its extensibility. I agree that the extensions can sometimes be hard to find, but surely the answer to that is not an infinitely large source tarball? Oh, of course. But one thing at a time. I'm in complete agreement that what goes into core should be pretty conservative, and that the module problem should be addressed. But even given that, I think that there is a strong case to be made that citext should be in contrib. Best, David
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Jul 21, 2008, at 13:19, Andrew Dunstan wrote: I was going to write some stuff about citext anyway. Quite apart from the above considerations I'm still a bit concerned about its performance characteristics. And I'm not sure we really want all the baggage that David is proposing to bring along with it. Is it an advance to force the regex_replace "i" flag for such a type? I can imagine cases where I might want it to sort insensitively, but be able to do case sensitive regex ops on it. It's not as if the user can't supply the flag. So right now I don't think citext should be included, because there are still issues to sort out, if for no other reason. I'm happy to work with folks to get them figured out, but at the end, there may be some differing opinions. If there's a reference implementation that could be checked (how does a case-insensitive collation work in another database?), that would be fine. You can use the "c" flag to get case-sensitive comparison with the regex functions, though not with the operators. (Maybe this should be moved to a separate thread?) Best, David
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Mon, Jul 21, 2008 at 01:17:39PM -0700, David E. Wheeler wrote: > pgFoundry ain't the CPAN, alas. Maybe that's the problem that really needs solving? One of the big Postgres features is its extensibility. I agree that the extensions can sometimes be hard to find, but surely the answer to that is not an infinitely large source tarball? A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/
Re: [HACKERS] [patch] plproxy v2
On 7/21/08, Tom Lane <[EMAIL PROTECTED]> wrote: > "Marko Kreen" <[EMAIL PROTECTED]> writes: > > [ plproxy ] > > I looked through this a bit, and my principal reaction was "what are > the security implications?" It seems like it'd be very easy to create > functions that allow untrusted users to execute arbitrary SQL on > other databases in the plproxy cluster. As far as I saw there was > no privilege-checking within plproxy itself, you were just relying > on SQL-level permissions checking --- so even though plproxy functions > can only be *created* by superusers, by default they can be *used* by > anybody. So it'd take a great deal of care to avoid making > unintentional security holes. > > I'm not sure about a good solution to this problem, but I think it needs > to be solved before plproxy can be recommended for widespread use. > The first thought that comes to mind is to somehow propagate the userid > on the calling server to the execution on the remote, so that a user > can't get more privilege on the remote than if he were executing there > directly. I'm imagining that the definition of a cluster would include > a map from local to remote userids (and thereby imply that anyone not > explicitly listed can't do remote access at all). There are 2 aspects to it: 1. Function can be created only by superuser. 2. If cluster connection strings do not have 'user=' key, ' user=' || current_username() is appended to it. Note that connections are per-backend, not shared. Also, plproxy does _nothing_ with passwords. That means the password for remote connection must be in postgres user's .pgpass, or there is a pooler between plproxy and the remote database who handles passwords. What else do you see is needed? I'm not sure a map is a good idea; it seems to create unnecessary complexity. Of course, it can be done. But I don't think plproxy can or should protect dumb admins who create a remote_exec(sql) function and allow untrusted users to execute it.
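The behavior Marko describes, appending the calling user when the connection string has no user= key, can be sketched like this (a hypothetical Python helper; plproxy itself implements the rule in C):

```python
# Sketch of the rule: if the cluster connection string has no "user="
# key, append " user=<current_username>". Hypothetical helper, not
# plproxy's actual code.
def effective_connstr(connstr: str, current_username: str) -> str:
    keys = [part.split("=", 1)[0] for part in connstr.split()]
    if "user" not in keys:
        return connstr + " user=" + current_username
    return connstr

print(effective_connstr("host=node0 dbname=part0", "alice"))
# -> host=node0 dbname=part0 user=alice

# An explicit user= key wins -- which is exactly the hole Tom asks
# about when he questions allowing user= in cluster strings at all:
print(effective_connstr("host=node0 user=admin", "alice"))
# -> host=node0 user=admin
```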
-- marko
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Tom Lane wrote: The current commitfest queue has two entries that propose to migrate existing pgfoundry projects (or improved versions thereof) into our core distribution. The more I think about this the less happy I am with it. From a maintenance point of view there seems little need for either project to get integrated: they don't appear to have much of any code that is tightly tied to backend innards. From a features point of view, yeah they're cool, but there are scads of cool things out there. From a project-management point of view, it's insanity to set a presumption that pgfoundry is just a proving ground for code that should eventually get into core once it's mature enough or popular enough or whatever. I think there is a case for saying that sufficiently popular modules belong in core. That's not necessarily a large number, but there might well be a case for citext at least to be among the number at some stage. Surely a case-insensitive text type has more general use than, say, the seg module. We *have to* encourage the development of a cloud of subprojects around the core, or core will eventually collapse of its own weight. We have not got the manpower to deal with an ever-inflating collection of allegedly "core" code. If anything, we ought to be working to push more stuff out of the core distro so that we can focus on the functionality that has to be there. Once we get the much-discussed module infrastructure, that will make more sense. We will still need to keep enough modules to make sure that the infrastructure is working. In general I feel that the number of modules we have in core is about right. Maybe a small number should be pushed out. So my feeling is that we should not accept either of these patches. Now, there is some value in submitting the code for review --- certainly citext is a whole lot better than it was a few weeks ago.
I think it would be a good idea to be open to reviewing pgfoundry code with the same standards we'd use if we were going to integrate it. Perhaps commitfest is not the right venue for that, though, if only because of the possibility of confusion over what's supposed to happen. Comments? If we don't have enough resources to maintain them, do we have enough to review them? I was going to write some stuff about citext anyway. Quite apart from the above considerations I'm still a bit concerned about its performance characteristics. And I'm not sure we really want all the baggage that David is proposing to bring along with it. Is it an advance to force the regex_replace "i" flag for such a type? I can imagine cases where I might want it to sort insensitively, but be able to do case-sensitive regex ops on it. It's not as if the user can't supply the flag. So right now I don't think citext should be included, because there are still issues to sort out, if for no other reason. cheers andrew
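The flag question Andrew raises can be seen with any regex engine: a case-insensitive type amounts to forcing the equivalent of PostgreSQL's 'i' regex flag, while a caller who wants case-sensitive matching has to ask for it explicitly ('c' in PostgreSQL). A Python analogy of the two behaviors, not citext's actual code:

```python
import re

text = "Foo foo FOO"

# What a case-insensitive type does by default: force the "i" flag.
forced_i = re.sub(r"foo", "bar", text, flags=re.IGNORECASE)
print(forced_i)    # -> bar bar bar

# What a caller gets by explicitly requesting case-sensitive matching
# (the analogue of passing the "c" flag to PostgreSQL's regex functions):
explicit_c = re.sub(r"foo", "bar", text)
print(explicit_c)  # -> Foo bar FOO
```

Andrew's concern is the inverse direction: with forced insensitivity, sorting insensitively while matching sensitively requires the caller to opt out every time.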
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Josh Berkus <[EMAIL PROTECTED]> writes: > So I would argue "maybe" on pl/proxy, but that citext does belong in core. Well, at least citext is pretty tiny ... regards, tom lane
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Jul 21, 2008, at 12:53, Josh Berkus wrote: In the specific cases of pl/proxy and citext, they are very much in line with what we already package with the core code, including things like dblink, ISN, and CIDR. citext in particular would eliminate a long-time newbie complaint about Postgres, but not if it's in an add-in package which the user can't find binaries for. So I would argue "maybe" on pl/proxy, but that citext does belong in core. This is my view, as well. If it was in contrib, it'd go a long way toward addressing a commonly-requested feature, whereas things are much more difficult to find on pgFoundry. pgFoundry ain't the CPAN, alas. Even if users do find it in pgFoundry, the fact that it isn't in core is more likely to be seen as a red flag at this point. One might ask, why isn't it in core? What's wrong with it? Why is something that seems so useful relegated to pgFoundry? What's the usual quality of code on pgFoundry? Thanks for your consideration! Best, David
[HACKERS] Re: Do we really want to migrate plproxy and citext into PG core distribution?
On Mon, 2008-07-21 at 17:03 -0300, Tom Lane wrote: > [. . .] I think it > would be a good idea to be open to reviewing pgfoundry code with the > same standards we'd use if we were going to integrate it. Perhaps > commitfest is not the right venue for that, though, if only because > of the possibility of confusion over what's supposed to happen. I think this would be a great idea. I would be overjoyed to have veil http://pgfoundry.org/projects/veil/ reviewed by postgres developers. __ Marc
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Jul 21, 2008, at 12:43, Tom Lane wrote: From a maintenance point of view there seems little need for either project to get integrated: they don't appear to have much of any code that is tightly tied to backend innards. Well, citext against CVS HEAD is quite different from the other version I maintain for 8.3. The latter copies the str_tolower() code out of formatting.c from CVS and adds a number of includes in order to get things to work the same as against HEAD. I could probably work around this, though, if there was a macro with the version number in it. Now, there is some value in submitting the code for review --- certainly citext is a whole lot better than it was a few weeks ago. Absolutely. I really appreciate the feedback and comments I've received. Thank you! I think it would be a good idea to be open to reviewing pgfoundry code with the same standards we'd use if we were going to integrate it. Perhaps commitfest is not the right venue for that, though, if only because of the possibility of confusion over what's supposed to happen. Comments? I think that this is a very good idea. But you might have trouble motivating people to review code that won't be in core unless it's managed very diligently. An official extended library distribution, as Josh suggests, would probably help with this, as it then becomes a project alongside PostgreSQL that bundles a lot of great add-ons, rather than just leaving all the add-ons to themselves on pgFoundry. Best, David
Re: [HACKERS] [patch] plproxy v2
"Marko Kreen" <[EMAIL PROTECTED]> writes: > [ plproxy ] I looked through this a bit, and my principal reaction was "what are the security implications?" It seems like it'd be very easy to create functions that allow untrusted users to execute arbitrary SQL on other databases in the plproxy cluster. As far as I saw there was no privilege-checking within plproxy itself, you were just relying on SQL-level permissions checking --- so even though plproxy functions can only be *created* by superusers, by default they can be *used* by anybody. So it'd take a great deal of care to avoid making unintentional security holes. I'm not sure about a good solution to this problem, but I think it needs to be solved before plproxy can be recommended for widespread use. The first thought that comes to mind is to somehow propagate the userid on the calling server to the execution on the remote, so that a user can't get more privilege on the remote than if he were executing there directly. I'm imagining that the definition of a cluster would include a map from local to remote userids (and thereby imply that anyone not explicitly listed can't do remote access at all). regards, tom lane
Re: [HACKERS] Review: DTrace probes (merged version)
Apologies for the delayed response - vacation, travel, etc. got in the way! Zdenek Kotala wrote: I performed a review of the merged patch from Robert Treat. First, the patch does not work (SunOS 5.11 snv_86 sun4u sparc SUNW,Sun-Fire-V240). The attached patch fixed the regression test errors. However, I went through the code and have the following comments: 1) Naming convention: - Some probes use "*end", some "*done". It would be good to settle on one name. Noted. Will use -done per the convention. This change will be included in an updated patch later since I think there are a number of other changes that need to be made. - I prefer to use clog instead of slru in probe names; clog is widely known. I think slru- is okay per your subsequent emails. - It seems better to me to have checkpoint-clog..., checkpoint-subtrans instead of clog-checkpoint. Yes, I was thinking about this too, but the current names are more consistent with the others. For example: buffer-checkpoint, buffer-*; xlog-checkpoint, xlog-*. - buffer-flush was originally dirty-buffer-write-start. I prefer Robert Lor's naming. Actually, I made this change so the name is consistent with the other buffer-* probes. 2) Storage read/write probes: the smgr-read*, smgr-write* probes are in md.c. I personally think it makes more sense to put them into smgr.c. The only advantage of having them in md.c is that the actual number of bytes can be monitored. The current probes return the number of bytes; that's why they are in md.c. 3) Query rewrite probe: there are several probes for query measurement but the query rewrite phase is missing. See analyze_and_rewrite or pg_parse_and_rewrite. I believe the rewrite time is accounted for in the query-plan probe. Need to double-check on this. 4) autovacuum_start: the autovacuum_start probe is alone.
I propose the following probes for completeness: proc-autovacuum-start, proc-autovacuum-stop, proc-bgwriter-start, proc-bgwriter-stop, proc-backend-start, proc-backend-stop, proc-master-start, proc-master-stop. Saw a number of emails on this. Will get back on this. 5) Needs explanation of usage: I have some doubts about the following probes. Could you please explain their usage? An example dtrace script is welcome. - all exec-* probes - mark-dirty, local-mark-dirty Theo/Robert, do you guys have any sample scripts for the probes you added? 6) Several comments about placement: I published the patch on http://reviewdemo.postgresql.org/r/25/. I added several comments there. 7) SLRU/CLOG: SLRU probes could return more info, for example whether the page was in the buffer or whether a physical write was unnecessary, and so on. Yes, more info could be returned if we can identify specific use cases that the new data will enable. -- Robert Lor Sun Microsystems Austin, USA http://sun.com/postgresql

Index: src/backend/access/transam/clog.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/clog.c,v
retrieving revision 1.46
diff -u -3 -p -r1.46 clog.c
--- src/backend/access/transam/clog.c	1 Jan 2008 19:45:46 -	1.46
+++ src/backend/access/transam/clog.c	21 Jul 2008 18:14:48 -
@@ -36,6 +36,7 @@
 #include "access/slru.h"
 #include "access/transam.h"
 #include "postmaster/bgwriter.h"
+#include "pg_trace.h"

 /*
  * Defines for CLOG page sizes.  A page is the same BLCKSZ as is used
@@ -323,7 +324,9 @@ void
 CheckPointCLOG(void)
 {
 	/* Flush dirty CLOG pages to disk */
+	TRACE_POSTGRESQL_CLOG_CHECKPOINT_START();
 	SimpleLruFlush(ClogCtl, true);
+	TRACE_POSTGRESQL_CLOG_CHECKPOINT_DONE();
 }

Index: src/backend/access/transam/multixact.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/multixact.c,v
retrieving revision 1.27
diff -u -3 -p -r1.27 multixact.c
--- src/backend/access/transam/multixact.c	1 Jan 2008 19:45:46 -	1.27
+++ src/backend/access/transam/multixact.c	21 Jul 2008 18:21:58 -
@@ -57,6 +57,7 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 #include "storage/procarray.h"
+#include "pg_trace.h"

@@ -1526,6 +1527,8 @@ MultiXactGetCheckptMulti(bool is_shutdow
 void
 CheckPointMultiXact(void)
 {
+	TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_START();
+
 	/* Flush dirty MultiXact pages to disk */
 	SimpleLruFlush(MultiXactOffsetCtl, true);
 	SimpleLruFlush(MultiXactMemberCtl, true);
@@ -1540,6 +1543,8 @@ CheckPointMultiXact(void)
 	 */
 	if (!InRecovery)
 		TruncateMultiXact();
+
+	TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE();
 }

 /*
Index: src/backend/access/transam/slru.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/slru.c,v
retrieving revision 1.44
diff -u -3 -p -r1.44 slru.c
--- src/backend/access/transam/slru.c	1 Jan 2008 19:45:48 -	1.44
+
Re: [HACKERS] Load spikes on 8.1.11
On Sat, Jul 19, 2008 at 07:09:46AM +0530, Gurjeet Singh wrote: > Will try this option, at least in the next schema upgrade or when setting up > Slony. As I've already suggested, however, if you try to set up slony on a loaded database, you're going to see all manner of problems. Slony takes some heavy-duty locks when it does its setup work. It's designed that you should have an application outage for this sort of work. Please see previous discussion on the Slony mailing list. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Tom, Comments? Well, in the *general* case, I think if we're going to have "first class" pgfoundry projects, then having a unified "official" Kitchen Sink Package with all of these add-ins becomes an imperative priority for 8.4. EDB's recent open sourcing of their installer might help with this. Further, we would need to come up with some organized way to subject pgFoundry projects to the same level of general scrutiny which core code gets. Or at least close. In the specific cases of pl/proxy and citext, they are very much in line with what we already package with the core code, including things like dblink, ISN, and CIDR. citext in particular would eliminate a long-time newbie complaint about Postgres, but not if it's in an add-in package which the user can't find binaries for. So I would argue "maybe" on pl/proxy, but that citext does belong in core. --Josh Berkus
[HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
The current commitfest queue has two entries that propose to migrate existing pgfoundry projects (or improved versions thereof) into our core distribution. The more I think about this the less happy I am with it. From a maintenance point of view there seems little need for either project to get integrated: they don't appear to have much of any code that is tightly tied to backend innards. From a features point of view, yeah they're cool, but there are scads of cool things out there. From a project-management point of view, it's insanity to set a presumption that pgfoundry is just a proving ground for code that should eventually get into core once it's mature enough or popular enough or whatever. We *have to* encourage the development of a cloud of subprojects around the core, or core will eventually collapse of its own weight. We have not got the manpower to deal with an ever-inflating collection of allegedly "core" code. If anything, we ought to be working to push more stuff out of the core distro so that we can focus on the functionality that has to be there. So my feeling is that we should not accept either of these patches. Now, there is some value in submitting the code for review --- certainly citext is a whole lot better than it was a few weeks ago. I think it would be a good idea to be open to reviewing pgfoundry code with the same standards we'd use if we were going to integrate it. Perhaps commitfest is not the right venue for that, though, if only because of the possibility of confusion over what's supposed to happen. Comments? regards, tom lane
Re: [HACKERS] PATCH: CITEXT 2.0 v4
On Jul 18, 2008, at 09:53, David E. Wheeler wrote: However, if someone with a lot more C and Pg core knowledge wanted to sit down with me for a couple hours next week and help me bang out these functions, that would be great. I'd love to have the implementation be that much more complete. I've implemented fixes for the regexp_* functions and strpos() in pure SQL, like so:

CREATE OR REPLACE FUNCTION regexp_matches( citext, citext ) RETURNS TEXT[] AS '
    SELECT regexp_matches( $1::text, $2::text, ''i'' );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_matches( citext, citext, text ) RETURNS TEXT[] AS '
    SELECT regexp_matches( $1::text, $2::text,
        CASE WHEN strpos($3, ''c'') = 0 THEN $3 || ''i'' ELSE $3 END );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_replace( citext, citext, text ) RETURNS TEXT AS '
    SELECT regexp_replace( $1::text, $2::text, $3, ''i'');
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_replace( citext, citext, text, text ) RETURNS TEXT AS '
    SELECT regexp_replace( $1::text, $2::text, $3,
        CASE WHEN strpos($4, ''c'') = 0 THEN $4 || ''i'' ELSE $4 END);
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_split_to_array( citext, citext ) RETURNS TEXT[] AS '
    SELECT regexp_split_to_array( $1::text, $2::text, ''i'' );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_split_to_array( citext, citext, text ) RETURNS TEXT[] AS '
    SELECT regexp_split_to_array( $1::text, $2::text,
        CASE WHEN strpos($3, ''c'') = 0 THEN $3 || ''i'' ELSE $3 END );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_split_to_table( citext, citext ) RETURNS SETOF TEXT AS '
    SELECT regexp_split_to_table( $1::text, $2::text, ''i'' );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_split_to_table( citext, citext, text ) RETURNS SETOF TEXT AS '
    SELECT regexp_split_to_table( $1::text, $2::text,
        CASE WHEN strpos($3, ''c'') = 0 THEN $3 || ''i'' ELSE $3 END );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION strpos( citext, citext ) RETURNS INT AS '
    SELECT strpos( LOWER( $1::text ), LOWER( $2::text ) );
' LANGUAGE SQL IMMUTABLE STRICT;

Not so bad, though it'd be nice to have C functions that just did these things. Still not case-insensitive are:
-- replace()
-- split_part()
-- translate()
So, anyone at OSCON this week want to help me with these? Or to convert the above functions to C? Greg? Bruce? Thanks, David
Re: [HACKERS] Default of max_stack_depth and getrlimit
On Monday 21 July 2008, Heikki Linnakangas wrote: > Cédric Villemain wrote: > > On Monday 21 July 2008, Heikki Linnakangas wrote: > >> I think we should differentiate between "infinite" and "unknown" in the > >> return value of get_stack_depth_limit(), and use max_stack_depth of 2MB > >> in case of infinite, and fall back to the 100kB only in the unknown > >> case. > > > > Why 2MB ? I believed that 3.5MB is the effective good maximum , is that > > too much ? > > 2MB is the value we set max_stack_depth to, unless getrlimit() says that > the actual stack limit is lower than that. > > I believe the 2MB figure is just an arbitrary value, thought to be quite > safe, but also high enough that most people won't need to raise it. > Before we started to use getrlimit(), we used to just default > max_stack_depth=2MB. Ok, thank you for the explanation. > > -- >Heikki Linnakangas >EnterpriseDB http://www.enterprisedb.com -- Cédric Villemain Administrateur de Base de Données Cel: +33 (0)6 74 15 56 53 http://dalibo.com - http://dalibo.org
Re: [HACKERS] Default of max_stack_depth and getrlimit
Cédric Villemain wrote: Le Monday 21 July 2008, Heikki Linnakangas a écrit : I think we should differentiate between "infinite" and "unknown" in the return value of get_stack_depth_limit(), and use max_stack_depth of 2MB in case of infinite, and fall back to the 100kB only in the unknown case. Why 2MB ? I believed that 3.5MB is the effective good maximum , is that too much ? 2MB is the value we set max_stack_depth to, unless getrlimit() says that the actual stack limit is lower than that. I believe the 2MB figure is just an arbitrary value, thought to be quite safe, but also high enough that most people won't need to raise it. Before we started to use getrlimit(), we used to just default max_stack_depth=2MB. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Default of max_stack_depth and getrlimit
On Monday 21 July 2008, Heikki Linnakangas wrote: > In 8.2, we started to use getrlimit(RLIMIT_STACK, ...) to set > max_stack_depth automatically, to a max of 2MB: > > http://archives.postgresql.org/pgsql-hackers/2006-10/msg00389.php > > However, it's not behaving as I expected when the stack limit is set to > "unlimited". I would expect max_stack_depth to be set to the max of 2MB, > but instead it gets set to 100kB. > > I don't normally run without a limit, but it looks like the regression > tests run like that with "make check", at least on my platform. I bumped > into this while running a custom regression test with very deep nesting. > > I think we should differentiate between "infinite" and "unknown" in the > return value of get_stack_depth_limit(), and use max_stack_depth of 2MB > in case of infinite, and fall back to the 100kB only in the unknown case. Why 2MB? I believed that 3.5MB is the effective good maximum; is that too much? > > -- >Heikki Linnakangas >EnterpriseDB http://www.enterprisedb.com -- Cédric Villemain Administrateur de Base de Données Cel: +33 (0)6 74 15 56 53 http://dalibo.com - http://dalibo.org
Re: [HACKERS] Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the
Markus Wanner <[EMAIL PROTECTED]> writes: > However, what about at least adding a comment, so fellow hackers have a > chance of understanding the subtle difference there? Sure, done. regards, tom lane
Re: [HACKERS] Default of max_stack_depth and getrlimit
Tom Lane wrote: "Heikki Linnakangas" <[EMAIL PROTECTED]> writes: In 8.2, we started to use getrlimit(RLIMIT_STACK, ...) to set max_stack_depth automatically, to a max of 2MB: http://archives.postgresql.org/pgsql-hackers/2006-10/msg00389.php However, it's not behaving as I expected when the stack limit is set to "unlimited". Is there really any such thing as "unlimited" stack depth? No, but I would think it's safe to assume that "unlimited" is greater than 2MB. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] overlaps performance
Tom Lane writes: The reason we don't automatically translate OVERLAPS is that the spec's definition of OVERLAPS is too weird for that to work; in particular it demands a true result for some cases in which one of the four endpoints is NULL, which'd be pretty hard to do with an interval-style index. Shame; I'm just working on something that would benefit from an index that could be used in OVERLAPS. I don't know the Postgres internals, except for how GiST works, hence my question. Thanks for the answer.
Re: [HACKERS] Default of max_stack_depth and getrlimit
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > In 8.2, we started to use getrlimit(RLIMIT_STACK, ...) to set > max_stack_depth automatically, to a max of 2MB: > http://archives.postgresql.org/pgsql-hackers/2006-10/msg00389.php > However, it's not behaving as I expected when the stack limit is set to > "unlimited". Is there really any such thing as "unlimited" stack depth? I think that treating that as "unknown" is a good conservative thing to do, because it is surely a lie. regards, tom lane
Re: [HACKERS] Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the
Hi, Tom Lane wrote: This seems like a bad idea, because it makes the code gratuitously different from the names used for this purpose everywhere else. I find that a pretty dubious reason for having 'query_string' and 'queryString' in the same function. In fact, having it in the same code base seems strange. It makes me wish we had (better!) naming conventions... Something I've stumbled over often enough during my work with Postgres - What was it again: 'query_string' (87 times), 'queryString' (77 times) or 'querystr' (42 times)? However, what about at least adding a comment, so fellow hackers have a chance of understanding the subtle difference there? Regards Markus
Re: [HACKERS] overlaps performance
Gregory Stark <[EMAIL PROTECTED]> writes: > Grzegorz Jaśkiewicz <[EMAIL PROTECTED]> writes: >> So my question would be, why isn't postgresql using indexes for OVERLAPS, >> and why optimizer doesn't substitute it with something like: >> >> (c <= a AND d > a) OR ( c >= a AND c < b) > How would you use an index for that? I believe you can index overlaps-like tests using GIST on an interval-like data type --- look at contrib/seg for an example. The reason we don't automatically translate OVERLAPS is that the spec's definition of OVERLAPS is too weird for that to work; in particular it demands a true result for some cases in which one of the four endpoints is NULL, which'd be pretty hard to do with an interval-style index. regards, tom lane
Re: [HACKERS] WITH RECURSIVE patches 0721
2008/7/21 Tatsuo Ishii <[EMAIL PROTECTED]>: >> Hi, >> >> Tatsuo Ishii wrote: >> > CVS HEAD won't compile for me >> >> Did you try 'make clean' and rebuild? (Or even distclean). > > Actually what I did was do cvs update on 2007/07/17 CVS HEAD (I kept > freshly gotten CVS work files as a tar ball). > >> There were >> some changes to pg_proc. At least, that problem has biten me this >> morning, but after that, I could compile HEAD just fine. > > I got following: > > functioncmds.c:232: error: `FUNC_PARAM_TABLE' undeclared (first use in this > function) > functioncmds.c:232: error: (Each undeclared identifier is reported only once > functioncmds.c:232: error: for each function it appears in.) It is a new symbol from the table function patch. regards Pavel Stehule > -- > Tatsuo Ishii > SRA OSS, Inc. Japan
Re: [HACKERS] Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the
Markus Wanner <[EMAIL PROTECTED]> writes: > This commit added a variable 'query_string' to the function > ExecuteQuery() in src/backend/commands/prepare.c, but that function > already takes an argument named 'queryString'. What's the difference? > Which is which? Do we need both? The query_string variable is the original PREPARE's query_string copied into the portal's context, which we do to ensure that it lives as long as the portal does. There's no guarantee that the CachedPlanSource will survive that long (there could be a DEALLOCATE while the query executes). The one passed in is the query string for the EXECUTE statement. I think it's just used for error reporting in EvaluateParams. > I propose renaming the variable (as in the attached patch) or at least > explaining it better in additional comments. This seems like a bad idea, because it makes the code gratuitously different from the names used for this purpose everywhere else. regards, tom lane
Re: [HACKERS] overlaps performance
Gregory Stark writes: Grzegorz Jaśkiewicz <[EMAIL PROTECTED]> writes: So my question would be, why isn't postgresql using indexes for OVERLAPS, and why optimizer doesn't substitute it with something like: (c <= a AND d > a) OR ( c >= a AND c < b) How would you use an index for that? Check Depesz's article for that; I included it at the bottom of my email.
Re: [HACKERS] WITH RECURSIVE patches 0721
> Hi, > > Tatsuo Ishii wrote: > > CVS HEAD won't compile for me > > Did you try 'make clean' and rebuild? (Or even distclean). Actually what I did was do a cvs update on 2007/07/17 CVS HEAD (I kept the freshly gotten CVS work files as a tar ball). > There were > some changes to pg_proc. At least, that problem has bitten me this > morning, but after that, I could compile HEAD just fine. I got the following:

functioncmds.c:232: error: `FUNC_PARAM_TABLE' undeclared (first use in this function)
functioncmds.c:232: error: (Each undeclared identifier is reported only once
functioncmds.c:232: error: for each function it appears in.)

-- Tatsuo Ishii SRA OSS, Inc. Japan
Re: [HACKERS] WITH RECURSIVE patches 0721
Hi, Tatsuo Ishii wrote: CVS HEAD won't compile for me Did you try 'make clean' and rebuild? (Or even distclean). There were some changes to pg_proc. At least, that problem has bitten me this morning, but after that, I could compile HEAD just fine. Regards Markus
[HACKERS] WITH RECURSIVE patches 0721
Hi, Here are the latest WITH RECURSIVE patches against 2007/07/17 CVS (CVS HEAD won't compile for me). This version includes regression tests and is almost ready for commit IMO. -- Tatsuo Ishii SRA OSS, Inc. Japan recursive_query.patch.gz Description: Binary data
Re: [HACKERS] overlaps performance
Grzegorz Jaśkiewicz <[EMAIL PROTECTED]> writes: > So my question would be, why isn't postgresql using indexes for OVERLAPS, > and why optimizer doesn't substitute it with something like: > > (c <= a AND d > a) OR ( c >= a AND c < b) How would you use an index for that? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
[HACKERS] overlaps performance
Hey guys, I'm asking here because I know there are a bunch of people here who know the topic very well. I need to use a few OVERLAPS tests on date/time values in my query, and I guess it is quite inefficient there. So my question would be: why isn't PostgreSQL using indexes for OVERLAPS, and why doesn't the optimizer substitute it with something like: (c <= a AND d > a) OR (c >= a AND c < b) instead of (a,b) OVERLAPS (c,d)? Any corner cases, or particular reasons? (source of example) http://www.depesz.com/index.php/2007/11/21/find-overlapping-time-ranges/ thanks.
[HACKERS] Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the
Hi, Tom Lane wrote: Adjust things so that the query_string of a cached plan and the sourceText of a portal are never NULL, but reliably provide the source text of the query. It turns out that there was only one place that was really taking a short-cut, which was the 'EXECUTE' utility statement. That doesn't seem like a sufficiently critical performance hotspot to justify not offering a guarantee of validity of the portal source text This commit added a variable 'query_string' to the function ExecuteQuery() in src/backend/commands/prepare.c, but that function already takes an argument named 'queryString'. What's the difference? Which is which? Do we need both? It looks like the second is the query string of the prepare statement, where the string passed as an argument contains the EXECUTE command. I propose renaming the variable (as in the attached patch) or at least explaining it better in additional comments. Sorry if this is nitpicking; I just happened to stumble over it and thought I'd better tell you. Regards Markus

*** src/backend/commands/prepare.c	1f53747076d3cb8d83832179c2e8a0ee3d8f2d37
--- src/backend/commands/prepare.c	0b2ffacdca58b5a073fe6a57b3aa2c7d61d317a4
*** ExecuteQuery(ExecuteStmt *stmt, const ch
*** 174,180
  ParamListInfo paramLI = NULL;
  EState *estate = NULL;
  Portal portal;
! char *query_string;

  /* Look it up in the hash table */
  entry = FetchPreparedStatement(stmt->name, true);
--- 174,180
  ParamListInfo paramLI = NULL;
  EState *estate = NULL;
  Portal portal;
! char *prepared_qs;

  /* Look it up in the hash table */
  entry = FetchPreparedStatement(stmt->name, true);
*** ExecuteQuery(ExecuteStmt *stmt, const ch
*** 205,212
  portal->visible = false;

  /* Copy the plan's saved query string into the portal's memory */
! query_string = MemoryContextStrdup(PortalGetHeapMemory(portal),
! 				entry->plansource->query_string);

  /*
   * For CREATE TABLE / AS EXECUTE, we must make a copy of the stored query
--- 205,212
  portal->visible = false;

  /* Copy the plan's saved query string into the portal's memory */
! prepared_qs = MemoryContextStrdup(PortalGetHeapMemory(portal),
! 				entry->plansource->query_string);

  /*
   * For CREATE TABLE / AS EXECUTE, we must make a copy of the stored query
*** ExecuteQuery(ExecuteStmt *stmt, const ch
*** 256,262
  PortalDefineQuery(portal,
  		NULL,
! 		query_string,
  		entry->plansource->commandTag,
  		plan_list,
  		cplan);
--- 256,262
  PortalDefineQuery(portal,
  		NULL,
! 		prepared_qs,
  		entry->plansource->commandTag,
  		plan_list,
  		cplan);
[HACKERS] Default of max_stack_depth and getrlimit
In 8.2, we started to use getrlimit(RLIMIT_STACK, ...) to set max_stack_depth automatically, to a max of 2MB: http://archives.postgresql.org/pgsql-hackers/2006-10/msg00389.php However, it's not behaving as I expected when the stack limit is set to "unlimited". I would expect max_stack_depth to be set to the max of 2MB, but instead it gets set to 100kB. I don't normally run without a limit, but it looks like the regression tests run like that with "make check", at least on my platform. I bumped into this while running a custom regression test with very deep nesting. I think we should differentiate between "infinite" and "unknown" in the return value of get_stack_depth_limit(), and use max_stack_depth of 2MB in case of infinite, and fall back to the 100kB only in the unknown case. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Postgres-R: primary key patches
Hi, Alvaro Herrera wrote: Markus Wanner wrote: (Although, I'm still less than thrilled about the internal storage format of these tuple collections. That can certainly be improved and simplified.) Care to expand more on what it is? Well, what I really dislike is the overhead in code to first transform tuples into a string based internal change set representation, which then gets serialized again. That looks like two conversion steps, which are both prone to error. I'm about to merge those into a simpler tuple serializer, which shares code with the initializer (or recovery provider/subscriber) part. This is where I'd like to know what requirements Jan or others have. I will try to outline the current implementation and requirements of Postgres-R in a new thread soon. On Replicator we're using the binary send/recv routines to transmit tuples. (Obviously this fails when the master and slave have differing binary output, but currently we just punt on this point). Yeah, that's another point. I'm currently using the textual input/output functions, but would like to switch to the binary one as well. However, that's an optimization, where the above is simplification of code, which is more important to me at the moment. Regards Markus