Re: [PATCHES] Sun Studio on Linux spinlock patch
Julius Stroffek [EMAIL PROTECTED] writes:
> I have made PostgreSQL compile on Linux using Sun Studio, with spinlock
> support. The patch is attached. Here is the explanation of the changes I made:

This patch seems broken in a number of ways. Why are you removing
-DLINUX_PROFILE, for example? Are you sure you don't need -D_GNU_SOURCE?
And why add -DSUNOS4_CC, which is a Solaris-specific define (not that we
seem to be using it anywhere anymore)? Do we really have to have a
configure-time probe to detect this particular compiler?

But I guess the *real* question is why anyone would care ... what benefit
is there to using Sun's compiler on Linux?

			regards, tom lane

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-patches
Re: [PATCHES] NetBSD/MIPS supports dlopen
Alvaro Herrera [EMAIL PROTECTED] writes:
> Rémi Zara wrote:
>> Recent versions of NetBSD/MIPS support dlopen. This patch makes the
>> NetBSD dynloader test availability of dlopen on HAVE_DLOPEN rather
>> than on __mips__.
> Applied, thanks.

Weird, I haven't seen the commit message come through here. Anyway:
(1) this should probably be back-patched; (2) please clean up the ugly
double negative here:

! #if !defined(HAVE_DLOPEN)
  #else
  	dlclose(handle);
  #endif

			regards, tom lane
Re: [PATCHES] NetBSD/MIPS supports dlopen
Alvaro Herrera [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> Weird, I haven't seen the commit message come through here.
> Yeah, that's strange -- the (2) commit message got to me, but this one
> hasn't. Moderation filter got it for some reason?

None of the back-patch commits came through either, so there's something
going on there...

			regards, tom lane
Re: [PATCHES] WIP: guc enums
Heikki Linnakangas [EMAIL PROTECTED] writes:
> Magnus Hagander wrote:
>> On my platform (linux x86) it works fine when I just cast this to
>> (int *), but I'm unsure if that's going to be safe on other platforms.
>> I had some indication that it's probably not?
> No, I don't think that's safe. Some googling (*) suggests that the
> compiler is free to choose any integer type for an enum.

Yeah, it's absolutely not safe. What I'd suggest is declaring the actual
variable as int. You can still use an enum typedef to declare the values,
and just avert your eyes when you have to cast the enum to int or vice
versa. (This is legal per C spec, so you won't introduce any portability
issues when you do it.)

			regards, tom lane
Re: [PATCHES] libpq.so linking problem on Solaris using --with-gssapi
Bjorn Munch [EMAIL PROTECTED] writes:
> The fix is simply to add -lgss to the list, as the attached patch does.
> I'm pretty sure no other Makefiles are affected.

Applied, thanks.

			regards, tom lane
Re: [PATCHES] [BUGS] BUG #3973: pg_dump using inherited tables do not always restore
Alex Hunsaker [EMAIL PROTECTED] writes:
> On Wed, Feb 20, 2008 at 3:55 PM, Tom Lane [EMAIL PROTECTED] wrote:
>> Actually the bug is that ALTER TABLE allows you to do that. It should
>> not be possible to drop an inherited constraint, but right now there's
>> not enough information in the system catalogs to detect the situation.
>> Fixing this has been on the TODO list for awhile:
> Hrm how about something like the attached patch?

It seems much more restrictive than necessary, plus it does nothing for
the check-constraint case. My recollection of the previous discussion
about how to fix this was that we needed to add an inhcount column to
pg_constraint, and add entries for not-null constraints (at least
inherited ones) to pg_constraint so that they'd be able to have inhcount
fields. The latter would also allow us to attach names to not-null
constraints, which I think is required by spec but we've never supported.

			regards, tom lane
Re: [PATCHES] DTrace probe patch for OS X Leopard
Robert Lor [EMAIL PROTECTED] writes:
> Peter Eisentraut wrote:
>> I understand that these probe names follow some global naming scheme.
>> Is it hard to change that? I'd feel more comfortable with, say,
>> (D)TRACE_POSTGRESQL_LWLOCK_ACQUIRE().
> Because the macro is auto generated and follows certain naming
> conventions, prepending TRACE_ will not work. If you did that, the
> probe name would be called postgresql-lwlock-acquire and the provider
> would be trace, which is not what we want.

Ugh. Is this tool really so badly designed that it thinks it has
ownership of the *entire* namespace within the target program?

			regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] remove TCL_ARRAYS
Alvaro Herrera [EMAIL PROTECTED] writes:
> This patch removes the TCL_ARRAYS symbol.

If you're going to remove it you should actually remove it (eg from
pg_config_manual.h).

> This seems to be a leftover from when pgtcl was around in the backend;

Yeah, it was supporting some kluge or other in the libpgtcl client.
AFAICT from the 7.x code, the client-side kluge was never enabled by
default anyway, so it seems unlikely anyone will miss it.

			regards, tom lane
Re: [PATCHES] DTrace probe patch for OS X Leopard
Alvaro Herrera [EMAIL PROTECTED] writes:
> Robert Lor wrote:
>> probes.h is auto generated and it can certainly be massaged to include
>> the above logic, but I'd like to avoid doing that if possible.
> Hmm, so let's have a third file that's not autogenerated, which is the
> file we will use for #includes, and contains just that block.

Or just two files. Call probes_null something else, have it be included
where needed, have it include the autogenerated file when appropriate.

			regards, tom lane
Re: [PATCHES] DTrace probe patch for OS X Leopard
Robert Lor [EMAIL PROTECTED] writes:
> I haven't heard any major disadvantages about keeping it in c.h, but if
> you are still adamant about keeping it out of c.h, I'll go along with
> that.

I was thinking that pg_trace.h involved some backend-only code, but I'm
not sure why I thought that :-(. Yeah, your plan to do it by
restructuring the contents of pg_trace.h sounds fine.

We still have what I consider a big problem with the names of the
macros. Perhaps that could be fixed by passing the auto-generated file
through a sed script to put a prefix on the macro names before we start
to use it?

			regards, tom lane
Re: [PATCHES] DTrace probe patch for OS X Leopard
Robert Lor [EMAIL PROTECTED] writes:
> And you don't think adding a simple comment before the macro call is
> sufficient? This can be documented so everyone knows the convention.

It's stupid. The need for a comment is proof that the macro is badly
named. I don't intend to hold still for letting poorly-designed tools
dictate our coding conventions.

			regards, tom lane
Re: [PATCHES] DTrace probe patch for OS X Leopard
Robert Lor [EMAIL PROTECTED] writes:
> Peter Eisentraut wrote:
>> Is c.h the right place to include this? The probes are only in the
>> backend code, so perhaps postgres.h would be more appropriate. Or just
>> include it in the .c files that need it, of which there are only 3.
> I think putting probes.h in c.h is the right place. It's true that the
> probes are only in the backend now and only in a few files, but in the
> future I can foresee probes added to more files in the backend, the
> procedural language code or any of the commands (initdb, psql, etc).

I agree with Peter. There are a whole lot of include files that are
needed by way more than 3 .c files, and yet are not folded into
postgres.h. c.h is right out.

			regards, tom lane
Re: [PATCHES] tuplestore_putvalues()
Neil Conway [EMAIL PROTECTED] writes:
> Attached is a patch that allows an array of Datums + nulls to be
> inserted into a tuplestore without first creating a HeapTuple, per
> recent suggestion on -hackers. This avoids making an unnecessary copy.

A small thought here: we were jousting recently over a point that came
down to whether or not tuplestore kept track of the tupdesc for the
tuples it was storing. I can hardly imagine a use-case for a tuplestore
in which the tuples don't all have the same tupdesc. I think I dropped
tupdesc from tuplestore's original API on the grounds that it wasn't
doing anything much with the tupdesc. But now this patch adds back a
tuplestore API call that needs the tupdesc. Would it be saner to supply
the tupdesc to tuplestore_begin_heap instead, as tuplesort does?

I haven't looked at all into what the implications of this would be,
either from a performance or number-of-places-to-change standpoint. But
it seems worth a bit of investigation while we're touching the code.

Other than that issue, the patch seems OK in a quick once-over.

			regards, tom lane
Re: [PATCHES] SRF memory leaks
Neil Conway [EMAIL PROTECTED] writes:
> On Tue, 2008-02-26 at 00:17 -0800, Neil Conway wrote:
> You didn't comment on my proposed solution (FreeTupleDesc() iff
> refcount == -1).

I still find that entirely unsafe, particularly for something you
propose to back-patch into stable branches. Negative refcount does not
prove that the SRF itself hasn't still got a pointer to the tupdesc.

Can't we fix it so that the tupdesc is allocated in the new special
context (at least in the primary code paths), and then no explicit free
is needed?

			regards, tom lane
Re: [PATCHES] SRF memory leaks
Neil Conway [EMAIL PROTECTED] writes:
> On Wed, 2008-02-27 at 15:07 -0500, Tom Lane wrote:
>> Negative refcount does not prove that the SRF itself hasn't still got
>> a pointer to the tupdesc.
> That sounds quite bizarre. The SRF has already finished execution at
> this point, so keeping a pointer to the tupledesc around would only
> make sense if you wanted to use that tupledesc on a *subsequent*
> invocation of the SRF.

Hmm ... actually I was worried about it being embedded in the returned
tuplestore, but I see tuplestore doesn't currently use a tupdesc at all,
so maybe it isn't that big a problem.

			regards, tom lane
Re: [PATCHES] SRF memory leaks
Neil Conway [EMAIL PROTECTED] writes:
> On Mon, 2008-02-25 at 21:00 -0500, Tom Lane wrote:
>> I find this part of the patch to be a seriously bad idea.
> AFAICS this is not true of any of the SRFs in the backend, which always
> return expendable tupdescs.

"It's OK in the built-in SRFs" is disastrously different from "It's OK".
It was never specified that SRFs had to return a free-able tupdesc, so I
think it's a lead pipe cinch that there are some out there that don't.
Nor would it be their fault if we change the specification.

			regards, tom lane
Re: [PATCHES] lc_time and localized dates
Euler Taveira de Oliveira [EMAIL PROTECTED] writes:
> FYI, strftime() [1] doesn't say anything about capitalization. I don't
> know if some translators are aware of it and even if they are, how
> would they know that a day or month is the first word of a sentence?
> IMHO we should provide X, x, and X so the programmer can use whatever
> he/she thinks is correct for him/her.

I think you've missed the point, which is that the programmer can't
possibly know in advance which capitalization is correct, if she's
trying to build a system that works for different localizations.

But as Peter remarks nearby, this discussion is wasted anyway, because
there is only one correct answer: whatever Oracle does with these cases
is what to_char() should do.

			regards, tom lane
Re: [PATCHES] Bulk Insert tuning
Simon Riggs [EMAIL PROTECTED] writes:
> Following patch implements a simple mechanism to keep a buffer pinned
> while we are bulk loading.

This will fail to clean up nicely after a subtransaction abort, no?
(For that matter I don't think it's right even for a top-level abort.)
And I'm pretty sure it will trash your table entirely if someone inserts
into another relation while a bulk insert is happening. (Not at all
impossible, think of triggers for instance.)

From a code structural point of view, we are already well past the
number of distinct options that heap_insert ought to have. I was
thinking the other day that bulk inserts ought to use a ring-buffer
strategy to avoid having COPY IN trash the whole buffer arena, just as
we've taught COPY OUT not to. So maybe a better idea is to generalize
BufferAccessStrategy to be able to handle write as well as read
concerns; or have two versions of it, one for writing and one for
reading. In any case the point being to encapsulate all these random
little options in a struct, which could also carry along state that
needs to be saved across a series of inserts, such as the last pinned
buffer.

			regards, tom lane
Re: [PATCHES] Fix pgstatindex using for large indexes
Zdenek Kotala [EMAIL PROTECTED] writes:
> Tom Lane napsal(a):
>> Most places where we've dealt with this before, we use double, which
>> is guaranteed to be available whereas uint64 is not ...
> Is this requirement still valid?

Yes.

> Is there any currently supported platform which does not have uint64?

I don't know, and neither do you.

> IIRC we are going to change datetime to integer for 8.4.

We are going to change the *default* to integer. It will still be
possible to select float datetimes for use on platforms without working
int64. There is not any core functionality that will fail completely
without int64, and none will be introduced anytime soon if I have
anything to say about it.

Now having said that, there are places that use int64 with the
understanding that it might degrade to int32, and that that will be good
enough --- the statistics counters are an example. However, the only
point of the patch being presented for pgstatindex is to be able to
count higher than 2^32, so ISTM we may as well make it use double. There
isn't any particular reason it *shouldn't* be coded that way, AFAICS,
and there is precedent in that VACUUM/ANALYZE use doubles for similar
purposes.

			regards, tom lane
Re: [PATCHES] Fix pgstatindex using for large indexes
Andrew Dunstan [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>>> Is there any currently supported platform which does not have uint64?
>> I don't know, and neither do you.
> Maybe we should look at some reasonable way of getting the info out of
> a compiled instance. How about if we get pg_config to output the value
> of INT64_IS_BUSTED?

We know all the buildfarm machines have working int64, because they'd
fail the bigint regression test if not. That's not the point here.

			regards, tom lane
Re: [PATCHES] Avahi support for Postgresql
Mathias Hasselmann [EMAIL PROTECTED] writes:
> Oh, that difference is really interesting. I didn't even see it.
> DNS-SD uses the convention _protoname._type to describe services, and
> that's the convention Avahi follows. I don't know why the Bonjour API
> uses that trailing dot, but it seems wrong:

Is removing that dot going to create compatibility problems for us? If
it means that an 8.3 client wouldn't find an 8.4 server, for instance, I
don't think that's going to be acceptable. Do we need to listen on both
names (whatever the heck that means in zeroconf)?

			regards, tom lane
Re: [PATCHES] Fix pgstatindex using for large indexes
Florian G. Pflug [EMAIL PROTECTED] writes:
> Maybe we should just bite the bullet, and implement int64 emulation for
> platforms that don't provide one?

Why? Workarounds such as "use double where needed" have served us
perfectly fine so far, with far less effort and notational ugliness than
this would involve.

There will come a time where either there's a really good reason to rely
on int64, or we feel that it's moot because any platform without int64
is certainly dead anyway. I'm not sure how far off that time is, but
it's probably some fairly small number of years. My position is simply
that pgstattuple does not present a reason to make that decision today,
especially not when making it rely on int64 is at variance with the
coding method already in use in related parts of the core backend.

			regards, tom lane
Re: [PATCHES] SRF memory leaks
Neil Conway [EMAIL PROTECTED] writes:
>   	if (funcTupdesc)
> + 	{
>   		tupledesc_match(node->tupdesc, funcTupdesc);
> + 		FreeTupleDesc(funcTupdesc);
> + 	}
>   }

I find this part of the patch to be a seriously bad idea.
nodeFunctionscan has no right to assume that the function has returned
an expendable tupdesc; indeed, I would think that the other case is more
nearly what's expected by the API for SRFs. We certainly daren't
backpatch such a change.

A safer fix would be to try to make the tupdesc be in the new
multi_call_ctx when it's being created by the funcapi.c code.

			regards, tom lane
Re: [PATCHES] Shlib exports file refactoring
Peter Eisentraut [EMAIL PROTECTED] writes:
> After seeing four nearly-identical copies of multiplatform shared
> library exports file generation code, I figured this should be put in a
> common place. If you like, please test the attached patch on darwin and
> win32, since these platforms are mostly affected (besides linux) and I
> can't test them. Just see if it builds and runs correctly. Thanks.

[ shrug... ] Apply it to HEAD and see if the buildfarm complains. We're
in devel cycle now, so transient breakage isn't going to be a big
problem. If you were asking for tests that the buildfarm wouldn't make,
I might think differently ... but then you'd need to be giving some more
detailed directions than these.

			regards, tom lane
Re: [PATCHES] fix in --help output
Zdenek Kotala [EMAIL PROTECTED] writes:
> I attached patch which replaces any --... occurrence with -c... on
> command line.

Please see whether forcibly using src/port/getopt.c fixes this, instead.
A saner patch would probably add something like this to configure.in:

if test "$PORTNAME" = "solaris"; then
  AC_LIBOBJ(getopt)
  AC_LIBOBJ(getopt_long)
fi

			regards, tom lane
Re: [PATCHES] Fix pgstatindex using for large indexes
Tatsuhito Kasahara [EMAIL PROTECTED] writes:
> In pgstatindex.c and pgstattuple.sql, some variables are defined with
> int type. So when we try to get information about a large index by
> using pgstatindex, we get strange values of size and density, because
> the values exceed INT_MAX. ... I think that max_avail and free_space
> should be uint64.

Most places where we've dealt with this before, we use double, which is
guaranteed to be available whereas uint64 is not ...

			regards, tom lane
Re: [PATCHES] fix in --help output
Zdenek Kotala [EMAIL PROTECTED] writes:
> I attach fix for --help output. I replaced --NAME... with -NAME and
> added some examples. getopt parses -- as an end-of-options marker and
> the rest of the line is not parsed.

Surely this is outright wrong. Or if you do have a getopt that acts that
way, the bug is that we are using it rather than one that acts the way
we want.

			regards, tom lane
Re: [PATCHES] --describe-config crashes
Zdenek Kotala [EMAIL PROTECTED] writes:
> Function printMixedStruct calls printf with NULL argument. It causes
> segmentation fault. Please, apply it for 8.3 - 8.1 too.

Ugh, I guess we've only tested --describe-config on platforms where %s
treats a NULL pointer the same as an empty string. Since that's also
effectively the way GUC handles it, I'm inclined to think we should
print NULL as "" not "(NULL)". Otherwise, will apply.

			regards, tom lane
Re: [PATCHES] [BUGS] BUG #3969: pg_ctl cannot detect server startup
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
> I found this bug comes from the definition of WHITESPACE characters in
> pg_ctl.c. WHITESPACE is defined as follows:
> #define WHITESPACE "\f\n\r\t\v"
> In fact, WHITESPACE does not contain the space character (0x20) :-(

Ooops :-(

> I attach a patch to fix it.

Actually, this coding seems gratuitously ugly/inconsistent/fragile, so
I'm inclined to rewrite it to eliminate WHITESPACE altogether. It seems
bad style to switch between loops using isspace() and an entirely
different type of string searching that (as demonstrated by the bug)
isn't easy to keep in sync. Adding an additional loop of the same kind
to scan over the parameter value would take a couple more lines, but I
think it'll be a lot more readable.

In fact there are more bugs here: it won't deal correctly with a quoted
port value, and it'd be fooled by '-p' appearing in the argument value
for another option type. I'm not sure how tense we should be about
getting it to exactly match the backend's behavior for corner cases, but
at the very least it probably shouldn't be fooled by -p appearing inside
quotes.

> BTW, I also found similar definitions in some places. (Please grep with
> "\t\n\r".) They are a bit different from each other. For example,
> whitespace is defined as "\t\n\r" in tzparser.c. Is the inconsistency
> ok? Or, should we always use "\f\n\r\t\v "?

Not sure. One point is that vertical whitespace shouldn't necessarily be
treated the same as horizontal whitespace.

			regards, tom lane
Re: [PATCHES] Proposed patch to change TOAST compression strategy
Teodor Sigaev [EMAIL PROTECTED] writes:
>> I have not done any performance testing of these changes --- does
>> anyone have specific test scenarios they'd like to try?
> That's a very important change for text search, because of tsvector
> storage. Headline and rank read a lot of tsvectors. It seems to me that
> a ranking test will be very clear: the rank function reads the whole
> tsvector and returns a small amount of data (just a number).

I have no test data sets with large tsvector fields. If you do, could
you try it with this patch and see what the results are like?

			regards, tom lane
Re: [PATCHES] tzcode update
Magnus Hagander [EMAIL PROTECTED] writes:
> On Mon, Feb 18, 2008 at 04:32:58PM +0100, Bjorn Munch wrote:
>> Ouch! This fails on our Solaris builds, because we build with the
>> Solaris timezone files. And these apparently don't work beyond 2038
>> and don't know that Finland plans to have DST also in the summer of
>> 2050...
> I haven't studied this in depth but I'm afraid the answer is "fix your
> own timezone files"? Or use the ones shipping with pg ;-)

Yeah. I included the far-future cases in the committed patch
specifically to make it obvious if you were using tz files that lacked
post-2038 support. I can't imagine that fixing this isn't pretty darn
high on the Solaris to-do list, anyway. Financial apps doing, say,
30-year mortgage projections are broken *today* on platforms without
post-Y2038 calendar support.

			regards, tom lane
[PATCHES] Proposed patch to change TOAST compression strategy
This proposed patch addresses some issues in TOAST compression strategy
that were discussed last year, but we felt it was too late in the 8.3
cycle to change the code immediately. See these threads for background:
http://archives.postgresql.org/pgsql-hackers/2007-08/msg00082.php
http://archives.postgresql.org/pgsql-general/2007-08/msg01129.php

Specifically, the patch:

* Reduces the minimum datum size to be considered for compression from
256 to 32 bytes, as suggested by Greg Stark. (Greg later suggested
dropping the threshold clear to zero, but this is pointless since we
must save enough bytes to pay for the longer header.)

* Increases the required compression rate for compressed storage from
20% to 25%, again per Greg's suggestion.

* Replaces force_input_size (size above which compression is forced)
with a maximum size to be considered for compression. It was agreed that
allowing large inputs to escape the minimum-compression-rate requirement
was not bright, and that indeed we'd rather have a knob that acted in
the other direction. I set this value to 1MB for the moment, but it
could use some performance studies to tune it.

* Adds an early-failure path to the compressor as suggested by Jan: if
it's been unable to find even one compressible substring in the first
1KB (parameterizable), assume we're looking at incompressible input and
give up. (There are various ways this could be done, but this way seems
to add the least overhead to the compressor's inner loop.)

* Improves the toasting heuristics so that when we have very large
fields with attstorage 'x' or 'e', we will push those out to toast
storage before considering inline compression of shorter fields. This
also responds to a suggestion of Greg's, though my original proposal for
a solution was a bit off base because it didn't fix the problem for
large 'e' fields.

There was some discussion in the earlier threads of exposing some of the
compression knobs to users, perhaps even on a per-column basis. I have
not done anything about that here. It seems to me that if we are
changing around the parameters, we'd better get some experience and be
sure we are happy with the design before we set things in stone by
providing user-visible knobs.

I have not done any performance testing of these changes --- does anyone
have specific test scenarios they'd like to try?

			regards, tom lane

bin5xGxvAMBU7.bin
Description: toast-strategy.patch
Re: [PATCHES] tzcode update
Heikki Linnakangas [EMAIL PROTECTED] writes:
> Heikki Linnakangas wrote:
>> I'll add some tests to cover timestamps beyond 2038.
> Attached is a new patch, with a couple of new regression tests. No
> other changes.

Applied with minor revisions --- I found a couple of portability
problems while testing here. int64_fast_t is already defined in some
system headers and causes a conflict, so I just made it be int64
instead. Also there were undefined references to INT32_MIN and
INT32_MAX, which could possibly have been fixed with more #include's,
but I thought replacing them with a cast-based overflow check was at
least as good.

			regards, tom lane
Re: [PATCHES] Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)
Alvaro Herrera [EMAIL PROTECTED] writes:
> Ideally, it should be an error to set lc_messages to a value that's not
> compatible with the current encoding. Do we do that currently
> elsewhere?

We don't currently enforce that, and I'm not sure it's possible to do so
on non-Windows machines. AFAIR the POSIX API doesn't even associate a
character set with anything but LC_CTYPE.

Does it make any sense to wire in the assumption that locale names have
the form something.encoding-id? If we did that, we could enforce that
the encoding-id part matches LC_CTYPE, or maybe even just alter the
presented values for other LC_foo variables to match.

			regards, tom lane
Re: [PATCHES] tzcode update
Heikki Linnakangas [EMAIL PROTECTED] writes:
> Looking closer, I don't understand how that change was supposed to do
> anything.

The point of that patch is to avoid an off-by-one result for years BC.
The direction of rounding in integer division with a negative numerator
is undefined in C (or at least used to be --- did C99 tighten this up?).

			regards, tom lane
Re: [PATCHES] Endless recovery
Heikki Linnakangas [EMAIL PROTECTED] writes:
> Hans-Juergen Schoenig wrote:
>> that's where it finished, nothing else was logged between the "redo
>> done" and the last log messages
> I bet you've bumped into a bug in gist redo code, the cleanup phase
> shouldn't take long.

That's what it smells like to me too. Any chance that you still have the
WAL log for examination?

			regards, tom lane
Re: [PATCHES] Proposed patch for 8.3 VACUUM FULL crash
Gregory Stark [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] writes:
>> On investigation the problem occurs because we changed vacuum.c's
>> PageGetFreeSpaceWithFillFactor() to use PageGetHeapFreeSpace() instead
>> of just computing pd_upper - pd_lower as it had done in every previous
>> release. This was *not* a good idea: VACUUM FULL does its own
>> accounting for line pointers and does not need help.
> Fwiw this change appears to have crept in when the patch was merged.

Yeah, it's entirely likely that this was my fault :-(. It wasn't
apparent at the time that this change wasn't safe and conservative...

			regards, tom lane
Re: [PATCHES] tzcode update
Heikki Linnakangas [EMAIL PROTECTED] writes:
> I was not able to find anything like release notes that would list the
> differences between tzcode2003e, which I believe is the version that we
> included back then, and the latest version tzcode2007k. So I just took
> a diff between those, and deduced from there what has changed.

Oh good, this has been on my to-do list for awhile ... but I'm happy to
let you do it ;-)

> I don't really know how to test this. It passes the regression tests,
> after the fixes to pg_dst_next_boundary, but I was expecting there to
> be some failures now that we support timezones for timestamps outside
> the 32-bit time_t range. Apparently our tests don't cover that?

Unless the 64-bit extension changed the semantics a lot more than I
think, any given compiled tzdata file covers only a finite range of
years. The extension makes it *possible* for the data to extend outside
the time_t range, but you won't actually see a difference in behavior
unless (a) you do extend the range (what's zic's default now?) and (b)
you test a date falling within the extended range. So I'm not too
surprised that there are no cases in the regression tests that notice.
We should probably add some reaching out to 2100 or so.

			regards, tom lane
[PATCHES] Proposed patch for 8.3 VACUUM FULL crash
Tomas Szepe reported here http://archives.postgresql.org/pgsql-bugs/2008-02/msg00068.php about a bug in VACUUM FULL, which turned out to be that the code was expecting pages with no live tuples to always get added to the fraged_pages list, and this was sometimes not happening. On investigation the problem occurs because we changed vacuum.c's PageGetFreeSpaceWithFillFactor() to use PageGetHeapFreeSpace() instead of just computing pd_upper - pd_lower as it had done in every previous release. This was *not* a good idea: VACUUM FULL does its own accounting for line pointers and does not need help. Aside from exposing the aforementioned bug, this would result in pages sometimes being passed over as not being useful insertion targets, when in fact they were going to be completely empty after the reaping step. The attached proposed patch reverts that bad decision, and also adds an extra condition to force pages into the fraged_pages list when notup is true. Under normal circumstances this extra condition should be a useless test, but I am worried that with extremely small fillfactor settings it might still be possible to provoke the empty_end_pages problem. I am thinking that the extra notup condition should be back-patched, at least as far back as we have fillfactor support, and maybe all the way back on general principles. Comments? regards, tom lane Index: vacuum.c === RCS file: /cvsroot/pgsql/src/backend/commands/vacuum.c,v retrieving revision 1.363 diff -c -r1.363 vacuum.c *** vacuum.c 3 Jan 2008 21:23:15 - 1.363 --- vacuum.c 10 Feb 2008 21:07:25 - *** *** 1659,1670 free_space += vacpage->free; /* !* Add the page to fraged_pages if it has a useful amount of free !* space. "Useful" means enough for a minimal-sized tuple. But we !* don't know that accurately near the start of the relation, so add !* pages unconditionally if they have >= BLCKSZ/10 free space. */ ! 
do_frag = (vacpage->free >= min_tlen || vacpage->free >= BLCKSZ / 10); if (do_reap || do_frag) { --- 1659,1676 free_space += vacpage->free; /* !* Add the page to vacuum_pages if it requires reaping, and add it to !* fraged_pages if it has a useful amount of free space. "Useful" !* means enough for a minimal-sized tuple. But we don't know that !* accurately near the start of the relation, so add pages !* unconditionally if they have >= BLCKSZ/10 free space. Also !* forcibly add pages with no live tuples, to avoid confusing the !* empty_end_pages logic. (In the presence of unreasonably small !* fillfactor, it seems possible that such pages might not pass !* the free-space test, but they had better be in the list anyway.) */ ! do_frag = (vacpage->free >= min_tlen || vacpage->free >= BLCKSZ / 10 || ! notup); if (do_reap || do_frag) { *** *** 1679,1684 --- 1685,1691 /* * Include the page in empty_end_pages if it will be empty after * vacuuming; this is to keep us from using it as a move destination. +* Note that such pages are guaranteed to be in fraged_pages. */ if (notup) { *** *** 3725,3731 static Size PageGetFreeSpaceWithFillFactor(Relation relation, Page page) { ! Size freespace = PageGetHeapFreeSpace(page); Size targetfree; targetfree = RelationGetTargetPageFreeSpace(relation, --- 3732,3750 static Size PageGetFreeSpaceWithFillFactor(Relation relation, Page page) { ! /* !* It is correct to use PageGetExactFreeSpace() here, *not* !* PageGetHeapFreeSpace(). This is because (a) we do our own, exact !* accounting for whether line pointers must be added, and (b) we will !* recycle any LP_DEAD line pointers before starting to add rows to a !* page, but that may not have happened yet at the time this function is !* applied to a page, which means PageGetHeapFreeSpace()'s protection !* against too many line pointers on a page could fire incorrectly. 
We do !* not need that protection here: since VACUUM FULL always recycles all !* dead line pointers first, it'd be physically impossible to insert more !* than MaxHeapTuplesPerPage tuples anyway. !*/ ! Size freespace = PageGetExactFreeSpace(page); Size targetfree; targetfree
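[Editorial note: to make the distinction above concrete, here is a simplified, self-contained C sketch of the two free-space views. The structure, field names, and the tuple-count constant are stand-ins for illustration, not PostgreSQL's actual definitions: "exact" free space is just pd_upper - pd_lower, while the "heap" variant reports zero once a page already holds the maximum number of line pointers -- the protection VACUUM FULL does not want.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified page header -- not the real PageHeaderData. */
typedef struct
{
    uint16_t pd_lower;      /* offset to start of free space */
    uint16_t pd_upper;      /* offset to end of free space */
    int      nlinepointers; /* line pointers currently on the page */
} DemoPage;

#define DEMO_MAX_TUPLES 291 /* stand-in for MaxHeapTuplesPerPage */

/* Exact free space: what VACUUM FULL's own accounting wants. */
static size_t demo_exact_free_space(const DemoPage *p)
{
    return (p->pd_upper > p->pd_lower) ? (size_t) (p->pd_upper - p->pd_lower) : 0;
}

/* Heap variant: refuses to report space on pages that already carry
 * the maximum number of line pointers. */
static size_t demo_heap_free_space(const DemoPage *p)
{
    if (p->nlinepointers >= DEMO_MAX_TUPLES)
        return 0;
    return demo_exact_free_space(p);
}
```

A page full of dead line pointers makes the two views disagree: the heap variant says 0 even though the page will be wide open after reaping.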
Re: [PATCHES] NUMERIC key word
Neil Conway [EMAIL PROTECTED] writes: On Tue, 2008-01-29 at 13:20 -0500, Tom Lane wrote: The reason it was kept was to override the search path --- unqualified NUMERIC will always be taken as pg_catalog.numeric even if you have some other type numeric in front of it. It should be possible to implement this behavior without requiring NUMERIC to be a keyword, though. Perhaps we could find some other, even uglier kludge ... I doubt it would be an improvement. Is there any particular reason NUMERIC *shouldn't* be a keyword? It's called out as a reserved word by the spec, after all. (In fact, I seem to recall that it was exactly that point that made us decide that the implicit conversion to pg_catalog.numeric was appropriate.) regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Proposed patch for bug #3921
Gregory Stark [EMAIL PROTECTED] writes: So the syntax would be CREATE TABLE foo (..., LIKE bar INCLUDING INDEXES USING INDEX TABLESPACE foo_ts, ...) This (presumably) forces all the indexes into the same tablespace, so I don't find it to be a complete solution, just a wart. We could get the same behavior with much less code if we redefined LIKE to not try to copy the source indexes' tablespace(s). Then, the user would set default_tablespace to get the effect of the USING clause. That would also make LIKE entirely free of tablespace permissions hazards, so I'm starting to think more and more that that's really the right definition. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PATCHES] Proposed patch for bug #3921
Attached is a proposed patch for bug #3921, which complained that CREATE TABLE LIKE INCLUDING INDEXES fails inappropriately for non-superusers. There are two parts to the patch: the first follows Greg Stark's opinion that explicitly specifying the current database's default tablespace shouldn't be an error at all, and so the permissions checks during table/index creation are suppressed when tablespaceId == MyDatabaseTableSpace. (Note that the assign hooks for default_tablespace and temp_tablespaces already behaved this way, so we were not exactly being consistent anyhow.) I couldn't find anyplace in the documentation that specifies the old behavior, so no docs changes. The second part fixes the LIKE INCLUDING INDEXES code to default the index tablespace specification when the source index is in the database's default tablespace. This would provide an alternative way of fixing the bug if anyone doesn't like the first part. With the first part it's redundant, but seems worth doing anyway to avoid the overhead of looking up the tablespace name and then converting it back to OID in the typical case. Also there is a correction of an obsolete comment in parsenodes.h, which should be applied in any case. An issue that this patch doesn't address is what happens if the source index is in a non-default tablespace that the current user does not have CREATE permission for. With both current CVS HEAD and this patch, that will result in an error. Is that okay? We could imagine making parse_utilcmd.c check the permissions and default the tablespace specification if no good, but that behavior seems a bit peculiar to me. Command semantics don't normally change based on whether you have permissions or not. An entirely different tack we could take is to adopt the position that LIKE INCLUDING INDEXES shouldn't copy index tablespaces at all, but I didn't hear anyone favoring that approach in the bug discussion. 
regards, tom lane Index: src/backend/commands/indexcmds.c === RCS file: /cvsroot/pgsql/src/backend/commands/indexcmds.c,v retrieving revision 1.170 diff -c -r1.170 indexcmds.c *** src/backend/commands/indexcmds.c 9 Jan 2008 21:52:36 - 1.170 --- src/backend/commands/indexcmds.c 3 Feb 2008 02:32:14 - *** *** 215,221 } /* Check permissions except when using database's default */ ! if (OidIsValid(tablespaceId)) { AclResult aclresult; --- 215,221 } /* Check permissions except when using database's default */ ! if (OidIsValid(tablespaceId) && tablespaceId != MyDatabaseTableSpace) { AclResult aclresult; Index: src/backend/commands/tablecmds.c === RCS file: /cvsroot/pgsql/src/backend/commands/tablecmds.c,v retrieving revision 1.241 diff -c -r1.241 tablecmds.c *** src/backend/commands/tablecmds.c 30 Jan 2008 19:46:48 - 1.241 --- src/backend/commands/tablecmds.c 3 Feb 2008 02:32:15 - *** *** 340,346 } /* Check permissions except when using database's default */ ! if (OidIsValid(tablespaceId)) { AclResult aclresult; --- 340,346 } /* Check permissions except when using database's default */ ! if (OidIsValid(tablespaceId) && tablespaceId != MyDatabaseTableSpace) { AclResult aclresult; Index: src/backend/executor/execMain.c === RCS file: /cvsroot/pgsql/src/backend/executor/execMain.c,v retrieving revision 1.302 diff -c -r1.302 execMain.c *** src/backend/executor/execMain.c 1 Jan 2008 19:45:49 - 1.302 --- src/backend/executor/execMain.c 3 Feb 2008 02:32:15 - *** *** 2594,2600 } /* Check permissions except when using the database's default space */ ! if (OidIsValid(tablespaceId)) { AclResult aclresult; --- 2594,2600 } /* Check permissions except when using the database's default space */ ! 
if (OidIsValid(tablespaceId) && tablespaceId != MyDatabaseTableSpace) { AclResult aclresult; Index: src/backend/parser/parse_utilcmd.c === RCS file: /cvsroot/pgsql/src/backend/parser/parse_utilcmd.c,v retrieving revision 2.8 diff -c -r2.8 parse_utilcmd.c *** src/backend/parser/parse_utilcmd.c 1 Jan 2008 19:45:51 - 2.8 --- src/backend/parser/parse_utilcmd.c 3 Feb 2008 02:32:15 - *** *** 767,773 index = makeNode(IndexStmt); index->relation = cxt->relation; index->accessMethod = pstrdup(NameStr(amrec->amname)); ! index->tableSpace
Re: [PATCHES] NUMERIC key word
Peter Eisentraut [EMAIL PROTECTED] writes: In 8.3, it appears that NUMERIC doesn't need to be a key word any longer. See attached patch. Was there a reason this was kept in the parser? Otherwise we could remove it in 8.4. The reason it was kept was to override the search path --- unqualified NUMERIC will always be taken as pg_catalog.numeric even if you have some other type numeric in front of it. I believe we had concluded that this behavior is required by the SQL spec. In any case, it would be kinda weird for DECIMAL to have that behavior and NUMERIC not. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] WIP: plpgsql source code obfuscation
Andrew Dunstan [EMAIL PROTECTED] writes: Maybe a better TODO would be to do this task in the way that has previously been suggested: http://archives.postgresql.org/pgsql-hackers/2007-08/msg00258.php I'm certainly not happy about any proposal to put a password/key in a GUC var - that strikes me as a major footgun. We didn't really have a better solution to the key management problem, though, did we? At least I don't see anything about it in that thread. However, I definitely agree that a separate loadable PL is the way to go for functionality of this sort. There is no way that a dependency on pgcrypto is going to be accepted into core, not even in the (ahem) obfuscated way that it's presented here. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] WIP: plpgsql source code obfuscation
Gregory Stark [EMAIL PROTECTED] writes: There is a validator function which gets called when you create a function but I don't think it has any opportunity to substitute its result for the original in prosrc. It would have to do a heap_update on the prosrc row, but that doesn't seem like a showstopper problem. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] WIP: plpgsql source code obfuscation
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: However, I definitely agree that a separate loadable PL is the way to go for functionality of this sort. There is no way that a dependency on pgcrypto is going to be accepted into core, not even in the (ahem) obfuscated way that it's presented here. If we do anything in core it could be to make provision for an obfuscation/encryption hook via a loadable module. My recollection is that certain cryptography laws make hooks for crypto just as problematic as actual crypto code. We'd have to tread very carefully --- general purpose hooks are OK but anything narrowly tailored to encryption purposes would be a hazard. This is one reason that I'd prefer to see it as an external PL rather than embedded in core. Various interesting encoding issues could arise with dumping and restoring transformed program text - I haven't thought that through yet. I think we have already solved that with md5 passwords, and could easily reuse the same kind of approach. You just base64 encode the crypted text (or whatever you need to do to avoid funny characters in it), and make sure that there's some way to distinguish already-crypted from not-already-crypted function bodies. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
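[Editorial note: the encode-and-tag idea described above -- a recognizable prefix plus base64 body, so already-crypted text is never double-processed -- can be illustrated in a few lines of C. The "crypt$" marker and all function names here are hypothetical, chosen only to show the shape of the approach; this is not proposed PostgreSQL code.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Standard base64 alphabet (RFC 4648). */
static const char b64tab[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Minimal base64 encoder; out must hold 4*ceil(n/3)+1 bytes. */
static void b64_encode(const unsigned char *in, size_t n, char *out)
{
    size_t i = 0, o = 0;

    while (i + 3 <= n)
    {
        out[o++] = b64tab[in[i] >> 2];
        out[o++] = b64tab[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
        out[o++] = b64tab[((in[i + 1] & 0x0F) << 2) | (in[i + 2] >> 6)];
        out[o++] = b64tab[in[i + 2] & 0x3F];
        i += 3;
    }
    if (n - i == 1)             /* one leftover byte: two pad chars */
    {
        out[o++] = b64tab[in[i] >> 2];
        out[o++] = b64tab[(in[i] & 0x03) << 4];
        out[o++] = '=';
        out[o++] = '=';
    }
    else if (n - i == 2)        /* two leftover bytes: one pad char */
    {
        out[o++] = b64tab[in[i] >> 2];
        out[o++] = b64tab[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
        out[o++] = b64tab[(in[i + 1] & 0x0F) << 2];
        out[o++] = '=';
    }
    out[o] = '\0';
}

/* A body is treated as already transformed iff it carries the tag
 * (the "crypt$" marker is invented for this sketch). */
static int is_already_crypted(const char *body)
{
    return strncmp(body, "crypt$", 6) == 0;
}

/* Test helper: encode a short string and compare. */
static int b64_matches(const char *plain, const char *expect)
{
    char buf[64];

    b64_encode((const unsigned char *) plain, strlen(plain), buf);
    return strcmp(buf, expect) == 0;
}
```

This mirrors how md5-hashed passwords are distinguished: a fixed prefix makes the stored form self-describing, so dump/restore can pass it through untouched.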
Re: [PATCHES] WIP: plpgsql source code obfuscation
Andrew Dunstan [EMAIL PROTECTED] writes: using this example, it seems to me that if we dump the encrypted/encoded source and restore into another database with a different encoding, the decoded/decrypted source will still be in the old database encoding, i.e. not valid in the new database encoding. We've just gone around closing doors like this. Ah, right, I hadn't thought about that, but it would be a hazard. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [8.4] Updated WITH clause patch (non-recursive)
Gregory Stark [EMAIL PROTECTED] writes: I still hope to do recursive queries for 8.4 so I don't have strong feelings for this part either way. One question that hasn't been asked is whether this patch is likely to help, or to get in the way, for a more full-fledged implementation. I don't recall at the moment if Greg has a credible design sketch for the remaining work, but it might be a good idea to review that before deciding. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] [8.4] Updated WITH clause patch (non-recursive)
Neil Conway [EMAIL PROTECTED] writes: (1) This is SQL-standard syntax (and not even wacko syntax, at that!), and there is merit in implementing it on those grounds alone. (2) It is supported by DB2, MS SQL and Oracle, so there is a further compatibility argument to be made. Both of the above arguments hold water only if we implement compatible *semantics*, not merely syntax, so I find them unconvincing at this stage. (3) It avoids the need to repeat subqueries multiple times in the main query, which can make queries more concise. Defining subqueries outside the main SELECT body can also have readability advantages. Views fix that too. If it doesn't provide any additional expressive capabilities then I think he didn't like taking with as a reserved word. If we're going to implement recursive queries eventually (which we almost surely will, whether in 8.4 or a future release), we'll need to make WITH more reserved at some point anyway, so I don't see much to be gained in the long run by delaying it. The point is that when you break people's apps you'll be able to point to some real increment in functionality to justify it. With the patch as it stands you'd essentially be saying we're going to cause you pain now for benefit later, which is a hard selling proposition. I'm not opposed to applying this patch if it's an incremental step along a clearly defined path to full WITH support in 8.4. I'm less eager to put it in if there's not a plan and a commitment to make that happen. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PATCHES] Proposed patch: synchronized_scanning GUC variable
Per today's -hackers discussion, add a GUC variable to allow clients to disable the new synchronized-scanning behavior, and make pg_dump disable sync scans so that it will reliably preserve row ordering. This is a pretty trivial patch, but seeing how late we are in the 8.3 release cycle, I thought I'd better post it for comment anyway. regards, tom lane Index: doc/src/sgml/config.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.162 diff -c -r1.162 config.sgml *** doc/src/sgml/config.sgml 27 Jan 2008 19:12:28 - 1.162 --- doc/src/sgml/config.sgml 27 Jan 2008 20:00:16 - *** *** 4611,4616 --- 4611,4638 </listitem> </varlistentry> + <varlistentry id="guc-synchronized-scanning" xreflabel="synchronized_scanning"> + <term><varname>synchronized_scanning</varname> (<type>boolean</type>)</term> + <indexterm> + <primary><varname>synchronized_scanning</> configuration parameter</primary> + </indexterm> + <listitem> + <para> + This allows sequential scans of large tables to synchronize with each + other, so that concurrent scans read the same block at about the + same time and hence share the I/O workload. When this is enabled, + a scan might start in the middle of the table and then <quote>wrap + around</> the end to cover all rows, so as to synchronize with the + activity of scans already in progress. This can result in + unpredictable changes in the row ordering returned by queries that + have no <literal>ORDER BY</> clause. Setting this parameter to + <literal>off</> ensures the pre-8.3 behavior in which a sequential + scan always starts from the beginning of the table. The default + is <literal>on</>. 
+ </para> + </listitem> + </varlistentry> + </variablelist> </sect2> Index: src/backend/access/heap/heapam.c === RCS file: /cvsroot/pgsql/src/backend/access/heap/heapam.c,v retrieving revision 1.248 diff -c -r1.248 heapam.c *** src/backend/access/heap/heapam.c 14 Jan 2008 01:39:09 - 1.248 --- src/backend/access/heap/heapam.c 27 Jan 2008 20:00:18 - *** *** 59,64 --- 59,68 #include "utils/syscache.h" + /* GUC variable */ + bool synchronized_scanning = true; + + static HeapScanDesc heap_beginscan_internal(Relation relation, Snapshot snapshot, int nkeys, ScanKey key, *** *** 104,110 * the thresholds for these features could be different, we make them the * same so that there are only two behaviors to tune rather than four. * (However, some callers need to be able to disable one or both of !* these behaviors, independently of the size of the table.) * * During a rescan, don't make a new strategy object if we don't have to. */ --- 108,115 * the thresholds for these features could be different, we make them the * same so that there are only two behaviors to tune rather than four. * (However, some callers need to be able to disable one or both of !* these behaviors, independently of the size of the table; also there !* is a GUC variable that can disable synchronized scanning.) * * During a rescan, don't make a new strategy object if we don't have to. */ *** *** 129,135 scan->rs_strategy = NULL; } ! if (allow_sync) { scan->rs_syncscan = true; scan->rs_startblock = ss_get_location(scan->rs_rd, scan->rs_nblocks); --- 134,140 scan->rs_strategy = NULL; } ! 
if (allow_sync && synchronized_scanning) { scan->rs_syncscan = true; scan->rs_startblock = ss_get_location(scan->rs_rd, scan->rs_nblocks); Index: src/backend/utils/misc/guc.c === RCS file: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v retrieving revision 1.431 diff -c -r1.431 guc.c *** src/backend/utils/misc/guc.c 27 Jan 2008 19:12:28 - 1.431 --- src/backend/utils/misc/guc.c 27 Jan 2008 20:00:18 - *** *** 110,115 --- 110,116 extern int CommitSiblings; extern char *default_tablespace; extern char *temp_tablespaces; + extern bool synchronized_scanning; extern bool fullPageWrites; #ifdef TRACE_SORT *** *** 1053,1058 --- 1054,1068 }, { + {"synchronized_scanning", PGC_USERSET, COMPAT_OPTIONS_PREVIOUS, + gettext_noop("Enable synchronized scans."), + NULL
Re: [PATCHES] Proposed patch: synchronized_scanning GUC variable
Gregory Stark [EMAIL PROTECTED] writes: I liked the synchronized_sequential_scans idea myself. The name is still open for discussion --- it's an easy search-and-replace in the patch ... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] sinval contention reduction
Simon Riggs [EMAIL PROTECTED] writes: On Fri, 2008-01-25 at 19:02 -0500, Tom Lane wrote: This seems large, complex, and untested (I note in particular a guaranteed-to-fail Assert). Yes, it's for discussion. How would you describe such a patch in the future? I want to be able to differentiate patch status. Completely untested might be an appropriate description ... We only clean the queue if a long run of messages is read by the oldest message reader, so when stateP->nextMsgNum == segP->minMsgNum && number of messages read > 25% of queue. So that is only performed by a backend waking up to find it is behind, such as would happen if a PM signal had been issued. You still haven't explained why that won't happen in parallel within many backends at once. I really think we need to fix things so that catchup interrupts occur serially instead of all at once. Do you have any evidence for performance improvement? ISTM that it would only be worth testing when we had a rough agreement that we had found a reasonable approach. Well, what I'd like to do for 8.4 is a considerably more invasive rethink of the signaling logic. I'm not opposed in principle to some simpler stopgap fix for 8.3, but at this late stage of the cycle there would have to be a pretty compelling performance-improvement case for it. So I'm not sure of the point of posting a patch that's not even ready for performance testing. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] sinval contention reduction
Simon Riggs [EMAIL PROTECTED] writes: Patch to reduce the contention on SInvalLock, as discussed here: http://archives.postgresql.org/pgsql-hackers/2007-09/msg00501.php and http://archives.postgresql.org/pgsql-performance/2008-01/msg00023.php For discussion. This seems large, complex, and untested (I note in particular a guaranteed-to-fail Assert). I'm also wondering if it will help much, since unless the system is already in trouble, the normal case will be that all backends have absorbed all messages and so they'll all see stateP->nextMsgNum == segP->minMsgNum when they first respond to a signal. Do you have any evidence for performance improvement? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] Thoughts about bug #3883
I wrote: No, the problem is merely to get LockWaitCancel to return true so that StatementCancelHandler will go ahead with the immediate interrupt. No new cleanup is needed other than resetting the new flag. Actually there's a better way to do it: we can remove a little bit of complexity instead of adding more. The basic problem is that StatementCancelHandler wants to tell the difference between waiting for client input (which there is no use for it to interrupt) and other states in which ImmediateInterruptOK is set. ProcWaitForSignal() is falling on the wrong side of the classification --- and I argue that any other newly added interruptable wait would too. We should reverse the sense of the test so that it's not in client input instead of in lock wait. At the time that code was written, there wasn't any conveniently cheap way to check for client input state, so we kluged up LockWaitCancel to detect the other case. But now that we have the DoingCommandRead flag, it's easy to check that. This lets us remove LockWaitCancel's return value (which was always a bit untidy, since all but one of its callers ignored the result), ending up with exactly parallel code in die() and StatementCancelHandler(). Seems cleaner than before. regards, tom lane Index: src/backend/port/posix_sema.c === RCS file: /cvsroot/pgsql/src/backend/port/posix_sema.c,v retrieving revision 1.19 diff -c -r1.19 posix_sema.c *** src/backend/port/posix_sema.c 1 Jan 2008 19:45:51 - 1.19 --- src/backend/port/posix_sema.c 25 Jan 2008 22:41:22 - *** *** 241,277 int errStatus; /* !* Note: if errStatus is -1 and errno == EINTR then it means we returned !* from the operation prematurely because we were sent a signal. So we !* try and lock the semaphore again. !* !* Each time around the loop, we check for a cancel/die interrupt. 
We !* assume that if such an interrupt comes in while we are waiting, it will !* cause the sem_wait() call to exit with errno == EINTR, so that we will !* be able to service the interrupt (if not in a critical section !* already). !* !* Once we acquire the lock, we do NOT check for an interrupt before !* returning. The caller needs to be able to record ownership of the lock !* before any interrupt can be accepted. !* !* There is a window of a few instructions between CHECK_FOR_INTERRUPTS !* and entering the sem_wait() call. If a cancel/die interrupt occurs in !* that window, we would fail to notice it until after we acquire the lock !* (or get another interrupt to escape the sem_wait()). We can avoid this !* problem by temporarily setting ImmediateInterruptOK to true before we !* do CHECK_FOR_INTERRUPTS; then, a die() interrupt in this interval will !* execute directly. However, there is a huge pitfall: there is another !* window of a few instructions after the sem_wait() before we are able to !* reset ImmediateInterruptOK. If an interrupt occurs then, we'll lose !* control, which means that the lock has been acquired but our caller did !* not get a chance to record the fact. Therefore, we only set !* ImmediateInterruptOK if the caller tells us it's OK to do so, ie, the !* caller does not need to record acquiring the lock. (This is currently !* true for lockmanager locks, since the process that granted us the lock !* did all the necessary state updates. It's not true for Posix semaphores !* used to implement LW locks or emulate spinlocks --- but the wait time !* for such locks should not be very long, anyway.) */ do { --- 241,250 int errStatus; /* !* See notes in sysv_sema.c's implementation of PGSemaphoreLock. !* Just as that code does for semop(), we handle both the case where !* sem_wait() returns errno == EINTR after a signal, and the case !* where it just keeps waiting. 
*/ do { Index: src/backend/port/sysv_sema.c === RCS file: /cvsroot/pgsql/src/backend/port/sysv_sema.c,v retrieving revision 1.22 diff -c -r1.22 sysv_sema.c *** src/backend/port/sysv_sema.c1 Jan 2008 19:45:51 - 1.22 --- src/backend/port/sysv_sema.c25 Jan 2008 22:41:22 - *** *** 377,386 * from the operation prematurely because we were sent a signal. So we * try and lock the semaphore again. * !* Each time around the loop, we check for a cancel/die interrupt. We !* assume that if such an interrupt comes in while we are waiting
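[Editorial note: the convention the comments above describe -- retry the wait when a signal interrupts it with errno == EINTR -- looks like this in miniature. This is an illustrative sketch, not the PostgreSQL code; it assumes a platform, such as Linux, where unnamed POSIX semaphores via sem_init() are available.]

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Retry convention: keep waiting while sem_wait() reports it was
 * interrupted by a signal; any other failure is a real error. */
static int acquire_with_retry(sem_t *sem)
{
    int errStatus;

    do
    {
        errStatus = sem_wait(sem);
    } while (errStatus < 0 && errno == EINTR);

    return errStatus;
}

/* Demo: the semaphore starts at 1, so the wait returns immediately.
 * Returns 0 on success, -1 on any failure. */
static int demo_semaphore_wait(void)
{
    sem_t sem;

    if (sem_init(&sem, 0, 1) < 0)
        return -1;
    if (acquire_with_retry(&sem) < 0)
        return -1;
    sem_destroy(&sem);
    return 0;
}
```

Note the subtlety the original comments labor over: the loop itself is easy, but whether a signal actually breaks sem_wait() out with EINTR depends on signal-handler flags, which is exactly why the cancel/die window discussion matters.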
Re: [PATCHES] Friendly help for psql
Peter Eisentraut [EMAIL PROTECTED] writes: Greg Sabino Mullane wrote: Why not run help when someone enters help (or HELP ME!) on the command line? \? is hardly an easy thing to remember (and some people can't be bothered to actually read the screen...) Then surely the help output won't be of use to them either. The actual argument for doing this is nothing more nor less than mysql does it like that. 99% of the people who will tell you this is user-friendly think so because they used mysql before coming to postgres. That might be sufficient reason to do it; I'm not sure. Personally I find it a really bad idea for psql to be usurping syntax that doesn't start with a backslash, but I don't suppose I'm representative of people who haven't absorbed the difference between psql and SQL. Note that the mysql help facility covers both the mysql program and SQL commands (ie both \? and \h in our terminology) so the proposed patch is going to be seen as pretty lacking anyway by mysql-trained users. It's interesting to note that help, \h, and \? all provoke the same response(s) in mysql. Perhaps a patch that had had more than two seconds' design effort in it would do the same in psql; though I'm not sure what to do to disambiguate the case with no arguments. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] Friendly help for psql
Simon Riggs [EMAIL PROTECTED] writes: I'd be happy with output that explains briefly the difference between psql and SQL commands and refers people to \? and \h. That way we don't have to introduce too much change and this can be a forgivable special case. (Don't like the Help me thing though). I think you're headed in the same direction as Alvaro. A slight extension on this would be: "help" produces short blurb describing \h and \?; "help something" produces same result as "\h something". regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
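[Editorial note: the two-level behavior proposed above could be dispatched with a trivial prefix check. This is a hypothetical C sketch, not psql's actual input handling; the return strings are just labels for the three cases.]

```c
#include <assert.h>
#include <string.h>

/* A bare "help" acts like \?; "help <topic>" acts like "\h <topic>";
 * anything else is handed to the server as ordinary SQL. */
static const char *classify_input(const char *line)
{
    if (strcmp(line, "help") == 0)
        return "\\?";           /* general psql help */
    if (strncmp(line, "help ", 5) == 0)
        return "\\h";           /* SQL-command help on the argument */
    return "sql";
}
```

This also shows why the no-argument case needs disambiguation: "help" alone has to pick one of the two help systems, whereas mysql simply answers both.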
Re: [PATCHES] Friendly help for psql
Peter Eisentraut [EMAIL PROTECTED] writes: Now I have actually thought for a while whether we could get rid of the login text altogether. I would support trading that for extended help options. That makes sense to me. I think people are accustomed to ignoring login headers because they are usually just copyright boilerplate. I could see reducing the header to something like $ psql Welcome to psql 8.3RC2, the PostgreSQL interactive terminal. Type \? for help, \q to quit. dbname=# in the normal case. (The wrong-backend-version warning and perhaps the SSL encryption notice would remain as possible additions.) We really shouldn't be designing this on -patches, though. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [GENERAL] Forgot to dump old data before re-installing machine
Dave Page [EMAIL PROTECTED] writes: On 18/01/2008, Peter Eisentraut [EMAIL PROTECTED] wrote: I didn't follow how the user got into this mess, so I don't know whether the suggestion you need to initdb is appropriate. I would think not, as you almost certainly must be doing a file level restore of the data directory to get into this state and therefore probably want to keep your data. You could argue that that's true if there's an incompatible pg_control file there at all; I'm not sure why wrong-endianness is different from wrong-version or wrong-datetime-option or anything else. IIRC the HINT to initdb was originally kind of pointed at developers who'd just downloaded a new CVS update with a new catversion number. It might not be so appropriate for the field. The worst case scenario would be someone blindly following the hint and blowing away their old data ... regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] [GENERAL] Forgot to dump old data before re-installing machine
Peter Eisentraut [EMAIL PROTECTED] writes: Here is a possible patch. Example output: $ pg-install/bin/postgres -D pg-install/var/data FATAL: database files are incompatible with server DETAIL: The database cluster was initialized with PG_CONTROL_VERSION 1090715648 (0x41030000), but the server was compiled with PG_CONTROL_VERSION 833 (0x00000341). HINT: This could be a mismatched byte order. It looks like you need to initdb. I didn't follow how the user got into this mess, so I don't know whether the suggestion you need to initdb is appropriate. I think it still is, at least as much as it is in other situations where we say that. I didn't like the wording of the other part of the hint though. Maybe This could be a problem of mismatched byte ordering? Other than that quibble, +1. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
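[Editorial note: the numbers in the example output follow directly from byte order: 833 is 0x00000341, and the same four bytes read on a machine of the opposite endianness come out as 0x41030000, i.e. 1090715648. A small C sketch of the detection heuristic -- illustrative only, not the patch's code:]

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (portable, no builtins). */
static uint32_t swap32(uint32_t x)
{
    return ((x & 0x000000FFU) << 24) |
           ((x & 0x0000FF00U) << 8)  |
           ((x & 0x00FF0000U) >> 8)  |
           ((x & 0xFF000000U) >> 24);
}

/* Heuristic: if the version read from pg_control matches ours only
 * after byte-swapping, the cluster was probably initialized on a
 * machine of the other endianness. */
static int looks_byte_swapped(uint32_t file_version, uint32_t our_version)
{
    return file_version != our_version &&
           swap32(file_version) == our_version;
}
```

With this check the error message can offer the byte-order hint only when it is actually plausible, rather than for every version mismatch.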
Re: [PATCHES] [HACKERS] SSL over Unix-domain sockets
Bruce Momjian [EMAIL PROTECTED] writes: Peter Eisentraut wrote: How does that prevent spoofing? It creates a lock file that is the same name as the socket file that a default-configured client would use, so it prevents a spoofed socket from being created. Only if the attacker didn't get there first. I think this idea is nothing but a crude kluge anyway... regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] SSL over Unix-domain sockets
Alvaro Herrera [EMAIL PROTECTED] writes: I propose to create a dangling symlink on system startup in /tmp/.s.PGSQL.port to the real socket, which is not on a world-writable directory. This avoids the spoofer, because he cannot create the socket -- the symlink is occupying its place. The only problem with this proposal is that the tmp cleaner would remove the symlink. The solution to this is to configure the tmp cleaner so that it doesn't do that. It absolutely requires cooperation from the sysadmin, both to set up the symlink initially, and to configure the tmp cleaner. This is definitely a slick solution if you can overcome the tmp-cleaner risk; not least because it doesn't require any work on our part ;-). However, we should document the approach someplace. Further down the road we could think about Postgres changes to support such a strategy --- for instance, having the postmaster check to see if such a link exists. This will require more thought than we have time for, for 8.3; also I think we'd need to negotiate with packagers, such as the Debian crew, to make sure any such behavior is acceptable to them. BTW, is a symlink's atime changed by accessing it? We could imagine adapting the existing postmaster code that keeps the socket atime fresh so that it will work on a symlink, thus providing partial protection against tmp-cleaners. Portability of this is uncertain... regards, tom lane
Re: [PATCHES] [HACKERS] SSL over Unix-domain sockets
Bruce Momjian [EMAIL PROTECTED] writes: I am confused because you say dangling then you say to the real socket. You are saying it isn't dangling when the server is running? Exactly. When the server is running it provides a perfectly good path to the postmaster. The point (and the main difference from your PIDfile proposal) is that it's supposed to be there all the time, even when the postmaster isn't running. This is what provides protection against the spoofer getting there first. If you are going to require the admin to modify the tmp cleanup script, the admin might as well create the symlink at the same time and have it recreate on boot. No, that's not the same, because it doesn't provide protection against the symlink getting deleted later on. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PATCHES] Proposed patch for renaming constraints along with indexes
As per my suggestion just now for resolving bug #3854. I'm thinking in terms of applying this for 8.3, but not back-patching --- though I suppose depending on how seriously you take the bug, a case could be made for back-patching. I don't offhand see anyplace that this would need to be documented, except as a release-note item. Comments? regards, tom lane

Attachment: rename-index-constraints.patch
[PATCHES] Revised xml memory allocation patch
Attached is my revision of Alvaro's patch to change xml.c so that libxml memory allocations happen in a specially designated context. I find this more convincing than the existing technique of using whatever context xml.c happens to be called in. It should also be marginally faster because (barring errors) we only need to reset libxml once per transaction, not once per xml.c function call. On the other hand it's kind of late in the beta cycle to be changing anything so low-level, and as of today I cannot point to any specific bug it would fix. Any thoughts whether to apply or not? Note: the patch removes most of the PG_TRY blocks in xml.c. For ease of reading I did not reindent the contained code, but will do so before applying, if we choose to apply. regards, tom lane

Attachment: xml-mem-4.patch
Re: [PATCHES] Revised patch for fixing archiver shutdown behavior
Simon Riggs [EMAIL PROTECTED] writes: If we do a shutdown immediate on the postmaster *after* the bgwriter has written a shutdown checkpoint, do we have any record that there was a panic stop? Do we enter recovery in that case? I think the answers are yes and no, but just checking. Yeah, the postmaster would complain at exit, but the pg_control state is already SHUTDOWN at that point. There'd be a log entry showing abnormal archiver exit if the panic had had any actual effect on it. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Revised patch for fixing archiver shutdown behavior
Alvaro Herrera [EMAIL PROTECTED] writes: Something I'm still wondering is about the archiver/logger combination. What happens if a postmaster is stopped by the user and the archiver is still running, and the user starts a new postmaster? This would launch a new archiver and logger; and there are now two loggers possibly writing to the same files, and truncated log lines could occur. I'm not nearly as worried about that as I am about the prospect of two concurrent archivers :-( There was discussion of having a lock file for the archiver, but it's still an open issue. I'm not sure how to solve the problem of stale lockfiles --- unlike the postmaster, the archiver can't assume that it's the only live process with the postgres userid. For example, after a system crash-and-restart, it's entirely likely that the PID formerly held by the archiver is now held by the bgwriter, making the lockfile (if any) look live. Maybe we should go back to the plan of having the postmaster wait for the archiver to exit. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Revised patch for fixing archiver shutdown behavior
Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane wrote: Maybe we should go back to the plan of having the postmaster wait for the archiver to exit. Yeah, that seems the safest to me -- the problem is that it complicates the shutdown sequence a fair bit, because postmaster must act differently depending on whether archiving is enabled or not: wait for bgwriter exit if disabled, or for archiver exit otherwise. Given the recent changes to make the postmaster act as a state machine, I don't think this is really a big deal --- it's just one more state. The bigger part is that the archiver can't wait for postmaster exit. We'll need a proper shutdown signal for the archiver, but since it's not using SIGUSR2, that can be commandeered easily. So it'd be like:

	SIGUSR1 - do an archive cycle
	SIGUSR2 - do an archive cycle and exit
	no postmaster - just exit

The rationale for the last is that it's a crash situation, and furthermore there's a risk of someone starting a new postmaster and a conflicting archiver. So we should put back the behavior my last patch removed of aborting archiving immediately on postmaster death. I'll respin my patch this way... regards, tom lane
Re: [PATCHES] Revised patch for fixing archiver shutdown behavior
I wrote: I'll respin my patch this way... Third time's the charm? regards, tom lane

Attachment: archiver-shutdown-3.patch
Re: [PATCHES] [BUGS] BUG #3860: xpath crashes backend when is querying xmlagg result
Alvaro Herrera [EMAIL PROTECTED] writes:

+ void
+ AtEOXact_xml(void)
+ {
+ 	if (LibxmlContext == NULL)
+ 		return;
+ 
+ 	MemoryContextDelete(LibxmlContext);
+ 	LibxmlContext = NULL;
+ 
+ 	xmlCleanupParser();
+ }

[ blink... ] Shouldn't that be the other way around? It looks to me like xmlCleanupParser will be pfree'ing already-deleted data. Did you test this with CLOBBER_FREED_MEMORY active? Also, I don't see how this patch fixes what I believe to be the fundamental problem, which is that we have code paths that invoke libxml without having previously initialized the alloc support. I think the sequence

	if (LibxmlContext == NULL)
	{
		create LibxmlContext;
		xmlMemSetup(...);
	}

ought to be executed before *any* use of libxml stuff (which suggests factoring it out as a subroutine...) We also need to do some testing to see if letting the thing go until transaction end is OK, or whether we will see unacceptable memory leaks over long series of XML calls. (In particular I suspect that we'd better zap the context at subtransaction abort; but even without any error, are we sure that normal operations don't leak memory?) One thing I was wondering about earlier today is whether libxml isn't expecting NULL-return-on-failure from the malloc-substitute routine. If we take control away from it unexpectedly, I wouldn't be a bit surprised if its data structures are left corrupt. This might lead to failures during cleanup. I do like the idea of getting rid of the PG_TRY blocks and the associated cleanup attempts; with the approach you're converging on here, we need only assume that xmlCleanupParser() will get rid of all statically-allocated pointers and not crash, whereas right now we are assuming a great deal of libxml stuff still works after an ereport (which makes throwing ereport from xml_palloc even more risky). regards, tom lane
Re: [PATCHES] [BUGS] BUG #3860: xpath crashes backend when is querying xmlagg result
Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane wrote: One thing I was wondering about earlier today is whether libxml isn't expecting NULL-return-on-failure from the malloc-substitute routine. If we take control away from it unexpectedly, I wouldn't be a bit surprised if its data structures are left corrupt. This might lead to failures during cleanup. Hmm, this is a very good point. A quick look at the source shows that they are not very consistent in their own checking for memory allocation errors. For example, see a bug I just reported: http://bugzilla.gnome.org/show_bug.cgi?id=508662 Ugh. So we're pretty much damned if we do and damned if we don't. Given what you showed, it is certain that we are at risk if we return NULL, whereas it is merely hypothetical that we are at risk if we longjmp. So let's stick to the palloc infrastructure for now. regards, tom lane
Re: [PATCHES] [HACKERS] Archiver behavior at shutdown
Simon Riggs [EMAIL PROTECTED] writes: Not sure why this hasn't being applied yet for 8.3 Because it doesn't fix the problem ... which is that the postmaster kills the archiver (and the stats collector too) at what is now the wrong point in the shutdown sequence. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PATCHES] Revised patch for fixing archiver shutdown behavior
The attached patch fixes archiver shutdown in what seems to me to be a sane way. With the patch, we send SIGQUIT to the archiver only for panic-stop situations (backend crash or immediate-mode shutdown). This is important because the postmaster is coded to send SIGQUIT to the entire process group, meaning we'd also ungracefully terminate any currently-running archive command, which does not seem like a good idea for normal exit. Instead, the shutdown protocol is that *after* the bgwriter has quit, we send SIGUSR1 (ie, the normal archiver wakeup signal) to the archiver. This ensures that it will do a normal archiving cycle after the last possible creation of WAL entries. The archiver is also modified to fall out of its wait loop more quickly when the postmaster has died (this is the same as Simon's previous one-liner patch), and to not exit until it has completed a full archive cycle since the postmaster died. The latter eliminates a race condition that existed before --- depending on timing, the CVS-HEAD archiver might or might not do a final archiving cycle. I also tweaked the postmaster so that the stats collector isn't told to quit until after the bgwriter finishes; this ensures that any stats reported from the bgwriter will be collected. One point needing discussion is that the postmaster is currently coded not to send SIGUSR1 to the archiver if a fast-mode shutdown is under way. I duplicated that in the added SIGUSR1 signal here, but I wonder whether it is sane or not. Comments? regards, tom lane binEUCN9e9ool.bin Description: archiver-shutdown.patch ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Revised patch for fixing archiver shutdown behavior
Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, so the postmaster is gone during the last archiving cycle? What about syslogger? Is the archiver able to log stuff in the last cycle? The logger is no problem --- it quits when it sees EOF on its input pipe, which means that all upstream processes are gone. The comment in line 2180 seems a bit bogus ...? Yeah, that could use a bit more work I guess, since normal children sounds like it would refer to more than just backends. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Revised patch for fixing archiver shutdown behavior
I wrote: One point needing discussion is that the postmaster is currently coded not to send SIGUSR1 to the archiver if a fast-mode shutdown is under way. I duplicated that in the added SIGUSR1 signal here, but I wonder whether it is sane or not. Comments? After chewing on that for awhile, I decided it was bogus. If we are going to have a policy that the archiver gets a chance to archive everything, that shouldn't depend on fast vs. smart shutdown; those alternatives determine whether we kick clients out ungracefully, not whether we take extra risks with committed data. I think we should allow the archiver to finish out its tasks fully in all non-crash cases except one: if we got SIGTERM from init. In that case there's a very great risk of being SIGKILL'd before we can finish archiving. The postmaster cannot easily tell whether its SIGTERM came from init or not, but we can drive this off the archiver itself getting SIGTERM'd. I propose that if the archiver receives SIGTERM, it should cease to issue any new archive commands, but just wait till it sees the postmaster exit. (It can't exit right away, since there's a race condition: the postmaster might not have been SIGTERM'd yet, and might therefore spawn a new archiver, which would have no idea it's unsafe to do anything more.) There's an obvious failure mode in that, which is that a randomly issued SIGTERM to the archiver would shut down archiving indefinitely. We can guard against that with a timeout: the archiver should exit a minute or two after being SIGTERM'd, even if the postmaster is still there. That should certainly be enough delay to avoid the race condition, and if in fact everything is still hunky-dory the postmaster will immediately spawn a new archiver. Hence, attached revised patch ... regards, tom lane binN52EbkMKOk.bin Description: archiver-shutdown-2.patch ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] OUTER JOIN performance regression remains in 8.3beta4
I wrote: Haven't looked closely at how to fix 8.2, yet. After some study it seems that the simplest, most reliable fix for 8.2 is to dike out the code that removes redundant outer join conditions after propagating a constant across them. This gives the right answer in the cases of concern (where we actually need the join condition) and doesn't really add much overhead in the cases where we don't need it. One small problem is that the join condition is redundant with the generated constant-equality constraints (mostly so, even if not entirely so) which will cause the planner to underestimate the size of the join, since clausesel.c is not very bright at all about redundant conditions. However, we already have a hack we can use for that: we can force the cached selectivity estimate for the join clause to 1.0, so that it's not considered to reduce the join size any more than the constant conditions already did. (This is also a problem in my earlier patch for 8.3, with the same fix possible.) That leads to the attached very simple patch. There is some dead code left behind, but it doesn't seem worth removing it. I'm rather tempted to patch 8.1 similarly, even though it doesn't fail on the known test case --- I'm far from convinced that there are no related cases that will make it fail, and in any case it's getting the selectivity wrong. 8.0 and before don't try to propagate constants like this, so they're not at risk. Comparing the behavior of this to my patch for HEAD, I am coming to the conclusion that this is actually a *better* planning method than removing the redundant join conditions, even when they're truly rendundant! The reason emerges as soon as you look at cases involving more than a single join. If we strip the join condition from just one of the joins, then we find that the planner insists on doing that join last, whether it's a good idea or not, because clauseful joins are always preferred to clauseless joins in the join search logic. 
What's worse, knowing that this is an outer join, is that the only available plan type for a clauseless outer join is a NestLoop with the inner side on the right, which again may be a highly nonoptimal way to do it. None of this matters a whole lot if the pushed-down constant conditions select single rows, but it does if they select multiple rows. I'm trying this in the regression database:

	select * from tenk1 a
	  left join tenk1 b on (a.hundred = b.hundred)
	  left join tenk1 c on (b.hundred = c.hundred)
	where a.hundred = 42;

and finding patched 8.2 about 2X faster than 8.3 because it selects a better plan that avoids multiple rescans of subplans. So I'm coming around to the idea that getting rid of the redundant join conditions is foolish micro-optimization, and we should leave them in place even when we know they're truly redundant! The extra execution cycles paid to test the condition don't amount to much in any case, and the risk of getting a bad plan is too high. This is a reasonably simple adjustment to my prior patch for 8.3, which I will go ahead and make if there are no objections... regards, tom lane

Attachment: const-propagation-8.2.patch
Re: [PATCHES] [HACKERS] OUTER JOIN performance regression remains in 8.3beta4
Alvaro Herrera [EMAIL PROTECTED] writes: Would it be a good idea to keep removing redundant clauses and rethink the preference for clauseful joins, going forward? No --- it would create an exponential growth in planning time for large join problems, while not actually buying anything in the typical case. It's possible that we could do something along the lines of inserting dummy join conditions, to allow particular join paths to be explored, without generating any clause that actually requires work at runtime. I'm not convinced this complication is needed though; at least not if the only thing it's good for is this rather specialized optimization rule. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Fix for _outAgg()
Neil Conway [EMAIL PROTECTED] writes: Attached is a patch which fixes an oversight in _outAgg(): the grpColIdx and grpOperators fields of the Agg struct were not emitted by _outAgg(). I don't see any good reason to omit this information. Hmm, I think that must be my fault, but I'm not sure how it got by me ... I'm usually pretty careful about adding outfuncs support when I add a node field. Patch looks good, please apply. Note that while the grpOperators field was added during the 8.3 devel cycle, the grpColIdx field has been around since '02 (without outfuncs support). So I can backpatch that portion of the patch to back branches if anyone feels strongly. Not worth backpatching --- the plan node output support is only of interest to hard-core hackers, who'd be working on current sources. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] OUTER JOIN performance regression remains in 8.3beta4
Kevin Grittner [EMAIL PROTECTED] writes: There was a serious performance regression in OUTER JOIN planning going from 8.2.4 to 8.2.5. I know Tom came up with some patches to mitigate the issues in 8.2.5, but my testing shows that problems remain in 8.3beta4. Please try the attached proposed patch. It seems to fix my simplified test case, but I'm not sure if there are any additional considerations involved in your real queries. This is against CVS HEAD but I think it will apply cleanly to beta4. Haven't looked closely at how to fix 8.2, yet. regards, tom lane

Attachment: const-propagation-8.3.patch.gz
Re: [PATCHES] [HACKERS] Unworkable column delimiter characters for COPY
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: It seems we ought to forbid delimiter from matching CSV quote or escape characters. I'll let you clean up that case though... This should do the trick - I'll apply it tomorrow. A couple thoughts: * This test needs to appear further down --- it is not sensible until after you've checked strlen() == 1 for all the strings involved. * I see that we disallow the CSV quote character from appearing in the null_print string, but not the escape character. Is this correct/sensible? If it is correct, maybe the delimiter could also match the escape character? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] typo in func.sgml
Jan Urbański [EMAIL PROTECTED] writes:

- individual data types is expected evolve in order to align the
+ individual data types is expected to evolve in order to align the

Applied, thanks. regards, tom lane
Re: [PATCHES] patch to disallow zero length paths in binary (minor bug fix)
Merlin Moncure [EMAIL PROTECTED] writes: Following is a patch to force the path type not to accept a path with zero points. This appears to be illegal in the parser, but possible when sending a well-formed packet in binary with zero points. Hmm, looks like poly_recv has the same mistake. Ideally I think it'd be a good idea if these types did allow zero elements, but that would clearly be a feature extension more than a bug fix, since the implications aren't entirely obvious. I concur with disallowing the case for existing branches. Will go apply. regards, tom lane
Re: [PATCHES] archiver ps display
Simon Riggs [EMAIL PROTECTED] writes: Patch to set ps display for archiver enclosed, intended for 8.3 I think this is a good idea. However, experimentation here showed that at least on some machines (like HPUX), there's a pretty tight limit before ps cuts off the display, and the more verbose messages discussed on -hackers cause the last few digits of the filename to be cut off. Which makes the display just about entirely useless. So I went with these messages:

	archiving %s
	last was %s
	failed on %s

We might need to make it even briefer, perhaps:

	at: %s
	last: %s
	failed: %s

depending on reports from the field. Applied patch attached. regards, tom lane

Index: pgarch.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgarch.c,v
retrieving revision 1.35
diff -c -r1.35 pgarch.c
*** pgarch.c	12 Dec 2007 16:53:14 -0000	1.35
--- pgarch.c	18 Dec 2007 00:47:42 -0000
***************
*** 414,419 ****
--- 414,420 ----
  {
  	char		xlogarchcmd[MAXPGPATH];
  	char		pathname[MAXPGPATH];
+ 	char		activitymsg[MAXFNAMELEN + 16];
  	char	   *dp;
  	char	   *endp;
  	const char *sp;
***************
*** 471,476 ****
--- 472,482 ----
  	ereport(DEBUG3,
  			(errmsg_internal("executing archive command \"%s\"",
  							 xlogarchcmd)));
+ 
+ 	/* Report archive activity in PS display */
+ 	snprintf(activitymsg, sizeof(activitymsg), "archiving %s", xlog);
+ 	set_ps_display(activitymsg, false);
+ 
  	rc = system(xlogarchcmd);
  	if (rc != 0)
  	{
***************
*** 527,537 ****
--- 533,549 ----
  							xlogarchcmd)));
  		}
  
+ 		snprintf(activitymsg, sizeof(activitymsg), "failed on %s", xlog);
+ 		set_ps_display(activitymsg, false);
+ 
  		return false;
  	}
  
  	ereport(DEBUG1,
  			(errmsg("archived transaction log file \"%s\"", xlog)));
  
+ 	snprintf(activitymsg, sizeof(activitymsg), "last was %s", xlog);
+ 	set_ps_display(activitymsg, false);
+ 
  	return true;
  }
Re: [PATCHES] Auto create (top level) directory for create tablespace
Mark Kirkwood [EMAIL PROTECTED] writes: I thought it made sense for CREATE TABLESPACE to attempt to create the top level location directory - I thought we had deliberately made it not do that. Auto-recreate during replay sounds even worse. The problem is that a tablespace would normally be under a mount point, and auto-create has zero chance of getting such a path right. Ignoring this point is actually a fine recipe for destroying your data; see Joe Conway's report a couple years back about getting burnt by a soft NFS mount. If the DB directory is not there, auto-creating it is a horrible idea. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Proposed patch to avoid translation risks in psql's \d commands
Guillaume Lelarge [EMAIL PROTECTED] writes: Tom Lane wrote: The attached patch does this, and seems to resolve Guillaume Lelarge's original complaint. It does resolve it. I applied your patch on my CVS HEAD checkout and it works just great. Patch is applied to CVS. regards, tom lane
[PATCHES] Proposed patch to avoid translation risks in psql's \d commands
I proposed here: http://archives.postgresql.org/pgsql-hackers/2007-12/msg00436.php that we change the way that psql deals with localization of column names and other fixed strings in the output of \d and related commands (basically, anything that calls printQuery()). Specifically, we should avoid shipping already-translated strings to the server, because they might not be in the expected encoding and also might contain quote marks, which the existing code doesn't guard against. We can instead apply the gettext() conversion when the query results come back from the server. The attached patch does this, and seems to resolve Guillaume Lelarge's original complaint. I found just one place where the proposed new method doesn't work quite as nicely as the old. In \dC (describe casts), the Function column contains either a function name or '(binary compatible)' to indicate a cast WITHOUT FUNCTION. The existing code is able to localize '(binary compatible)', but this patch does not, because applying gettext to every value in the column seems to pose an unacceptably high risk of translating some function name that happens to match a string in psql's .PO database. I'm inclined to just live with that, since it seems a relatively minor deficiency, but I wonder if anyone has a better idea how to do it? regards, tom lane

Index: src/bin/psql/describe.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/psql/describe.c,v
retrieving revision 1.162
diff -c -r1.162 describe.c
*** src/bin/psql/describe.c	15 Nov 2007 21:14:42 -0000	1.162
--- src/bin/psql/describe.c	12 Dec 2007 02:36:43 -0000
***************
*** 85,92 ****
  					  "FROM pg_catalog.pg_proc p\n"
  					  "     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace\n"
  					  "WHERE p.proisagg\n",
! 					  _("Schema"), _("Name"), _("Result data type"),
! 					  _("Argument data types"), _("Description"));
  
  	processSQLNamePattern(pset.db, &buf, pattern, true, false,
  						  "n.nspname", "p.proname", NULL,
--- 85,95 ----
  					  "FROM pg_catalog.pg_proc p\n"
  					  "     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace\n"
  					  "WHERE p.proisagg\n",
! 					  gettext_noop("Schema"),
! 					  gettext_noop("Name"),
! 					  gettext_noop("Result data type"),
! 					  gettext_noop("Argument data types"),
! 					  gettext_noop("Description"));
  
  	processSQLNamePattern(pset.db, &buf, pattern, true, false,
  						  "n.nspname", "p.proname", NULL,
***************
*** 101,106 ****
--- 104,110 ----
  	myopt.nullPrint = NULL;
  	myopt.title = _("List of aggregate functions");
+ 	myopt.trans_headers = true;
  
  	printQuery(res, &myopt, pset.queryFout, pset.logfile);
***************
*** 131,143 ****
  					  "SELECT spcname AS \"%s\",\n"
  					  "  pg_catalog.pg_get_userbyid(spcowner) AS \"%s\",\n"
  					  "  spclocation AS \"%s\"",
! 					  _("Name"), _("Owner"), _("Location"));
  
  	if (verbose)
  		appendPQExpBuffer(&buf,
! 						  ",\n  spcacl as \"%s\""
  						  ",\n  pg_catalog.shobj_description(oid, 'pg_tablespace') AS \"%s\"",
! 						  _("Access privileges"), _("Description"));
  
  	appendPQExpBuffer(&buf,
  					  "\nFROM pg_catalog.pg_tablespace\n");
--- 135,150 ----
  					  "SELECT spcname AS \"%s\",\n"
  					  "  pg_catalog.pg_get_userbyid(spcowner) AS \"%s\",\n"
  					  "  spclocation AS \"%s\"",
! 					  gettext_noop("Name"),
! 					  gettext_noop("Owner"),
! 					  gettext_noop("Location"));
  
  	if (verbose)
  		appendPQExpBuffer(&buf,
! 						  ",\n  spcacl AS \"%s\""
  						  ",\n  pg_catalog.shobj_description(oid, 'pg_tablespace') AS \"%s\"",
! 						  gettext_noop("Access privileges"),
! 						  gettext_noop("Description"));
  
  	appendPQExpBuffer(&buf,
  					  "\nFROM pg_catalog.pg_tablespace\n");
***************
*** 155,160 ****
--- 162,168 ----
  	myopt.nullPrint = NULL
Re: [PATCHES] pgbench - startup delay
Dave Page [EMAIL PROTECTED] writes: Alvaro Herrera wrote: I think you could get the same effect by putting the -W in PGOPTIONS (in pgbench's environment). That's a good point. It does have the downside that it will affect the pgbench results - though that wouldn't actually be an issue for what I was doing. Well, if you're attaching a profiler or debugger to a backend, you're hardly gonna get unadulterated TPS readings from pgbench anyway. I concur with Alvaro that this case seems adequately covered by PGOPTIONS=-W n pgbench ... which is what I've always used in similar situations... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] pgbench - startup delay
Greg Smith [EMAIL PROTECTED] writes: I once wrote a similar patch to the one Dave submitted here and feel like it's worth committing at least a documentation patch to show how to deal with this. It's not obvious that pgbench pays attention to the environment variables at all, and it's even less obvious that you can pass what look like server options in there. It's not pgbench that is paying attention to this, it's libpq. This is at least referred to in the libpq and server documentation, eg the tenth paragraph here: http://developer.postgresql.org/pgdocs/postgres/config-setting.html It might be worth more emphasis, not sure. It doesn't come up all that often. I just poked around the documentation a bit and I didn't find anything that cleared up which options you can pass from a client; in addition to -W, I can imagine pgbench users might also want to use -S (sort memory) or -f (forbid scan/join types). If I can get someone to clarify what is supported there I can put together a pgbench doc patch that addresses this topic. Anything you'd be allowed to SET can be set from PGOPTIONS (-c or --var syntax). As for the special-purpose postgres command-line switches, I believe they are all equivalent to one or another GUC variable: http://developer.postgresql.org/pgdocs/postgres/runtime-config-short.html so the restrictions are the same as for the underlying variable. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] pgbench - startup delay
Greg Smith [EMAIL PROTECTED] writes: That clarifies the situation well enough for me. I think this is a two part problem then. It's not necessarily obvious that pgbench will use PGOPTIONS. In addition to that, the current documentation is less clear than it could be on the subject of what you can usefully put into PGOPTIONS. That's two small documentation patches I should be able to handle. BTW, PGOPTIONS is actually just the environment-variable fallback for the pgoptions argument to PQsetdbLogin() or the options=whatever component of the conninfo string for PQconnectdb() --- it's the same sort of animal as PGHOST or PGPORT. So those provide alternate paths for getting at the same functionality, and any documentation patch should be clear about this. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PATCHES] Proposed patch to disallow password=foo in database name parameter
As of PG 8.3, libpq allows a conninfo string to be passed in via the dbName parameter of PQsetdbLogin. This is to allow access to conninfo facilities in old programs that are still using PQsetdbLogin (including most of our own standard clients ... ahem). For instance

	psql "service = foo"

Andrew Dunstan pointed out a possible security hole in this: it will allow people to do

	psql "dbname = mydb password = mypassword"

which would leave their password exposed on the program's command line. While we cannot absolutely prevent client apps from doing stupid things, it seems like it might be a good idea to prevent passwords from being passed in through dbName. The attached patch (which depends on some pretty-recent changes in CVS HEAD) accomplishes this.

Anybody think this is good, bad, or silly? Does the issue need explicit documentation, and if so where and how?

			regards, tom lane

Index: fe-connect.c
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v
retrieving revision 1.354
diff -c -r1.354 fe-connect.c
*** fe-connect.c	9 Dec 2007 19:01:40 -0000	1.354
--- fe-connect.c	11 Dec 2007 02:46:22 -0000
***************
*** 599,604 ****
--- 599,618 ----
  	{
  		if (!connectOptions1(conn, dbName))
  			return conn;
+
+ 		/*
+ 		 * We disallow supplying a password through dbName, because a large
+ 		 * number of applications allow dbName to be set from a command-line
+ 		 * parameter, and putting a password on your command line is a horrid
+ 		 * idea from a security point of view.
+ 		 */
+ 		if (conn->pgpass_from_client)
+ 		{
+ 			conn->status = CONNECTION_BAD;
+ 			printfPQExpBuffer(&conn->errorMessage,
+ 				libpq_gettext("password must not be set within database name parameter\n"));
+ 			return conn;
+ 		}
  	}
  	else
  	{
---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Proposed patch to disallow password=foo in database name parameter
Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: As of PG 8.3, libpq allows a conninfo string to be passed in via the dbName parameter of PQsetdbLogin. I didn't even know we could do that. I always use the shell variable option instead. Does anyone actually use the facility? Well, not yet, because it's new in 8.3 ... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Proposed patch to disallow password=foo in database name parameter
Andrew Dunstan [EMAIL PROTECTED] writes: Stephen Frost wrote: I'm going to have to vote 'silly' on this one. It's a matter of being consistent. If we think such a facility shouldn't be provided on security grounds, then we shouldn't allow it via a backdoor, ISTM. Well, the problem with this approach is that libpq has no real means of knowing whether a string it's been passed was exposed on the command line or not. dbName might be secure, and for that matter the conninfo string passed to PQconnectdb might be insecure. Should we put in arbitrary restrictions on the basis of hypotheses about where these different arguments came from? It's also worth noting that we haven't removed the PGPASSWORD environment variable, even though that's demonstrably insecure on some platforms. I'm actually inclined to vote with Stephen that this is a silly change. I just put up the patch to show the best way of doing it if we're gonna do it ... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PATCHES] Patch for PQconnectionUsedPassword brain-damage
Per the thread here,
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00174.php
the 8.3 patch to create PQconnectionUsedPassword is quite a few bricks shy of a load --- the aforesaid function, which was intended to serve two different purposes, fails to serve either one correctly. Attached is my proposed patch to fix this, per the thread discussion. But given that the misdesign was my fault to begin with, maybe some other folk better look this over before it goes in :-(

The main bit of ugliness here is that conninfo_parse's API has to be hacked up, because as it stands it's not possible for the caller to tell whether a password came from the conninfo string or a PGPASSWORD environment variable. It would perhaps be nicer to add an option source enum field to struct PQconninfoOption, but that would be a libpq ABI break which I don't think we want right now. So I settled for an ugly special case, instead.

			regards, tom lane

Index: doc/src/sgml/libpq.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/libpq.sgml,v
retrieving revision 1.247
diff -c -r1.247 libpq.sgml
*** doc/src/sgml/libpq.sgml	28 Nov 2007 15:42:31 -0000	1.247
--- doc/src/sgml/libpq.sgml	8 Dec 2007 23:13:35 -0000
***************
*** 1146,1156 ****
  </varlistentry>

  <varlistentry>
   <term><function>PQconnectionUsedPassword</function><indexterm><primary>PQconnectionUsedPassword</primary></indexterm></term>
   <listitem>
    <para>
     Returns true (1) if the connection authentication method
!    required a password to be supplied. Returns false (0) if not.

<synopsis>
int PQconnectionUsedPassword(const PGconn *conn);
--- 1146,1177 ----
  </varlistentry>

  <varlistentry>
+  <term><function>PQconnectionNeedsPassword</function><indexterm><primary>PQconnectionNeedsPassword</primary></indexterm></term>
+  <listitem>
+   <para>
+    Returns true (1) if the connection authentication method
+    required a password, but none was available.
+    Returns false (0) if not.
+
+ <synopsis>
+ int PQconnectionNeedsPassword(const PGconn *conn);
+ </synopsis>
+
+   </para>
+
+   <para>
+    This function can be applied after a failed connection attempt
+    to decide whether to prompt the user for a password.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
   <term><function>PQconnectionUsedPassword</function><indexterm><primary>PQconnectionUsedPassword</primary></indexterm></term>
   <listitem>
    <para>
     Returns true (1) if the connection authentication method
!    used a caller-supplied password. Returns false (0) if not.

<synopsis>
int PQconnectionUsedPassword(const PGconn *conn);
***************
*** 1159,1167 ****
    </para>

    <para>
!    This function can be applied after either successful or failed
!    connection attempts. In the case of failure, it can for example
!    be used to decide whether to prompt the user for a password.
    </para>
   </listitem>
  </varlistentry>
--- 1180,1189 ----
    </para>

    <para>
!    This function detects whether a password supplied to the connection
!    function was actually used. Passwords obtained from other
!    sources (such as the <filename>.pgpass</filename> file) are not considered
!    caller-supplied.
    </para>
   </listitem>
  </varlistentry>
Index: doc/src/sgml/release.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/release.sgml,v
retrieving revision 1.563
diff -c -r1.563 release.sgml
*** doc/src/sgml/release.sgml	8 Dec 2007 17:24:03 -0000	1.563
--- doc/src/sgml/release.sgml	8 Dec 2007 23:13:37 -0000
***************
*** 2184,2191 ****
  <listitem>
   <para>
!   Add <function>PQconnectionUsedPassword()</function> that returns
!   true if the server required a password (Joe Conway)
   </para>

--- 2184,2192 ----
  <listitem>
   <para>
!   Add <function>PQconnectionNeedsPassword()</function> that returns
!   true if the server required a password but none was supplied
!   (Joe Conway, Tom)
   </para>

***************
*** 2197,2202 ****
--- 2198,2216 ----
   </para>
  </listitem>

+ <listitem>
+  <para>
+   Add <function>PQconnectionUsedPassword()</function> that returns
+   true if the supplied password was actually used
+   (Joe Conway, Tom)
+  </para>
+
+  <para>
+   This is useful in some security contexts where it is important
+   to know whether a user-supplied password is actually valid.
+  </para>
+ </listitem>
+
  </itemizedlist>
 </sect3>
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.90
diff -c -r1.90 pg_ctl.c
*** src/bin/pg_ctl/pg_ctl.c	20 Nov
Re: [PATCHES] [PATCH] automatic integer conversion
Alvaro Herrera [EMAIL PROTECTED] writes: Please note that I'm not saying that fixing that issue means the patch is acceptable. Personally I'm not sure that this is a worthy goal you are pursuing here. Wouldn't it be a good idea to propose the feature first and write the code later? Indeed. For starters, if we are going to try to provide serious support in libpq for binary-format parameters, it probably ought to cover more than just integers. OTOH, I think we've already seen where that line of thought leads, and it looks pretty ugly too: http://archives.postgresql.org/pgsql-patches/2007-12/msg00014.php Anyway, I'd like to see a design discussion about what any libpq API changes in this area ought to look like, rather than having it be determined by who can submit the quickest-and-dirtiest patch. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Proposed patch to make mergejoin cost estimation more symmetric
Simon Riggs [EMAIL PROTECTED] writes: I do have a longer term issue that there is no information provided by EXPLAIN to allow us to differentiate these two conditions. Sure there is: the new startup condition affects the estimated plan startup cost (only) and the existing termination condition affects the estimated total cost (only). regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Proposed patch to make mergejoin cost estimation more symmetric
Gregory Stark [EMAIL PROTECTED] writes: What about a merge join against an empty table? I suppose there would just be no statistics? Yeah. The defenses against silly results in mergejoinscansel should be enough to prevent it from doing anything really insane. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] Proposed patch to make mergejoin cost estimation more symmetric
Alvaro Herrera [EMAIL PROTECTED] writes: Yes, but how can you tell by looking at the explain? I think the point is that the fraction that would be skipped should be reported somehow. It is: it's directly reflected in the startup cost. Previously, a mergejoin would always have startup cost equal to the sum of its input startup costs (hence, zero in the cases of interest here). regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] PQParam version 0.5
Andrew Chernow [EMAIL PROTECTED] writes: Here is the lastest pgparam patch. It is patched against a fresh checkout on 2007-12-05. What is this for? Why is it a good idea? It appears to be a fairly enormous increase in the size of libpq's API, and I really don't think I want to buy into the idea that libpq should know explicitly about each and every backend datatype. The 100% lack of any documentation in the patch isn't helping you sell it, BTW. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PATCHES] Proposed patch to make mergejoin cost estimation more symmetric
There was some discussion earlier today
http://archives.postgresql.org/pgsql-performance/2007-12/msg00081.php
about how mergejoin cost estimation is not smart about the situation where we will have to scan a lot of rows on one side of the join before reaching the range of values found on the other side (and hence having some chance of finding join pairs). Ideally, that effect should be factored into the startup cost estimate for the join.

This consideration is the exact dual of one that we already do have code for, namely that the merge can stop early if the range of values on one side ends long before that of the other. So I looked into whether that code couldn't be extended to handle this issue too. It turns out that it can be, and it actually becomes more symmetrical because it's considering both max and min values, not just max. Also, I found that there were a couple of new-in-8.3 bugs in that area --- it's been extended to make estimates for descending-order mergejoins, but it wasn't getting that quite right.

Since something needs to be done to that code anyway, I'm considering applying the attached patch. It's a bit larger than I would normally consider to be a good idea for an optional patch in late beta. But then again I wouldn't have the slightest hesitation about back-patching a change of this size after final release, so it seems attractive to put it in now.

Aside from the patch, I have attached a test script that exercises merge join planning for some simple cases, and the plans output by the script in CVS HEAD with/without the patch. The cost estimates with the patch are in line with expectation; the estimates without, not so much. In particular, the existing bug can be seen at work here in that the sixth and eighth test cases ("big join highm on b=h order by b desc" and "big join high on b=h order by b desc") are given unreasonably small cost estimates by the unpatched code.
(Note: the two sets of numbers vary a bit because they were working with nonidentical ANALYZE statistics.)

Any objections to applying the patch?

			regards, tom lane

Index: src/backend/optimizer/path/costsize.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v
retrieving revision 1.189
diff -c -r1.189 costsize.c
*** src/backend/optimizer/path/costsize.c	15 Nov 2007 22:25:15 -0000	1.189
--- src/backend/optimizer/path/costsize.c	6 Dec 2007 23:46:32 -0000
***************
*** 1372,1383 ****
  	double		outer_path_rows = PATH_ROWS(outer_path);
  	double		inner_path_rows = PATH_ROWS(inner_path);
  	double		outer_rows,
! 				inner_rows;
  	double		mergejointuples,
  				rescannedtuples;
  	double		rescanratio;
! 	Selectivity outerscansel,
! 				innerscansel;
  	Selectivity joininfactor;
  	Path		sort_path;		/* dummy for result of cost_sort */
--- 1372,1387 ----
  	double		outer_path_rows = PATH_ROWS(outer_path);
  	double		inner_path_rows = PATH_ROWS(inner_path);
  	double		outer_rows,
! 				inner_rows,
! 				outer_skip_rows,
! 				inner_skip_rows;
  	double		mergejointuples,
  				rescannedtuples;
  	double		rescanratio;
! 	Selectivity outerstartsel,
! 				outerendsel,
! 				innerstartsel,
! 				innerendsel;
  	Selectivity joininfactor;
  	Path		sort_path;		/* dummy for result of cost_sort */
***************
*** 1444,1453 ****
  	 * A merge join will stop as soon as it exhausts either input stream
  	 * (unless it's an outer join, in which case the outer side has to be
  	 * scanned all the way anyway).  Estimate fraction of the left and right
!	 * inputs that will actually need to be scanned.  We use only the first
!	 * (most significant) merge clause for this purpose.  Since
!	 * mergejoinscansel() is a fairly expensive computation, we cache the
!	 * results in the merge clause RestrictInfo.
  	 */
  	if (mergeclauses && path->jpath.jointype != JOIN_FULL)
  	{
--- 1448,1459 ----
  	 * A merge join will stop as soon as it exhausts either input stream
  	 * (unless it's an outer join, in which case the outer side has to be
  	 * scanned all the way anyway).  Estimate fraction of the left and right
!	 * inputs that will actually need to be scanned.  Likewise, we can
!	 * estimate the number of rows that will be skipped before the first
!	 * join pair is found, which should be factored into startup cost.
!	 * We use only the first (most significant
Re: [PATCHES] [COMMITTERS] pgsql: Remove obsoleted README files.
Hannes Eder [EMAIL PROTECTED] writes: Don't try to install a missing file ;) Install.pm needs an update, see attached .diff Ooops :-(. Looks like Magnus got to this mere seconds before I did. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] [DOCS] Partition: use triggers instead of rules
David Fetter [EMAIL PROTECTED] writes: On Fri, Nov 30, 2007 at 12:34:05PM +0530, NikhilS wrote: Another reason to go along with triggers is that COPY honors triggers, but does not honor rules. While trying to do bulk inserts into a parent of partitioned tables where rules are being employed, the COPY operation will not be so straightforward. Does my latest patch attached address this well enough? Applied with revisions and extensions. (I take it you hadn't actually tested the example :-() regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend