Re: [HACKERS] Range Types, constructors, and the type system
On Thu, 2011-06-30 at 12:28 +0200, Florian Pflug wrote:
> Well, arrays are containers, and we need two values to construct a range,

What about empty ranges? What about infinite ranges? It seems quite a bit more awkward to shoehorn ranges into an array than to use a real type (even if it's intermediate and otherwise useless).

> Hm, I guess. I'm still no huge fan of RANGEINPUT, but if we prevent it from being used as a column type and from being used as an argument type, then I guess it's workable...
>
> Btw, what happened to the idea of making RANGE(...) a special syntactic construct instead of a normal function call? Did we discard that for its intrusiveness, or were there other reasons?

It has not been discarded; as far as I'm concerned it's still on the table. The main advantage is that it doesn't require an intermediate type, and that requiring a cast (or some specification of the range type) might be a little more natural. The downside is that, well, it's new syntax, and there's a little inertia there. But if it's actually better, we should do it.

If an intermediate type seems to be problematic, or if people think it's strange to require casting, then I think this is reasonable.

Regards,
	Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
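[To make the tradeoff concrete, here is a sketch of how the intermediate-type approach would look from SQL. All type and function names are hypothetical, since no spelling has been committed:]

```sql
-- An ordinary constructor function returns an opaque RANGEINPUT-style
-- value; a cast resolves it to a concrete range type:
SELECT range(1, 10)::int4range;

-- With several range types over one subtype, the cast is what picks
-- the right one (type names invented for illustration):
SELECT range('a', 'z')::textrange_c;      -- C-collation range
SELECT range('a', 'z')::textrange_en_us;  -- en_US-collation range
```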
Re: [HACKERS] Range Types, constructors, and the type system
On Thu, 2011-06-30 at 09:59 -0700, David E. Wheeler wrote:
> On Jun 30, 2011, at 9:34 AM, Jeff Davis wrote:
>
> > Then how do you get a text range that doesn't correspond to the LC_COLLATE setting?
>
> You cast it.

My original solution was something like this, except involving domains. With a sufficient amount of casting of all arguments to anything involving a range type, it works, but it's a little too un-SQL-like. There was at least one fairly strong objection to my approach, but if you have some further thoughts along that line, I'm open to suggestion.

Also, what if LC_COLLATE is C, and you want to cast to en_US collation? range('a','Z') would be invalid in the C locale, and it would fail before you had a chance to cast it.

> Cast where you need it explicit, and have a reasonable default when it's not cast.

I thought about that, too, but it's not ideal either. Something might start out as the only range type for a given subtype, and not need explicit casts. Then you define another range type over that subtype, and all the original queries break because they are now ambiguous.

I think the fundamental differences with range types that we're dealing with are:

1. Multiple range types might reasonably exist for a single subtype.
2. The order is a fundamental part of the type definition, not just an extra argument useful for operations on the range type.

Regards,
	Jeff Davis
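[Jeff's point 1 can be illustrated as follows; the type name and the CREATE TYPE spelling are hypothetical, not committed syntax:]

```sql
-- While only one range type over text exists, an unadorned constructor
-- call can resolve without a cast:
SELECT range('a', 'z');       -- resolves to the lone text range type

-- Suppose a second range type over the same subtype is then defined:
-- CREATE TYPE textrange_en_us AS RANGE (subtype = text, collation = "en_US");

-- Now the same call is ambiguous, and previously working queries fail:
SELECT range('a', 'z');       -- ERROR: function range(unknown, unknown) is ambiguous
```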
Re: [HACKERS] [ADMIN] PANIC while doing failover (streaming replication)
On Fri, Jul 1, 2011 at 2:18 PM, Mikko Partio wrote:
> Hello list
>
> I have a two-machine cluster with PostgreSQL 9.0.4 and streaming replication. In a normal situation I did a failover -- touched the trigger file in standby to promote it to production mode. I have done this previously without any complications, but now the to-be-production-database had a PANIC and shut itself down. From the logs:
>
> postgres[10751]: [2-1] 2011-06-30 17:25:24 EEST [10751]: [1-1] user=,db= LOG: streaming replication successfully connected to primary
> postgres[10736]: [10-1] 2011-07-01 07:50:29 EEST [10736]: [10-1] user=,db= LOG: trigger file found: /postgresql/data/finish_replication.trigger
> postgres[10751]: [3-1] 2011-07-01 07:50:29 EEST [10751]: [2-1] user=,db= FATAL: terminating walreceiver process due to administrator command
> postgres[10736]: [11-1] 2011-07-01 07:50:30 EEST [10736]: [11-1] user=,db= LOG: redo done at AE/8B1855C8
> postgres[10736]: [12-1] 2011-07-01 07:50:30 EEST [10736]: [12-1] user=,db= LOG: last completed transaction was at log time 2011-07-01 07:50:14.67424+03
> postgres[10736]: [13-1] 2011-07-01 07:50:30 EEST [10736]: [13-1] user=,db= LOG: restored log file "000600AE008B" from archive
> postgres[10736]: [14-1] 2011-07-01 07:50:30 EEST [10736]: [14-1] user=,db= LOG: selected new timeline ID: 7
> postgres[10736]: [15-1] 2011-07-01 07:50:30 EEST [10736]: [15-1] user=,db= LOG: restored log file "0006.history" from archive
> postgres[10736]: [16-1] 2011-07-01 07:50:30 EEST [10736]: [16-1] user=,db= LOG: archive recovery complete
> postgres[10736]: [17-1] 2011-07-01 07:50:30 EEST [10736]: [17-1] user=,db= WARNING: page 68975 of relation base/3711580/4398155 was uninitialized
> postgres[10736]: [18-1] 2011-07-01 07:50:30 EEST [10736]: [18-1] user=,db= WARNING: page 782 of relation base/3711580/4395394 was uninitialized
> postgres[10736]: [19-1] 2011-07-01 07:50:30 EEST [10736]: [19-1] user=,db= WARNING: page 68948 of relation base/3711580/4398155 was uninitialized
> postgres[10736]: [20-1] 2011-07-01 07:50:30 EEST [10736]: [20-1] user=,db= WARNING: page 68986 of relation base/3711580/4398155 was uninitialized
>
> [ there are maybe 50 lines of these WARNING messages for all together six different relations, all indexes ]
>
> postgres[10736]: [73-1] 2011-07-01 07:50:31 EEST [10736]: [73-1] user=,db= PANIC: WAL contains references to invalid pages
> postgres[10727]: [2-1] 2011-07-01 07:50:31 EEST [10727]: [2-1] user=,db= LOG: startup process (PID 10736) was terminated by signal 6: Aborted
> postgres[10727]: [3-1] 2011-07-01 07:50:31 EEST [10727]: [3-1] user=,db= LOG: terminating any other active server processes
>
> Then I started postgres as it was shut down:
>
> postgres[12191]: [1-1] 2011-07-01 07:50:59 EEST [12191]: [1-1] user=,db= LOG: database system was interrupted while in recovery at log time 2011-07-01 07:44:02 EEST
> postgres[12191]: [1-2] 2011-07-01 07:50:59 EEST [12191]: [2-1] user=,db= HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
> postgres[12191]: [2-1] 2011-07-01 07:50:59 EEST [12191]: [3-1] user=,db= LOG: database system was not properly shut down; automatic recovery in progress
> postgres[12191]: [3-1] 2011-07-01 07:50:59 EEST [12191]: [4-1] user=,db= LOG: redo starts at AE/8B13E398
> postgres[12191]: [4-1] 2011-07-01 07:50:59 EEST [12191]: [5-1] user=,db= LOG: consistent recovery state reached at AE/8C00
> postgres[12191]: [5-1] 2011-07-01 07:50:59 EEST [12191]: [6-1] user=,db= LOG: unexpected pageaddr AE/6E00 in log file 174, segment 140, offset 0
> postgres[12191]: [6-1] 2011-07-01 07:50:59 EEST [12191]: [7-1] user=,db= LOG: redo done at AE/8B1855C8
> postgres[12191]: [7-1] 2011-07-01 07:50:59 EEST [12191]: [8-1] user=,db= LOG: last completed transaction was at log time 2011-07-01 07:50:14.67424+03
> postgres[12194]: [1-1] 2011-07-01 07:50:59 EEST [12194]: [1-1] user=,db= LOG: autovacuum launcher started
> postgres[12189]: [1-1] 2011-07-01 07:50:59 EEST [12189]: [1-1] user=,db= LOG: database system is ready to accept connections
>
> Are these log messages something to worry about?

http://archives.postgresql.org/pgsql-hackers/2008-12/msg01335.php

According to the above thread, this issue seems to have existed for a few years and to still be unsolved. Could you provide a self-contained test case?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] add support for logging current role (what to review?)
On Fri, Jul 1, 2011 at 03:36, Alvaro Herrera wrote:
> Excerpts from Stephen Frost's message of jue jun 30 18:35:40 -0400 2011:
>
>> > I know of no such list, and I think this field useful/important enough that people who are using csv logging would want it anyway. +1 on just tacking on the field and causing a flag day for csv users.
>>
>> Honestly, I think it was *me* who raised the issue that we don't have a header for CSV logs and that it sucks for people using CSV files. We've changed it in the past (application_name was added, iirc) and there wasn't much noise of it that I recall. If everyone's happy with that, it's fine by me.
>
> I don't understand why it is such a big deal that 9.1 has a different CSV table definition than 9.0 anyway (or any two release combination). As long as both are clearly and correctly documented in the respective pages, it shouldn't be an issue at all. If anyone attempts to load CSV log files into the wrong table definition, the problem should make itself evident pretty quickly.

I don't see that as a big problem either. A file header would make it easier for tools to detect, though - either a regular CSV header, or a row with the version number always logged as the first line of each file. Making it an actual CSV header has the added benefit of being a standard way that non-pg-specific tools know how to deal with. If it's a pg-specific tool, it's likely going to require version-specific information anyway; this is just one piece of it.

Now, if the order and contents of the fields is made entirely configurable, that basically creates an unlimited number of permutations, and *that* may make things really hard on tool developers. Without a header, it makes the tool developers' work nearly impossible - and harder even with one.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
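[A tool-side sketch of the header-less detection Magnus describes: guess the producing server version from the number of fields in the first csvlog row. The column counts in the table below are illustrative placeholders, not the real csvlog definitions from any PostgreSQL release:]

```python
# Sketch: infer which (hypothetical) server version produced a csvlog
# file by counting fields in its first row. Real tools would also have
# to cope with quoted fields containing commas, which csv.reader does
# handle correctly.
import csv
import io

# Hypothetical mapping of field count -> server version; a real tool
# would fill this in from each release's documented csvlog definition.
CSVLOG_COLUMNS = {
    22: "9.0",   # placeholder: 22 columns before the new field
    23: "9.1",   # placeholder: 23 columns after adding current role
}

def detect_csvlog_version(first_line: str) -> str:
    """Guess the producing server version from the first csvlog row."""
    row = next(csv.reader(io.StringIO(first_line)))
    return CSVLOG_COLUMNS.get(len(row), "unknown")
```

Note that this is exactly the fragility Magnus points out: once the field set becomes configurable, a count-based guess stops working, which is why a header row (CSV-standard or version-stamped) is the more robust design.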
Re: [HACKERS] Range Types, constructors, and the type system
On Thu, 2011-06-30 at 09:58 -0700, David E. Wheeler wrote:
> On Jun 30, 2011, at 9:29 AM, Jeff Davis wrote:
>
> > Right. In that respect, it's more like a record type: many possible record types exist, but you only define the ones you want.
>
> Well, okay. How is this same problem handled for RECORD types, then?

What problem, exactly? For a given list of subtypes, there is only one valid record type.

Also, record is not a great example. The implementation uses at least one pretty horrible hack.

Regards,
	Jeff Davis
Re: [HACKERS] Online base backup from the hot-standby
> On this commitfest, the goal of the patch is to be able to be recovered using "Minimum recovery ending location" in the control file.

Done.

Regards.

--
Jun Ishizuka
NTT Software Corporation
TEL: 045-317-7018
E-Mail: ishizuka@po.ntts.co.jp

standby_online_backup_02.patch
Description: Binary data
Re: [HACKERS] time-delayed standbys
Fujii Masao writes:
> On Fri, Jul 1, 2011 at 2:25 AM, Robert Haas wrote:
>>> Some actions aren't even transactional, such as DROP DATABASE, amongst
>>
>> Good point. We'd probably need to add a timestamp to the drop database record, as that's a case that people would likely want to defend against with this feature.
>
> This means that recovery_target_* code would also need to deal with the DROP DATABASE case.

There is no problem if you use "restore point" names... but of course you lose flexibility (ie: you can't restore to 5 minutes before now).

mmm... a lazy idea: can't we just create a restore point WAL record *before* we actually drop the database? Then we won't need to modify the logic around recovery_target_* (if it is only DROP DATABASE, maybe that's enough to avoid complicating the code), and we can provide that protection since 9.1.

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL
Soporte 24x7, desarrollo, capacitación y servicios
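[A manual approximation of Jaime's idea is already possible with 9.1's named restore points, recorded explicitly just before the destructive action; the database and restore-point names below are illustrative:]

```sql
-- Record a named restore point in the WAL, then perform the
-- destructive action:
SELECT pg_create_restore_point('before_drop_mydb');
DROP DATABASE mydb;

-- A later PITR can then target the name instead of a timestamp,
-- via recovery.conf:
--   recovery_target_name = 'before_drop_mydb'
```

Jaime's proposal would effectively have the server emit such a record implicitly on every DROP DATABASE, so the protection does not depend on the operator remembering to do this.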
Re: [HACKERS] add support for logging current role (what to review?)
Excerpts from Stephen Frost's message of jue jun 30 18:35:40 -0400 2011:
> > I know of no such list, and I think this field useful/important enough that people who are using csv logging would want it anyway. +1 on just tacking on the field and causing a flag day for csv users.
>
> Honestly, I think it was *me* who raised the issue that we don't have a header for CSV logs and that it sucks for people using CSV files. We've changed it in the past (application_name was added, iirc) and there wasn't much noise of it that I recall. If everyone's happy with that, it's fine by me.

I don't understand why it is such a big deal that 9.1 has a different CSV table definition than 9.0 anyway (or any two release combination). As long as both are clearly and correctly documented in the respective pages, it shouldn't be an issue at all. If anyone attempts to load CSV log files into the wrong table definition, the problem should make itself evident pretty quickly.

--
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] time-delayed standbys
On Fri, Jul 1, 2011 at 3:25 AM, Robert Haas wrote:
> On Thu, Jun 30, 2011 at 1:51 PM, Kevin Grittner wrote:
>> I think doing anything in PostgreSQL around this beyond allowing DBAs to trust their server clocks is insane. The arguments for using and trusting ntpd are pretty much identical to the arguments for using and trusting the OS file systems.
>
> Except that implementing our own file system would likely have more benefit and be less work than implementing our own time synchronization, at least if we want it to be reliable.
>
> Again, I am not trying to pretend that this is any great shakes. MySQL's version of this feature apparently does somehow compensate for time skew, which I assume must mean that their replication works differently than ours - inter alia, it probably requires a TCP socket connection between the servers. Since we don't require that, it limits our options in this area, but also gives us more options in other areas. Still, if I could think of a way to do this that didn't depend on time synchronization, then I'd be in favor of eliminating that requirement. I just can't; and I'm inclined to think it isn't possible.
>
> I wouldn't be opposed to having an option to try to detect time skew between the master and the slave and, say, display that information in pg_stat_replication. It might be useful to have that data for monitoring purposes, and it probably wouldn't even be that much code. However, I'd be a bit hesitant to use that data to "correct" the amount of time we spend waiting for time-delayed replication, because it would doubtless be extremely imprecise compared to real time synchronization, and considerably more error-prone. IOW, what you said.

I agree with Robert. It's difficult to implement a time-synchronization feature that can deal with all the cases, and I'm not sure that's really worth taking our time.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] time-delayed standbys
On Fri, Jul 1, 2011 at 2:25 AM, Robert Haas wrote:
>> Some actions aren't even transactional, such as DROP DATABASE, amongst
>
> Good point. We'd probably need to add a timestamp to the drop database record, as that's a case that people would likely want to defend against with this feature.

This means that recovery_target_* code would also need to deal with the DROP DATABASE case.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] reducing the overhead of frequent table locks, v4
On Thu, Jun 30, 2011 at 6:28 PM, Thom Brown wrote:
> On 27 June 2011 15:13, Robert Haas wrote:
>> I didn't get a lot of comments on the previous version of my patch to accelerate table locks.
>>
>> http://archives.postgresql.org/pgsql-hackers/2011-06/msg00953.php
>>
>> Here's a new version anyway. In this version, I have:
>>
>> 1. Made pg_locks show fast-path locks as well as regular locks, with a new Boolean column to distinguish.
>> 2. Renamed fpLWLock to backendLock, on the belief that we're going to want something like this for quite a bit more than just this one patch.
>> 3. Removed some debugging code.
>
> Are you able to benchmark this again, and possibly get Stefan to give it a go too?

I probably can, but there's not going to be any significant difference. The patch hasn't changed much.

Note that the version where Stefan saw a big performance regression when he pounded the server was actually a combination of this patch and the fast vxid locks patch. If we only apply this one, I believe there will still be enough lock manager contention to stifle the worst of those effects. Of course, even if we apply both, there's only going to be a regression on servers with >32 cores, I think, and even then only when the number of clients is greater than the number of cores. Everyone else will see a benefit.

I think that the next thing for me to tackle is the SInvalReadLock spinlock contention, but what I'm really hurting for is some code review.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
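[For reference, with the patch applied the new pg_locks Boolean column would let you separate the two lock classes; the column name "fastpath" used below is an assumption about the patch, not settled:]

```sql
-- Count fast-path vs. regular locks per backend. The "fastpath"
-- column name is assumed; adjust to whatever the patch finally uses.
SELECT pid, fastpath, count(*) AS locks
FROM pg_locks
GROUP BY pid, fastpath
ORDER BY pid, fastpath;
```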
Re: [HACKERS] Patch file questions?
On 06/30/2011 07:13 PM, Casey Havenor wrote:
> When I run the patch I'm getting some Hunk fails? Is this because I'm not using the same version that the author intended the patch for?

Yes. Welcome to the world of patch bitrot.

cheers

andrew
Re: [HACKERS] Patch file questions?
Ok - loaded up Linux - fun stuff.

Figured out how to get the code for PostgreSQL version 9.0.4 - have that in a directory. Got the patch from above... placed it into the same directory. Loaded the dependencies for the final compile and install with ./configure.

BUT when I run the patch I'm getting some hunk fails? Is this because I'm not using the same version that the author intended the patch for?

From the output it looks to be going through the files properly, but it's just not able to insert the new code? I've tried ignoring white space and even tried a force - of course I backed everything up prior to this. :)

- Warmest regards, Casey Havenor

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Patch-file-questions-tp4536031p4540442.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] add support for logging current role (what to review?)
* Alex Hunsaker (bada...@gmail.com) wrote:
> I think if Stephen was proposing 10 fields, or if there was a list of fields we were planning on adding in the next release or 3, it might be worth re-factoring.

I know of at least one person (in an earlier piece of the thread discussing this patch) who was talking about other fields they'd like included in the CSV log which aren't currently. I don't recall what that was, though - I think it might have been something like the line number from inside stored procedures.

> I know of no such list, and I think this field useful/important enough that people who are using csv logging would want it anyway. +1 on just tacking on the field and causing a flag day for csv users.

Honestly, I think it was *me* who raised the issue that we don't have a header for CSV logs and that it sucks for people using CSV files. We've changed it in the past (application_name was added, iirc) and there wasn't much noise of it that I recall. If everyone's happy with that, it's fine by me.

I do want to rework the logging infrastructure (as discussed in the dev meeting), but I see that whole thing as rather orthogonal to this change.

Thanks,
	Stephen

signature.asc
Description: Digital signature
Re: [HACKERS] reducing the overhead of frequent table locks, v4
On 27 June 2011 15:13, Robert Haas wrote:
> I didn't get a lot of comments on the previous version of my patch to accelerate table locks.
>
> http://archives.postgresql.org/pgsql-hackers/2011-06/msg00953.php
>
> Here's a new version anyway. In this version, I have:
>
> 1. Made pg_locks show fast-path locks as well as regular locks, with a new Boolean column to distinguish.
> 2. Renamed fpLWLock to backendLock, on the belief that we're going to want something like this for quite a bit more than just this one patch.
> 3. Removed some debugging code.

Are you able to benchmark this again, and possibly get Stefan to give it a go too?

Thom
Re: [HACKERS] [COMMITTERS] pgsql: Make the visibility map crash-safe.
On Thu, 2011-06-30 at 07:50 -0400, Robert Haas wrote:
> I compare the performance of commit
> 431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (post-patch) with commit
> 431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (pre-patch).

I believe that is a copy/paste error.

Regards,
	Jeff Davis
Re: [HACKERS] hint bit cache v6
On Thu, Jun 30, 2011 at 11:44 AM, Robert Haas wrote:
> On Thu, Jun 30, 2011 at 11:18 AM, Merlin Moncure wrote:
>>> I think the basic problem is that the cache pages are so large. It's hard to make them smaller because that increases the cost of accessing the cache, as you already noted, but at the same time, making an eviction decision on a cache that holds 64K entries based on 100 data points seems like waving around a pretty large-caliber weapon in a fairly uncontrolled fashion. Maybe it works OK 90% of the time, but it's hard to avoid the niggling fear that someone might accidentally get shot.
>>
>> Right...it's essentially a rolling statistical sampling of transaction IDs based on past accesses. Going from say, 100 to 1000 will probably drop your error margin quite a bit, but I really wonder how much benefit you can get from going past that.
>
> An interesting idea might be to forcibly cause a replacement every 100 tuples (or every 1000 tuples) and see if that causes a noticeable performance regression. If it doesn't, that's a good data point.

To test this I disabled the cache check on top of HeapTupleSatisfiesMVCC and forced a cache entry in place of it with a bogus xid (so every tuple scanned was treated as a 'miss'). Scanning 1 million records in memory over and over went up a few percentage points -- converging on about 280ms instead of 270ms.
This is with the miss array bumped to 1000 entries. oprofile said:

regular hinted tuple case:
120  10.2041  advance_aggregates
107   9.0986  heapgettup_pagemode
 77   6.5476  ExecProject
 74   6.2925  heapgetpage
 72   6.1224  ExecProcNode
 72   6.1224  ExecScan
 69   5.8673  advance_transition_function
 66   5.6122  heap_getnext
 58   4.9320  HeapTupleSatisfiesMVCC

hinted tuple with pathological cache entry:
111   8.9588  advance_aggregates
109   8.7974  heapgettup_pagemode
 85   6.8604  ExecProject
 81   6.5375  heapgetpage
 77   6.2147  ExecScan
 70   5.6497  ExecStoreTuple
 70   5.6497  HeapTupleSatisfiesMVCC

This was a fairly short profiling run, but the numbers are fairly consistent. The replacement is fairly cheap... sorting 1000 ints doesn't take all that long; 100 is even faster.

> I think the sour spot for this whole idea is likely to be the case where you have a workload that is 90% read and 10% write, or something like that. On an all-read workload (like pgbench -S), you're hopefully going to converge to a state where all the hint-bits are set, once VACUUM kicks in. And on an all-write workload I think that you might lose the effect in the noise. Not sure if we have an easy way to set up such a workload with pgbench, though.

It's trivial to test with pgbench -- just use the random number facility to generate an id for some table and update where random() > .9. However, this does not generate nearly enough 'misses' to exercise cache replacement in any meaningful way. My workstation VM can punch out about 5k transactions/sec. With only 10% getting a new xid and writing back a tuple to the table, only 500 xids are getting generated per second. At that rate it takes quite a while to burn through the 256k transactions the cache ranges over. Also, this case is not bound by scan performance anyway, making profiling it a little silly -- HeapTupleSatisfiesMVCC is simply not as big a bottleneck in OLTP workloads. Scan-heavy loads are where this matters, and scan-heavy loads naturally tend to generate fewer xids per tuple scanned.
>>> Just for the sake of argument, let's suppose we made an array with 64K elements, each one representing one 64K chunk of the XID space. Each element is a 4-byte unsigned integer, so the array takes up 256kB of memory... probably not good, but I'm just thinking out loud here. Every time we're asked about an XID, we bump the corresponding counter. Periodically (say, every 64K probes) we zip through all the counters and consider performing a cache replacement. We look at the not-already-cached page that has the highest counter value and compare it to the counter value for the cached pages. If it is higher and the difference exceeds some safety margin (to protect against hysteresis), then we do the replacement. I think something like this would allow you to have a lot more information available to make a direct apples-to-apples comparison at cache replacement time, as compared with what you have now.
>>
>> yeah -- what you've done here is basically two things: 1. by mapping static ranges you get to skip the sort (but not the scan) during the replacement, and 2. by vastly expanding the sampling size you eliminate thrashing via noise in the rolling sample. This comes at a significant memory cost and loss of some nimbleness in the cache (i'm not completely sure more aggressive replacement is 'bad') -- although that could be mitigated by replacin
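[A toy model of the counter-array replacement scheme quoted above, just to make the mechanics concrete. The constants, the page-count choice, and the exact eviction rule are illustrative, not taken from the patch:]

```python
# Rough model of the sketched scheme: one counter per 64K-XID page,
# bumped on every probe; every REPLACE_INTERVAL probes, the hottest
# uncached page may evict a cached page if it leads by a safety margin.
XIDS_PER_PAGE = 64 * 1024
NUM_CACHED_PAGES = 4          # illustrative cache size
REPLACE_INTERVAL = 64 * 1024  # probes per sampling period
SAFETY_MARGIN = 100           # hysteresis guard against thrashing

class HintBitCacheModel:
    def __init__(self):
        self.counters = {}    # page number -> probe count this period
        self.cached = set()   # currently cached page numbers
        self.probes = 0

    def probe(self, xid: int) -> bool:
        """Record one visibility-check probe; return True on cache hit."""
        page = xid // XIDS_PER_PAGE
        self.counters[page] = self.counters.get(page, 0) + 1
        self.probes += 1
        hit = page in self.cached
        if self.probes % REPLACE_INTERVAL == 0:
            self._maybe_replace()
        return hit

    def _maybe_replace(self):
        uncached = {p: c for p, c in self.counters.items()
                    if p not in self.cached}
        if uncached:
            hot_page, hot_count = max(uncached.items(), key=lambda pc: pc[1])
            if len(self.cached) < NUM_CACHED_PAGES:
                self.cached.add(hot_page)
            else:
                cold_page = min(self.cached,
                                key=lambda p: self.counters.get(p, 0))
                # Evict only if the challenger leads by the margin.
                if hot_count > self.counters.get(cold_page, 0) + SAFETY_MARGIN:
                    self.cached.discard(cold_page)
                    self.cached.add(hot_page)
        self.counters.clear()  # start a fresh sampling period
```

As Merlin notes, this trades memory (the full counter array) and responsiveness (replacement only at period boundaries) for a much larger sample and no sort at replacement time.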
Re: [HACKERS] [COMMITTERS] pgsql: Enable CHECK constraints to be declared NOT VALID
On Thu, 2011-06-30 at 15:09 -0400, Alvaro Herrera wrote:
> I don't think we have a policy for this, but I have done it for some time now and nobody has complained, so I sort of assumed it was okay. Besides, some of the people pouring the money in do care about it; moreover, it provides a little incentive for other companies that might also be in a position to fund development but lack the "peer approval" of the idea, or a final little push.
>
> So what's the general opinion here?

+1 for adding the sponsor name to the commit message. It will encourage companies more.

--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz

signature.asc
Description: This is a digitally signed message part
Re: [HACKERS] [COMMITTERS] pgsql: Enable CHECK constraints to be declared NOT VALID
On Jun 30, 2011, at 12:09 PM, Alvaro Herrera wrote:
> Robert Hass (whose name I misspelled in the commit message above) just mentioned to me (in an answer to my apologizing about it) that he didn't think that mentioning sponsors for patch development was a good idea.
>
> I don't think we have a policy for this, but I have done it for some time now and nobody has complained, so I sort of assumed it was okay. Besides, some of the people pouring the money in do care about it; moreover, it provides a little incentive for other companies that might also be in a position to fund development but lack the "peer approval" of the idea, or a final little push.
>
> So what's the general opinion here?

I certainly see no harm in it, and contributors at all levels -- including sponsors of new features or fixes -- ought to be acknowledged and thanked.

Best,

David
Re: [HACKERS] [COMMITTERS] pgsql: Enable CHECK constraints to be declared NOT VALID
Excerpts from Alvaro Herrera's message of jue jun 30 11:58:09 -0400 2011:
> Enable CHECK constraints to be declared NOT VALID
>
> [...]
>
> This patch was sponsored by Enova Financial.

Robert Hass (whose name I misspelled in the commit message above) just mentioned to me (in an answer to my apologizing about it) that he didn't think that mentioning sponsors for patch development was a good idea.

I don't think we have a policy for this, but I have done it for some time now and nobody has complained, so I sort of assumed it was okay. Besides, some of the people pouring the money in do care about it; moreover, it provides a little incentive for other companies that might also be in a position to fund development but lack the "peer approval" of the idea, or a final little push.

So what's the general opinion here?

--
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] time-delayed standbys
On Thu, Jun 30, 2011 at 1:51 PM, Kevin Grittner wrote:
> I think doing anything in PostgreSQL around this beyond allowing DBAs to trust their server clocks is insane. The arguments for using and trusting ntpd are pretty much identical to the arguments for using and trusting the OS file systems.

Except that implementing our own file system would likely have more benefit and be less work than implementing our own time synchronization, at least if we want it to be reliable.

Again, I am not trying to pretend that this is any great shakes. MySQL's version of this feature apparently does somehow compensate for time skew, which I assume must mean that their replication works differently than ours - inter alia, it probably requires a TCP socket connection between the servers. Since we don't require that, it limits our options in this area, but also gives us more options in other areas. Still, if I could think of a way to do this that didn't depend on time synchronization, then I'd be in favor of eliminating that requirement. I just can't; and I'm inclined to think it isn't possible.

I wouldn't be opposed to having an option to try to detect time skew between the master and the slave and, say, display that information in pg_stat_replication. It might be useful to have that data for monitoring purposes, and it probably wouldn't even be that much code. However, I'd be a bit hesitant to use that data to "correct" the amount of time we spend waiting for time-delayed replication, because it would doubtless be extremely imprecise compared to real time synchronization, and considerably more error-prone. IOW, what you said.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
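[For monitoring purposes, a rough delay figure is already obtainable on a 9.1 standby - with the caveat Robert raises, that the result mixes replication delay with any clock skew between the two servers:]

```sql
-- On the standby: time since the last replayed commit (9.1's
-- pg_last_xact_replay_timestamp). This measures replay delay, not
-- clock skew, and is only meaningful if the master's and standby's
-- clocks are synchronized.
SELECT now() - pg_last_xact_replay_timestamp() AS approx_replay_delay;
```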
Re: [HACKERS] time-delayed standbys
Kevin, > I think doing anything in PostgreSQL around this beyond allowing > DBAs to trust their server clocks is insane. The arguments for > using and trusting ntpd is pretty much identical to the arguments > for using and trusting the OS file systems. Oh, you don't want to implement our own NTP? Coward! ;-) -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] time-delayed standbys
Josh Berkus wrote: > when we last had the argument about time synchronization, > Kevin Grittner (I believe) pointed out that unsynchronized > replication servers have an assortment of other issues ... like > any read query involving now(). I don't remember making that point, although I think it's a valid one. What I'm sure I pointed out is that we have one central router which synchronizes to a whole bunch of atomic clocks around the world using the normal "discard the outliers and average the rest" algorithm, and then *every single server and workstation on our network synchronizes to that router*. Our database servers are all running on Linux using ntpd. Our monitoring spams us with email if any of the clocks falls outside nominal bounds. (It's been many years since we had a misconfigured server which triggered that.) I think doing anything in PostgreSQL around this beyond allowing DBAs to trust their server clocks is insane. The arguments for using and trusting ntpd are pretty much identical to the arguments for using and trusting the OS file systems. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] time-delayed standbys
On 6/30/11 10:25 AM, Robert Haas wrote: > So I think keeping it defined in terms of time is the > right way forward, even though the need for external time > synchronization is, certainly, not ideal. Actually, when we last had the argument about time synchronization, Kevin Grittner (I believe) pointed out that unsynchronized replication servers have an assortment of other issues ... like any read query involving now(). As the person who originally brought up this hurdle, I felt that his argument defeated mine. Certainly I can't see any logical way to have time delay in the absence of clock synchronization of some kind. Also, I kinda feel like this discussion seems aimed at overcomplicating a feature which only a small fraction of our users will ever use. Let's keep it as simple as possible. As for delay on streaming replication, I'm for it. I think that post-9.1, thanks to pg_basebackup, the number of our users who are doing archive log shipping is going to drop tremendously. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] time-delayed standbys
On Thu, Jun 30, 2011 at 6:45 AM, Simon Riggs wrote: > The only way to control this is with a time delay that can be changed > while the server is running. A recovery.conf parameter doesn't allow > that, so another way is preferable. True. We've talked about making the recovery.conf parameters into GUCs, which would address that concern (and some others). > I think the time problems are more complex than said. The patch relies > upon transaction completion times, but not all WAL records have a time > attached to them. Plus you only used commits anyway, not sure why. For the same reason we do that with the recovery_target_* code - replaying something like a heap insert or heap update doesn't change the user-visible state of the database, because the records aren't visible anyway until the commit record is replayed. > Some actions aren't even transactional, such as DROP DATABASE, amongst Good point. We'd probably need to add a timestamp to the drop database record, as that's a case that people would likely want to defend against with this feature. > others. Consecutive records can be hours apart, so it would be > possible to delay on some WAL records but then replay records that > happened minutes ago, then wait hours for the next apply. So this > patch doesn't do what it claims in all cases. > > Similar discussion on max_standby_delay covered exactly that ground > and went on for weeks in 9.0. IIRC I presented the same case you just > did and we agreed in the end that was not acceptable. I'm not going to > repeat it. Please check the archives. I think this case is a bit different. First, max_standby_delay is relevant for any installation using Hot Standby, whereas this is a feature that specifically involves time. Saying that you have to have time synchronization for Hot Standby to work as designed is more of a burden than saying you need time synchronization *if you want to use the time-delayed recovery feature*. 
Second, and maybe more importantly, no one has come up with an idea for how to make this work reliably in the presence of time skew. Perhaps we could provide a simple time-skew correction feature that would work in the streaming case (though probably not nearly as well as running ntpd), but as I understand your argument, you're saying that most people will want to use this with archiving. I don't see how to make that work without time synchronization. In the max_standby_delay case, the requirement is that queries not get cancelled too aggressively while at the same time not letting the standby get too far behind the master, which leaves some flexibility in terms of how we actually make that trade-off, and we eventually found a way that didn't require time synchronization, which was an improvement. But for a time-delayed standby, the requirement at least AIUI is that the state of the standby lag the master by a certain time interval, and I don't see any way to do that without comparing slave timestamps with master timestamps. If we can find a similar clever trick here, great! But I'm not seeing how to do it. Now, another option here is to give up on the idea of a time-delayed standby altogether and instead allow the standby to lag the master by a certain number of WAL segments or XIDs. Of course, if we do that, then we will not have a feature called "time-delayed standbys". Instead, we will have a feature called "standbys delayed by a certain number of WAL segments (or XIDs)". That certainly caters to some of the same use cases, but I think it is severely lacking in the usability department. I bet the first thing most people will do is to try to figure out how to translate between those metrics and time, and I bet we'll get complaints on systems where the activity load is variable and therefore the time lag for a fixed WAL-segment lag or XID-lag is unpredictable. 
So I think keeping it defined in terms of time is the right way forward, even though the need for external time synchronization is, certainly, not ideal. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
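For reference, the apply decision at the heart of this debate is tiny; the following is a sketch with invented names (microsecond timestamps, as in TimestampTz), not code from the patch. The crux of the thread is visible in the arithmetic: commit_time comes from the master's clock and now from the standby's, so any skew between the two silently shifts the effective delay.

```c
#include <stdint.h>

/* Sketch of the per-commit-record delay check for a time-delayed
 * standby: given the commit timestamp from the WAL record, the
 * configured delay, and the standby's current clock, return how many
 * microseconds recovery should still wait before applying the record.
 * Zero means apply immediately.  Names are illustrative only. */
static int64_t
delay_remaining_usec(int64_t commit_time, int64_t delay_usec, int64_t now)
{
    int64_t apply_at = commit_time + delay_usec;

    return (apply_at > now) ? (apply_at - now) : 0;
}
```

Records without a timestamp of their own (most non-commit records, the DROP DATABASE case mentioned earlier) would have to piggyback on the last known timestamp, which is where the "hours apart" objection bites.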
Re: [HACKERS] time-delayed standbys
On Thu, Jun 30, 2011 at 1:00 PM, Josh Berkus wrote: > On 6/30/11 2:00 AM, Simon Riggs wrote: >>>> Manual (or scripted) intervention is always necessary if you reach disk >>>> 100% full. >>> Wow, that's a pretty crappy failure mode... but I don't think we need >>> to fix it just on account of this patch. It would be nice to fix, of >>> course. >> How is that different to running out of space in the main database? >> >> If I try to pour a pint of milk into a small cup, I don't blame the cup. > > I have to agree with Simon here. ;-) > > We can do some things to make this easier for administrators, but > there's no way to "solve" the problem. And the things we could do would > have to be advanced optional modes which aren't on by default, so they > wouldn't really help the DBA with poor planning skills. Here are my > suggestions: > > 1) Have a utility (pg_archivecleanup?) which checks if we have more than > a specific setting's worth of archive_logs, and breaks replication and > deletes the archive logs if we hit that number. This would also require > some way for the standby to stop replicating *without* becoming a > standalone server, which I don't think we currently have. > > 2) Have a setting where, regardless of standby_delay settings, the > standby will interrupt any running queries and start applying logs as > fast as possible if it hits a certain number of unapplied archive logs. > Of course, given the issues we had with standby_delay, I'm not sure I > want to complicate it further. > > I think we've already fixed the biggest issue in 9.1, since we now have > a limit on the number of WALs the master will keep if archiving is > failing ... yes? That's the only big *avoidable* failure mode we have, > where a failing standby effectively shuts down the master. I'm not sure we changed anything in this area for 9.1. Am I wrong? wal_keep_segments was present in 9.0. 
Using that instead of archiving is a reasonable way to bound the amount of disk space that can get used, at the cost of possibly needing to rebuild the standby if things get too far behind. Of course, in any version, you could also use an archive_command that will remove old files to make space if the disk is full, with the same downside: if the standby isn't done with those files, you're now in for a rebuild. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
On Thu, Jun 30, 2011 at 11:55 AM, Noah Misch wrote: > On Wed, Jun 29, 2011 at 09:42:06AM -0400, Robert Haas wrote: >> On Tue, Jun 28, 2011 at 3:40 PM, Noah Misch wrote: > >> > Here's the call stack in question: >> > >> > RelationBuildLocalRelation >> > heap_create >> > index_create >> > DefineIndex >> > ATExecAddIndex >> > >> > Looking at it again, it wouldn't be the end of the world to add a >> > relfilenode >> > argument to each. None of those have more than four callers. >> >> Yeah. Those functions take an awful lot of arguments, which suggests >> that some refactoring might be in order, but I still think it's >> cleaner to add another argument than to change the state around >> after-the-fact. >> >> > ATExecAddIndex() >> > would then call RelationPreserveStorage() before calling DefineIndex(), >> > which >> > would in turn put things in a correct state from the start. Does that seem >> > appropriate? Offhand, I do like it better than what I had. >> >> I wish we could avoid the whole death-and-resurrection thing >> altogether, but off-hand I'm not seeing a real clean way to do that. >> At the very least we had better comment it to death. > > I couldn't think of a massive amount to say about that, but see what you think > of this level of commentary. > > Looking at this again turned up a live bug in the previous version: if the old > index file were created in the current transaction, we would wrongly remove > its > delete-at-abort entry as well as its delete-at-commit entry. This leaked the > disk file. Fixed by adding an argument to RelationPreserveStorage() > indicating > which kind to remove. 
Test case: > > BEGIN; > CREATE TABLE t AS SELECT * FROM generate_series(1,10) t(n); > CREATE INDEX ti ON t(n); > SELECT pg_relation_filepath('ti'); > ALTER TABLE t ALTER n TYPE int; > ROLLBACK; > CHECKPOINT; > -- file named above should be gone > > I also updated the ATPostAlterTypeCleanup() variable names per discussion and > moved IndexStmt.oldNode to a more-natural position in the structure. On first blush, that looks a whole lot cleaner. I'll try to find some time for a more detailed review soon. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
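The bug Noah describes amounts to filtering the transaction's pending-deletes list on the wrong condition: preserving an index's storage must remove only the delete-at-commit entry, leaving any delete-at-abort entry intact so a rollback still cleans up the file. A schematic sketch with invented types (not PostgreSQL's actual smgr structures):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Each relation file created or dropped in the current transaction
 * carries a pending action: unlink the file at commit, or at abort.
 * Removing only the entries of the requested kind is the fix; the
 * earlier version removed both kinds, so a file created in the same
 * transaction lost its delete-at-abort entry and leaked on rollback. */
typedef struct PendingDelete
{
    unsigned    relfilenode;
    bool        at_commit;          /* unlink at commit, else at abort */
    struct PendingDelete *next;
} PendingDelete;

/* Drop every entry for relfilenode whose kind matches at_commit;
 * returns the (possibly new) list head. */
static PendingDelete *
preserve_storage(PendingDelete *list, unsigned relfilenode, bool at_commit)
{
    PendingDelete **prev = &list;

    while (*prev)
    {
        PendingDelete *cur = *prev;

        if (cur->relfilenode == relfilenode && cur->at_commit == at_commit)
        {
            *prev = cur->next;
            free(cur);
        }
        else
            prev = &cur->next;
    }
    return list;
}
```

Noah's SQL test case above exercises exactly the rollback path this filtering protects.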
Re: [HACKERS] time-delayed standbys
On 6/30/11 2:00 AM, Simon Riggs wrote: >>> Manual (or scripted) intervention is always necessary if you reach disk >>> 100% full. >> Wow, that's a pretty crappy failure mode... but I don't think we need >> to fix it just on account of this patch. It would be nice to fix, of >> course. > How is that different to running out of space in the main database? > > If I try to pour a pint of milk into a small cup, I don't blame the cup. I have to agree with Simon here. ;-) We can do some things to make this easier for administrators, but there's no way to "solve" the problem. And the things we could do would have to be advanced optional modes which aren't on by default, so they wouldn't really help the DBA with poor planning skills. Here are my suggestions: 1) Have a utility (pg_archivecleanup?) which checks if we have more than a specific setting's worth of archive_logs, and breaks replication and deletes the archive logs if we hit that number. This would also require some way for the standby to stop replicating *without* becoming a standalone server, which I don't think we currently have. 2) Have a setting where, regardless of standby_delay settings, the standby will interrupt any running queries and start applying logs as fast as possible if it hits a certain number of unapplied archive logs. Of course, given the issues we had with standby_delay, I'm not sure I want to complicate it further. I think we've already fixed the biggest issue in 9.1, since we now have a limit on the number of WALs the master will keep if archiving is failing ... yes? That's the only big *avoidable* failure mode we have, where a failing standby effectively shuts down the master. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types, constructors, and the type system
On Jun 30, 2011, at 9:34 AM, Jeff Davis wrote: > Then how do you get a text range that doesn't correspond to the > LC_COLLATE setting? You cast it. > Does that mean you couldn't dump/reload from a > system with one collation and get the same values in a system with a > different collation? That would be very strange. No, pg_dump should always explicitly cast things. But there should be a reasonable default behavior if I'm in psql and don't cast. > Or, what about other types that just happen to have multiple useful > total orders? Cast where you need it explicit, and have a reasonable default when it's not cast. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types, constructors, and the type system
On Jun 30, 2011, at 9:29 AM, Jeff Davis wrote: > Right. In that respect, it's more like a record type: many possible > record types exist, but you only define the ones you want. Well, okay. How is this same problem handled for RECORD types, then? >> By default, no range types would exists I believe. > > I was planning to include _some_ by default. Probably not text ranges, > but integer and timestamp[tz] ranges. If nothing else, it makes it > easier to document. +1 David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hint bit cache v6
On Thu, Jun 30, 2011 at 11:18 AM, Merlin Moncure wrote: >> I think the basic problem is that the cache pages are so large. It's >> hard to make them smaller because that increases the cost of accessing >> the cache, as you already noted, but at the same time, making an >> eviction decision on a cache that holds 64K entries based on 100 data >> points seems like waving around a pretty large-caliber weapon in a >> fairly uncontrolled fashion. Maybe it works OK 90% of the time, but >> it's hard to avoid the niggling fear that someone might accidentally >> get shot. > > Right...it's essentially a rolling statistical sampling of transaction > IDs based on past accesses. Going from say, 100 to 1000 will probably > drop your error margin quite a bit but I really wonder how much benefit > you can get from going past that. An interesting idea might be to forcibly cause a replacement every 100 tuples (or every 1000 tuples) and see if that causes a noticeable performance regression. If it doesn't, that's a good data point. I think the sour spot for this whole idea is likely to be the case where you have a workload that is 90% read and 10% write, or something like that. On an all-read workload (like pgbench -S), you're hopefully going to converge to a state where all the hint-bits are set, once VACUUM kicks in. And on an all-write workload I think that you might lose the effect in the noise. Not sure if we have an easy way to set up such a workload with pgbench, though. >> Just for the sake of argument, let's suppose we made an array with 64K >> elements, each one representing one 64K chunk of the XID space. Each >> element is a 4-byte unsigned integer, so the array takes up 256kB of >> memory... probably not good, but I'm just thinking out loud here. >> Every time we're asked about an XID, we bump the corresponding >> counter. Periodically (say, every 64K probes) we zip through all the >> counters and consider performing a cache replacement. 
We look at the >> not-already-cached page that has the highest counter value and compare >> it to the counter value for the cached pages. If it is higher and the >> difference exceeds some safety margin (to protect against hysteresis), >> then we do the replacement. I think something like this would allow >> you to have a lot more information available to make a direct >> apples-to-apples comparison at cache replacement time, as compared >> with what you have now. > > yeah -- what you've done here is basically two things: 1. by mapping > static ranges you get to skip the sort (but not the scan) during the > replacement, and 2. by vastly expanding the sampling size you > eliminate thrashing via noise in the rolling sample. This comes at a > significant memory cost and loss of some nimbleness in the cache (i'm > not completely sure more aggressive replacement is 'bad') -- although > that could be mitigated by replacing at more frequent intervals. Well, that gets at an interesting question: how often SHOULD we replace cache pages? My gut is that it's 10K-100K and your gut is that it's 100-1000, which is a hundred-fold difference. Seems like we need some kind of empirical data to decide what actually works well in real life. > Your range counting strategy might work -- I think to do it right so > that you don't have to map the entire range is going to require two > levels of ranges such that you activate a very wide range first before > looking at 64k subranges. That way you could reduce memory consumption > significantly without sorting during the replacement. I don't think you need two levels of ranges. Just keep track of the minimum and maximum XID covered by the array (these must always be 64M transactions apart, say, if the array has 1024 entries, and each cache page covers 64K XIDs). When you see a given XID, and it's between the minimum and the maximum, then bump the counter in bucket XID/64K. 
If you see an XID that is older than the minimum, then ignore it; we won't consider devoting cache pages to really old XIDs. If you see an XID that is newer than the maximum, then just increase the minimum and maximum by enough to put the new XID in range. For every 64K increment by which you advance the minimum and maximum, you need to zero one entry in the array (since it's no longer covering the same portion of the XID space) but all the other entries remain intact. (There is a small problem here of how to initially set the minimum and maximum to something sensible, and to make sure that they don't get out of whack if a backend sits idle for a few billion transactions, but that seems like it should be solvable.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
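Robert's scheme in the last two paragraphs -- a circular histogram of XID chunks whose window slides forward as new XIDs appear -- can be sketched as follows. The sizes and names are illustrative only, and XID wraparound is ignored:

```c
#include <stdint.h>

#define HIST_BUCKETS    1024        /* window = 1024 * 64K = 64M XIDs */
#define HIST_CHUNK_BITS 16          /* 64K XIDs per bucket */

typedef struct
{
    uint32_t    min_xid;            /* lowest XID covered; chunk-aligned */
    uint32_t    count[HIST_BUCKETS];
} XidHistogram;

/* Count one lookup of xid.  XIDs older than the window are ignored
 * (we would never cache them); XIDs past the window slide it forward,
 * zeroing only the buckets that drop out of range.  Bucket addressing
 * is circular: chunk number modulo HIST_BUCKETS, so advancing by one
 * chunk reuses the bucket that just fell off the bottom. */
static void
hist_bump(XidHistogram *h, uint32_t xid)
{
    uint32_t    chunk = xid >> HIST_CHUNK_BITS;
    uint32_t    min_chunk = h->min_xid >> HIST_CHUNK_BITS;

    if (chunk < min_chunk)
        return;                     /* too old: not a caching candidate */

    while (chunk >= min_chunk + HIST_BUCKETS)
    {
        h->count[min_chunk % HIST_BUCKETS] = 0;
        min_chunk++;
    }
    h->min_xid = min_chunk << HIST_CHUNK_BITS;
    h->count[chunk % HIST_BUCKETS]++;
}
```

As described above, only the buckets that leave the window are zeroed when it advances, so the remaining counters survive intact for the next replacement decision.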
Re: [HACKERS] Range Types, constructors, and the type system
On Wed, 2011-06-29 at 12:34 -0400, Robert Haas wrote: > But now that I'm thinking about this a little more, I'm worried about this > case: > > CREATE TABLE foo AS RANGE('something'::funkytype, 'somethingelse'::funkytype); > DROP TYPE funkytype; > > It seems to me that the first statement had better fail, or else the > second one is going to create a hopeless mess (imagine that a new type > comes along and gets the OID of funkytype). Interesting point. I don't think it's a problem because pseudo-types can't be used that way, so that provides us a mechanism to stop it. But it means that we have to be a little more sure that such values can't persist anywhere. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types, constructors, and the type system
On Wed, 2011-06-29 at 10:15 -0700, David E. Wheeler wrote: > On Jun 29, 2011, at 10:13 AM, Florian Pflug wrote: > > > Because there might be more than one range type for a > > base type. Say there are two range types over text, one > > with collation 'de_DE' and one with collation 'en_US'. > > What would the type of > > range('foo', 'f') > > be? > > The one that corresponds to the current LC_COLLATE setting. Then how do you get a text range that doesn't correspond to the LC_COLLATE setting? Does that mean you couldn't dump/reload from a system with one collation and get the same values in a system with a different collation? That would be very strange. Or, what about other types that just happen to have multiple useful total orders? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types, constructors, and the type system
On Thu, 2011-06-30 at 09:11 +0200, Florian Pflug wrote: > > How would the system catalogs be initialized under that theory: surely > > you're not going to seed (nr. of types) * (nr. of collations) * (nr. of > > opclasses) range types in initdb? > > There's CREATE RANGE. Right. In that respect, it's more like a record type: many possible record types exist, but you only define the ones you want. > By default, no range types would exists I believe. I was planning to include _some_ by default. Probably not text ranges, but integer and timestamp[tz] ranges. If nothing else, it makes it easier to document. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] creating CHECK constraints as NOT VALID
Excerpts from Robert Haas's message of sáb jun 18 23:53:17 -0400 2011: > I agree. That's pretty contorted. How about something like this: > Thanks Jaime and Robert. I have pushed this patch with these fixes. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
On Wed, Jun 29, 2011 at 09:42:06AM -0400, Robert Haas wrote: > On Tue, Jun 28, 2011 at 3:40 PM, Noah Misch wrote: > > Here's the call stack in question: > > > > RelationBuildLocalRelation > > heap_create > > index_create > > DefineIndex > > ATExecAddIndex > > > > Looking at it again, it wouldn't be the end of the world to add a > > relfilenode > > argument to each. None of those have more than four callers. > > Yeah. Those functions take an awful lot of arguments, which suggests > that some refactoring might be in order, but I still think it's > cleaner to add another argument than to change the state around > after-the-fact. > > > ATExecAddIndex() > > would then call RelationPreserveStorage() before calling DefineIndex(), > > which > > would in turn put things in a correct state from the start. Does that seem > > appropriate? Offhand, I do like it better than what I had. > > I wish we could avoid the whole death-and-resurrection thing > altogether, but off-hand I'm not seeing a real clean way to do that. > At the very least we had better comment it to death. I couldn't think of a massive amount to say about that, but see what you think of this level of commentary. Looking at this again turned up a live bug in the previous version: if the old index file were created in the current transaction, we would wrongly remove its delete-at-abort entry as well as its delete-at-commit entry. This leaked the disk file. Fixed by adding an argument to RelationPreserveStorage() indicating which kind to remove. Test case: BEGIN; CREATE TABLE t AS SELECT * FROM generate_series(1,10) t(n); CREATE INDEX ti ON t(n); SELECT pg_relation_filepath('ti'); ALTER TABLE t ALTER n TYPE int; ROLLBACK; CHECKPOINT; -- file named above should be gone I also updated the ATPostAlterTypeCleanup() variable names per discussion and moved IndexStmt.oldNode to a more-natural position in the structure. 
Thanks, nm diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml index a90b4e2..b6fe908 100644 *** a/doc/src/sgml/xindex.sgml --- b/doc/src/sgml/xindex.sgml *** *** 834,846 ALTER OPERATOR FAMILY integer_ops USING btree ADD In a B-tree operator family, all the operators in the family must sort compatibly, meaning that the transitive laws hold across all the data types !supported by the family: if A = B and B = C, then A = !C, and if A < B and B < C, then A < C. For each !operator in the family there must be a support function having the same !two input data types as the operator. It is recommended that a family be !complete, i.e., for each combination of data types, all operators are !included. Each operator class should include just the non-cross-type !operators and support function for its data type. --- 834,848 In a B-tree operator family, all the operators in the family must sort compatibly, meaning that the transitive laws hold across all the data types !supported by the family: if A = B and B = C, then A = C, !and if A < B and B < C, then A < C. Subjecting operands !to any number of implicit casts or binary coercion casts involving only !types represented in the family must not change the result other than to !require a similar cast thereon. For each operator in the family there must !be a support function having the same two input data types as the operator. !It is recommended that a family be complete, i.e., for each combination of !data types, all operators are included. Each operator class should include !just the non-cross-type operators and support function for its data type. *** *** 851,861 ALTER OPERATOR FAMILY integer_ops USING btree ADD by the family's equality operators, even when the values are of different types. This is usually difficult to accomplish when the types have different physical representations, but it can be done in some cases. !Notice that there is only one support function per data type, not one !per equality operator. 
It is recommended that a family be complete, i.e., !provide an equality operator for each combination of data types. !Each operator class should include just the non-cross-type equality !operator and the support function for its data type. --- 853,865 by the family's equality operators, even when the values are of different types. This is usually difficult to accomplish when the types have different physical representations, but it can be done in some cases. !Implicit casts and binary coercion casts among types represented in the !family must preserve this invariant. Notice that there is only one support !function per data type, not one per equality operator. It is recommended !that a family be complete, i.e., provide a
Re: [HACKERS] hint bit cache v6
On Wed, Jun 29, 2011 at 3:18 PM, Robert Haas wrote: >> With the algorithm as coded, to fully flush the cache you'd have to >> find a series of *unhinted* tuples distributed across a minimum of four >> 64k-wide page ranges, with the number of tuples being over the 5% >> noise floor. Furthermore, to eject a cache page you have to have more >> hits than a page already in the cache got -- hits are tracked on the >> existing pages and the candidate pages are effectively compared with the >> existing pages based on hits, with the best four picked. Also, is it >> reasonable to expect the cache to help in these types of cases >> anyways? > > *scratches head* > > Well, surely you must need to reset the counters for the pages you've > chosen to include in the cache at some point. If you don't, then > there's a strong likelihood that after some period of time the pages > you have in there will become so firmly nailed into the cache that > they can never be evicted. yup -- hit counts are reset on replacement. > Yeah, I am inclined to think that going very far in that direction is > going to be a big pile of fail. TBH, I'm somewhat surprised that you > managed to get what you already have to work without a performance > regression. That code path is extremely hot, and injecting more > complexity seems like it ought to be a loser. For purposes of this > review, I'm taking it as given that you've tested this carefully, but > I'll admit to lingering skepticism. You are absolutely right to be skeptical. Here's the rub -- it's incredibly difficult to contrive a set of circumstances where the cache is aggressively replaced, and even if you can somehow coerce that to happen, the underlying data is permanently altered so that it can't happen again. Cheating by mucking around with the xid or altering the visibility routines to force replacement isn't really fair. AFAICT, unhinted tuples tend to be both recent and grouped into a fairly narrow transaction range basically always. 
Sooner or later, at least for tables that matter, a vacuum is going to roll around and smear the bits back into the table. My point is that even for the worst case I could think of -- pgbench continually jamming records into the table -- replacements tend to be quite rare. Let's assume though I'm wrong and the pathological case is likely to happen. > I think the basic problem is that the cache pages are so large. It's > hard to make them smaller because that increases the cost of accessing > the cache, as you already noted, but at the same time, making an > eviction decision on a cache that holds 64K entries based on 100 data > points seems like waving around a pretty large-caliber weapon in a > fairly uncontrolled fashion. Maybe it works OK 90% of the time, but > it's hard to avoid the niggling fear that someone might accidentally > get shot. Right...it's essentially a rolling statistical sampling of transaction IDs based on past accesses. Going from say, 100 to 1000 will probably drop your error margin quite a bit but I really wonder how much benefit you can get from going past that. > Just for the sake of argument, let's suppose we made an array with 64K > elements, each one representing one 64K chunk of the XID space. Each > element is a 4-byte unsigned integer, so the array takes up 256kB of > memory... probably not good, but I'm just thinking out loud here. > Every time we're asked about an XID, we bump the corresponding > counter. Periodically (say, every 64K probes) we zip through all the > counters and consider performing a cache replacement. We look at the > not-already-cached page that has the highest counter value and compare > it to the counter value for the cached pages. If it is higher and the > difference exceeds some safety margin (to protect against hysteresis), > then we do the replacement. 
> I think something like this would allow you to have a lot more
> information available to make a direct apples-to-apples comparison at
> cache replacement time, as compared with what you have now.

yeah -- what you've done here is basically two things: 1. by mapping
static ranges you get to skip the sort (but not the scan) during the
replacement, and 2. by vastly expanding the sampling size you eliminate
thrashing via noise in the rolling sample. This comes at a significant
memory cost and loss of some nimbleness in the cache (i'm not
completely sure more aggressive replacement is 'bad') -- although that
could be mitigated by replacing at more frequent intervals.

> Now, the big problem here is that a 256kB array is probably too big,
> but because we don't want to blow out the lower-level CPU caches and
> also because every time we consider performing a cache replacement
> we're going to want to zero the whole thing, and that has to be fast.
> But we probably don't really need the array to cover the entire XID
> space. vacuum_freeze_min_age defaults to 50 million transactions, and
> vacuum_freeze_table_age defaults to 150 million transac
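The counter-array replacement scheme Robert sketches above can be
modeled as follows. This is a toy illustration in Python for brevity --
the real thing would of course be C in the backend, and every name and
constant below (class name, safety margin, slot count) is invented for
the sketch, not taken from the actual patch:

```python
# Toy model: one hit counter per 64K-XID chunk; every 64K probes,
# consider swapping the best uncached chunk in for the worst cached one.
CHUNK_BITS = 16          # each chunk covers 2^16 XIDs
NCOUNTERS = 1 << 16      # counters covering the full 32-bit XID space
CACHE_SLOTS = 4          # chunks kept in the cache
SAFETY_MARGIN = 10       # hysteresis guard against thrashing

class XidChunkCache:
    def __init__(self):
        self.counters = [0] * NCOUNTERS
        self.cached = set()      # chunk numbers currently cached
        self.probes = 0

    def probe(self, xid):
        """Record an XID lookup; return True on a cache hit."""
        chunk = xid >> CHUNK_BITS
        self.counters[chunk] += 1
        self.probes += 1
        if self.probes % (1 << 16) == 0:
            self.maybe_replace()
        return chunk in self.cached

    def maybe_replace(self):
        # Compare the best not-already-cached chunk against the
        # worst cached chunk, then zero the whole counter array.
        candidates = [(self.counters[c], c) for c in range(NCOUNTERS)
                      if self.counters[c] and c not in self.cached]
        if candidates:
            best_count, best = max(candidates)
            if len(self.cached) < CACHE_SLOTS:
                self.cached.add(best)
            else:
                worst = min(self.cached, key=lambda c: self.counters[c])
                if best_count > self.counters[worst] + SAFETY_MARGIN:
                    self.cached.discard(worst)
                    self.cached.add(best)
        self.counters = [0] * NCOUNTERS
```

The periodic full-array zeroing is what motivates the concern about the
256kB array blowing out lower-level CPU caches.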
Re: [HACKERS] Small patch for GiST: move childoffnum to child
Hi Jeff,

Thank you for the review.

On Thu, Jun 30, 2011 at 8:50 AM, Jeff Janes wrote:
> So I tried provoking situations where this surrounding code section
> would get executed, both patched and unpatched. I have been unable to
> do so--apparently this code is for an incredibly obscure situation
> which I can't induce at will.

Yes, it also seems pretty hard for me to get this code section
executed. I'm going to ask Teodor and Oleg about it.

-- With best regards, Alexander Korotkov.
Re: [HACKERS] Bug in SQL/MED?
Excerpts from 花田 茂's message of jue jun 30 06:00:23 -0400 2011:
> I attached a patch which fixes file_fdw to check required option
> (filename) in its validator function. I think that such requirement
> should be checked again in PlanForeignScan(), as it had been so far.
> Note that this patch requires fdw.patch has been applied.

I think it'd be good to keep the error check in fileGetOptions() just
to ensure that we don't crash in case a catalog has been messed with,
though I'd degrade it to elog(ERROR) from ereport.

-- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] libpq SSL with non-blocking sockets
On 11-06-28 02:14 PM, Martin Pihlak wrote:
> Hmm, I thought I thought about that. There was a check in the
> original patch: "if (conn->sslRetryBytes || (conn->outCount -
> remaining) > 0)" So if the SSL retry buffer was emptied it would
> return 1 if there was something left in the regular output buffer. Or
> did I miss something there?

The issue I saw in the original patch was that at that point in the
function, sslRetryBytes could be zero (if the data was sent) but
conn->outCount - remaining would be an incorrect value, since
"remaining" is the number of bytes left to send from sslRetryBuf, NOT
conn->outBuffer. This is no longer an issue in the updated patch.

I will try to take a closer look at your updated patch in the next few
days.
Re: [HACKERS] how to call the function--pqCatenateResultError()
2011/6/27 _石头 :
> Hello!~
>
> Now I encounter a function call problem in PostgreSQL's psql module!
>
> The situation is as follows: In ./src/bin/psql/common.c, I want to
> call the function pqCatenateResultError(). Function
> pqCatenateResultError() is declared in
> ./src/interfaces/libpq/libpq-int.h
>
>     extern void pqCatenateResultError(PGresult *res, const char *msg);
>
> and is defined in ./src/interfaces/libpq/fe-exec.c
>
>     void
>     pqCatenateResultError(PGresult *res, const char *msg)
>     {
>         PQExpBufferData errorBuf;
>
>         if (!res || !msg)
>             return;
>         initPQExpBuffer(&errorBuf);
>         if (res->errMsg)
>             appendPQExpBufferStr(&errorBuf, res->errMsg);
>         appendPQExpBufferStr(&errorBuf, msg);
>         pqSetResultError(res, errorBuf.data);
>         termPQExpBuffer(&errorBuf);
>     }
>
> To call this function in ./common.c, I include 'libpq-int.h' in
> ./src/bin/psql/common.h, as ./common.c includes the header file
> 'common.h'. But when I use pqCatenateResultError() in ./common.c, I
> first got "undefined reference to pqCatenateResultError()".
>
> So I added 'extern void pqCatenateResultError(PGresult *res, const
> char *msg);' at the beginning of './common.c', but this change makes
> no difference to the result.
>
> I do not know why this happened! Someone help me! Thank you.
>
> There is another situation similar to the one above: Function
> PQexec() is declared in ./src/interfaces/libpq/libpq-fe.h and defined
> in ./src/interfaces/libpq/fe-exec.c
>
>     extern PGresult *PQexec(PGconn *conn, const char *query);
>
> I can call this function with no error happening! These two
> situations puzzled me!~

Are you developing on Windows, by any chance?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise
PostgreSQL Company
Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix
On Thu, Jun 30, 2011 at 5:47 AM, Peter Geoghegan wrote:
> I don't think that the way I've phrased my error messages is
> inconsistent with that style guide, except perhaps the pipe()
> reference, but if you feel it's important to try and use "could not",
> I have no objections.

I like Fujii's rephrasing - we don't usually mention the name of the
system call.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise
PostgreSQL Company
Re: [HACKERS] Adding Japanese README
On Thu, Jun 30, 2011 at 1:49 AM, Hitoshi Harada wrote:
> 2011/6/30 Itagaki Takahiro :
>> On Thu, Jun 30, 2011 at 09:42, Tatsuo Ishii wrote:
>>> BTW I will talk to some Japanese speaking developers about my idea
>>> if community agree to add Japanese README to the source tree so that
>>> I am not the only one who are contributing this project
>>
>>> Now, if someone wanted to set up a web site or Wiki page with
>>> translations, that would be up to them.
>>
>> IMHO, the Wiki approach seems more reasonable than a README file.
>> It will be suitable for adding non-Japanese translations, and
>> non-core developers can join to translate or fix the docs.
>
> +1. If we really want to prove the demand, let's start with Wiki,
> which is less invasive than README (though I doubt such pages would
> be updated finally.)

We could consider adding a note to the README files pointing people to
an appropriate wiki page for translated versions, so that someone who
just checks out the source tree has a place to go and start looking.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise
PostgreSQL Company
Re: [HACKERS] hint bit cache v6
On Thu, Jun 30, 2011 at 12:31 AM, Jim Nasby wrote:
> Would it be reasonable to keep a second-level cache that stores
> individual XIDs instead of blocks? That would provide protection for
> XIDs that are extremely common but don't have a good fit with the
> pattern of XID ranges that we're caching. I would expect this to
> happen if you had a transaction that touched a bunch of data (ie:
> bulk load or update) some time ago (so the other XIDs around it are
> less likely to be interesting) but not old enough to have been frozen
> yet. Obviously you couldn't keep too many XIDs in this secondary
> cache, but if you're just trying to prevent certain pathological
> cases then hopefully you wouldn't need to keep that many.

Maybe, but I think that's probably still papering over the problem.
I'd really like to find an algorithm that bounds how often we can
flush a page out of the cache to some number of tuples significantly
greater than 100. The one I suggested yesterday has that property, for
example, although it may have other problems I'm not thinking of.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise
PostgreSQL Company
Re: [HACKERS] [COMMITTERS] pgsql: Make the visibility map crash-safe.
On Tue, Jun 28, 2011 at 12:29 PM, Robert Haas wrote:
> On Tue, Jun 28, 2011 at 11:44 AM, Cédric Villemain wrote:
>> out of curiosity, does it affect the previous benchmarks of the
>> feature ?
>
> I don't think there's much performance impact, because the only case
> where we actually have to do any real work is when vacuum comes
> through and marks a buffer we're about to update all-visible. It
> would probably be good to run some tests to verify that, though. I'll
> try to set something up.

I did some pgbench runs on the 32-core box Nate Boley gave me access
to. I'm not sure that pgbench is the best workload to test this with,
but I gave it a shot. I used these settings:

checkpoint_segments=64
shared_buffers=8GB
synchronous_commit=off
checkpoint_completion_target=0.9

I compared the performance of commit
431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (post-patch) with commit
431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (pre-patch). I tried 3
15-minute pgbench runs each with one client, eight clients, and
thirty-two clients, on each branch. I used scale factor 100,
reinitializing the entire cluster after each run.
one client (prepatch)
tps = 616.412009 (including connections establishing)
tps = 619.189040 (including connections establishing)
tps = 605.357710 (including connections establishing)

one client (postpatch)
tps = 583.295139 (including connections establishing)
tps = 609.236975 (including connections establishing)
tps = 597.674354 (including connections establishing)

eight clients (prepatch)
tps = 2635.459096 (including connections establishing)
tps = 2468.973962 (including connections establishing)
tps = 2592.889454 (including connections establishing)

eight clients (postpatch)
tps = 2602.226439 (including connections establishing)
tps = 2644.201228 (including connections establishing)
tps = 2725.760364 (including connections establishing)

thirty-two clients (prepatch)
tps = 3889.761627 (including connections establishing)
tps = 4365.501093 (including connections establishing)
tps = 3816.119328 (including connections establishing)

thirty-two clients (postpatch)
tps = 3888.620044 (including connections establishing)
tps = 3614.252463 (including connections establishing)
tps = 4090.430296 (including connections establishing)

I think it's pretty difficult to draw any firm conclusions from this
data. The one- and thirty-two-client results came out a bit slower;
the eight-client results came out a bit faster. But the variation
between branches is less than the inter-run variation on the same
branch, so if there is an effect, this test doesn't clearly capture
it.

Having thought about it a bit, I think that if we ran this test for
long enough, we probably could measure a small effect. pgbench is a
very write-intensive test, so anything that writes additional WAL
records is going to cause checkpoints to happen more frequently, and
there's no denying that this change increases WAL volume slightly.

On another note, these results are also interesting from a scalability
perspective.
The eight-client results are about 4.4x the one-client results, and the
thirty-two-client results are about 6.5x the one-client results.
Needless to say, there's a lot of room for improvement there.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise
PostgreSQL Company
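The quoted scaling factors can be checked directly against the
post-patch averages above (a sanity check only, not part of the
original message):

```python
# Recompute the scalability ratios from the post-patch tps numbers above.
one = (583.295139 + 609.236975 + 597.674354) / 3
eight = (2602.226439 + 2644.201228 + 2725.760364) / 3
thirty_two = (3888.620044 + 3614.252463 + 4090.430296) / 3

ratio8 = eight / one          # comes out near 4.45x for eight clients
ratio32 = thirty_two / one    # comes out near 6.48x for thirty-two clients
```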
Re: [HACKERS] Parameterized aggregate subquery
On 2011-06-30 09:39, Yeb Havinga wrote:
> 9) as remarked in a different thread, more than one clause could be
> pushed down a subquery. The current patch only considers alternative
> paths that have at most one clause pushed down. Is this because of
> the call site of best_inner_subqueryscan, i.e. under one join rel?
> Would it be an idea to consider, for each subquery, only a single
> alternative plan that tries to have all suitable clauses pushed down?

Ehm, please forget this remark, I've mistaken join rel for base rel.

-- Yeb
Re: [HACKERS] time-delayed standbys
On Wed, Jun 29, 2011 at 7:11 PM, Robert Haas wrote:
> I don't really see how that's any different from what happens now. If
> (for whatever reason) the master is generating WAL faster than a
> streaming standby can replay it, then the excess WAL is going to pile
> up someplace, and you might run out of disk space. Time-delaying the
> standby creates an additional way for that to happen, but I don't
> think it's an entirely new problem.

The only way to control this is with a time delay that can be changed
while the server is running. A recovery.conf parameter doesn't allow
that, so another way is preferable.

I think the time problems are more complex than stated. The patch
relies upon transaction completion times, but not all WAL records have
a time attached to them. Plus you only used commits anyway, not sure
why. Some actions aren't even transactional, such as DROP DATABASE,
amongst others. Consecutive records can be hours apart, so it would be
possible to delay on some WAL records but then replay records that
happened minutes ago, then wait hours for the next apply. So this
patch doesn't do what it claims in all cases.

Similar discussion on max_standby_delay covered exactly that ground
and went on for weeks in 9.0. IIRC I presented the same case you just
did and we agreed in the end that was not acceptable. I'm not going to
repeat it. Please check the archives.

So, again +1 for the feature, but -1 for the currently proposed
implementation, based upon review.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7
Support, Training & Services
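The unevenness Simon objects to -- only timestamped (commit) records
can be delayed, so everything between two commits replays immediately
regardless of when it actually happened -- can be seen in a toy replay
loop. This is an illustration only, with invented names, not the
patch's code:

```python
import time

def replay_with_delay(records, delay_secs, now=time.time, sleep=time.sleep):
    """records: (kind, timestamp-or-None) tuples in WAL order.
    Only records carrying a timestamp (commits) can be delayed; any
    record without one -- e.g. a non-transactional DROP DATABASE --
    is applied the moment it is seen."""
    applied = []
    for kind, ts in records:
        if ts is not None:
            wait = (ts + delay_secs) - now()
            if wait > 0:
                sleep(wait)
        applied.append(kind)
    return applied
```

With a simulated clock it is easy to check that the untimestamped
records never wait, even if hours of wall-clock time separated them
from the surrounding commits.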
Re: [HACKERS] [v9.2] Fix leaky-view problem, part 1
On Wed, Jun 29, 2011 at 05:05:22PM +0100, Kohei KaiGai wrote:
> 2011/6/28 Noah Misch :
>> On Tue, Jun 28, 2011 at 10:11:59PM +0200, Kohei KaiGai wrote:
>> CREATE VIEW a AS SELECT * FROM ta WHERE ac = 5;
>> ALTER VIEW a OWNER TO alice;
>> CREATE VIEW b AS SELECT * FROM tb WHERE bc = 6;
>> ALTER VIEW b OWNER TO bob;
>> SELECT * FROM a, b;
>>
>> Both the ac=5 and the bc=6 quals do run at the same depth despite
>> enforcing security for different principals. I can't think of a way
>> that one view owner could use this situation to subvert the security
>> of the other view owner, but I wanted to throw it out.
>
> Even if the view owner sets a trap in his view, we have no way to
> reference variables coming from outside of the view. In the above
> example, even if I added f_leak() into the definition of VIEW A, we
> cannot give it an argument that references VIEW B.

Good point. Yes, it should be rigorously safe on that account.

>> I was referring to this paragraph:
>>
>>   On the technical side, I am pretty doubtful that the approach of
>>   adding a nestlevel to FuncExpr and RelOptInfo is the right way to
>>   go. I believe we have existing code (to handle left joins) that
>>   prevents quals from being pushed down too far by fudging the set
>>   of relations that are supposedly needed to evaluate the qual. I
>>   suspect a similar approach would work here.
>
> It seems to me the latter half of this paragraph is talking about the
> problem of unexpected qualifier push-down over the security barrier;
> I'm trying to solve that problem with the part.2 patch.
> The scenario the part.1 patch tries to solve is the order in which
> qualifiers are executed, not unexpected push-down.

Okay, you're probably correct that it wasn't referring to the topic at
hand. I'm still suspicious of the silent assumption about how quals
can be assigned to plan nodes, but I don't have any concrete ideas for
avoiding that.
>>> In addition, implementation will become complex, if both the
>>> qualifiers pulled up from security barrier views and the qualifiers
>>> pulled up from regular views are mixed within a single qualifier
>>> list.
>>
>> I only scanned the part 2 patch, but isn't the bookkeeping already
>> happening for its purposes? How much more complexity would we get to
>> apply the same strategy to the behavior of this patch?
>
> If conditional, what criteria should we use to reorder the
> qualifiers? The current patch checks the depth first, then checks
> cost if the depth is the same. It is quite a simple rule. I have no
> idea what criteria could order mixed qualifiers coming from
> security-barrier views and regular views.

Let's see. Every qual list will have some depth d such that all quals
having depth >= d are security-relevant, and all others are not
security-relevant. (This does not hold for all means of identifying
security-relevant quals, but it does hold for the CREATE SECURITY
VIEW/reloptions strategy proposed in your part 2 patch.) Suppose you
track whether each Query node represents a security view, then only
increment the qualifier depth for such Query nodes, rather than all
Query nodes. The tracked depth then becomes a security partition
depth. Keep the actual sorting algorithm the same. (Disclaimer: I
haven't been thinking about this nearly as long as you have, so I may
be missing something relatively obvious.)

As it stands, the patch badly damages the performance of this example:

CREATE FUNCTION expensive(int) RETURNS boolean LANGUAGE sql
  AS 'SELECT pg_sleep(1); SELECT true' COST 100;
CREATE TABLE t(c) AS SELECT * FROM generate_series(1,3);
EXPLAIN ANALYZE
  SELECT * FROM (SELECT * FROM t WHERE expensive(c)) t0 WHERE c = 2;

That doesn't even use a view, let alone a security view. While I like
the patch's current simplicity, we need to narrow its impact.
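The ordering criterion under discussion -- security partition depth
dominates, cost breaks ties within the same depth -- can be sketched in
a few lines. This is an illustration only (Python for brevity, field
names invented), not the patch's implementation:

```python
# Order qualifier clauses so that quals from deeper inside security
# views always run before shallower (potentially leaky) quals, and
# cheaper quals run first among those at the same depth.
def order_quals(quals):
    """quals: list of dicts with 'security_depth' and 'cost' keys,
    where security_depth is incremented only when crossing a security
    view, per the suggestion above."""
    return sorted(quals, key=lambda q: (-q["security_depth"], q["cost"]))
```

Under this rule a cheap leaky qual can never jump ahead of a qual
pulled up from inside a security barrier, while ordinary cost-based
ordering still applies among quals of equal depth.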
Thanks, nm
Re: [HACKERS] Range Types, constructors, and the type system
On Jun29, 2011, at 17:41 , Jeff Davis wrote:
>> Is it? That's actually too bad, since I kinda like it. But anyway,
>> if that's a concern it could also be
>> range_bounds(ARRAY[1,2]::int8range, '(]')
>
> What type would the result of that be? What value?

ARRAY[1,2]::int8range would return an int8range instance representing
the range [1,2] (i.e. the set of values {1,2}). range_bounds would
then modify the left bound to be exclusive, and thus return the
int8range (1,2] (i.e. the set of values {2}). range_bounds would have
the signature range_bounds(anyrange) returns anyrange.

I do think we'll probably want to have functions to modify the
boundary type (open or closed) anyway, so it wouldn't be that huge of
a deal if the range constructor didn't let you specify them.

Empty ranges would be constructed by ARRAY[]::int8range (Hm, ok, now
I'm cheating... Currently you'd need to write ARRAY[]::int8[]::int8range,
but fixing that only needs a few lines in the transformExpression*
function that makes ARRAY[]::int8[] work).

>>> * It still suffers similar problems as casting back and forth to
>>> text: ANYARRAY is too general, doesn't really take advantage of the
>>> type system, and not a great fit anyway.
>>
>> I believe it alleviates the gravest problems of casting back and
>> forth to text. It doesn't have quoting issues and it doesn't
>> potentially lose information.
>
> I think it still circumvents the type system to a degree. We're just
> putting stuff in an array with no intention of really using it that
> way.

Well, arrays are containers, and we need two values to construct a
range, so putting them into a container first and then creating the
range from that doesn't seem so bad to me. We don't use the full set
of features that arrays provide, since we only ever expect zero, one
or two entries. But I don't think this is different from functions
that only support single-dimensional arrays - they too choose to use
only a subset of the feature set of arrays.
>> In any case, I wouldn't expect this to *stay* the only way to
>> construct a range forever. But it does have its virtues for a first
>> incarnation of range types, I believe, mostly because it's
>> completely unintrusive and won't cause any backwards-compatibility
>> headaches in the future.
>
> I'm not sure that your overloading of arrays is completely immune
> from backwards-compatibility problems, should we decide to change it
> later.
>
> But regardless, we have quite a lot of time to make a decision before
> 9.2 is released; so let's do it once and do it right.
>
>> I fear that the intermediate type will turn out to be quite
>> intrusive, at least if we try to handle all the corner cases and
>> loose ends. And if we don't, I'm concerned that we're painting
>> ourselves into a corner here...
>
> Can you expand on some of the corner cases and loose ends you're
> concerned about? Does marking it as a pseudotype and making the IO
> functions throw exceptions handle them?

Hm, I guess. I'm still no huge fan of RANGEINPUT, but if we prevent it
from being used as a column type and from being used as an argument
type, then I guess it's workable...

Btw, what happened to the idea of making RANGE(...) a special
syntactic construct instead of a normal function call? Did we discard
that for its intrusiveness, or were there other reasons?

best regards, Florian Pflug
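For illustration only, the array-constructor semantics Florian
describes -- empty array gives the empty range, a two-element array
gives the closed range [a,b], and a range_bounds-style function changes
only the inclusivity -- might be modeled like this. The real proposal
is SQL-level; every name in this Python sketch is hypothetical:

```python
class ToyRange:
    """Toy range mirroring the proposed constructor semantics:
    [] -> empty range, [a, b] -> closed range [a, b]."""
    def __init__(self, elems):
        if len(elems) == 0:
            self.empty, self.lower, self.upper = True, None, None
            self.bounds = "[]"
        elif len(elems) == 2:
            self.empty = False
            self.lower, self.upper = elems
            self.bounds = "[]"        # default: both bounds inclusive
        else:
            raise ValueError("expected zero or two elements")

    def with_bounds(self, bounds):
        # Analogue of the suggested range_bounds(anyrange) modifier:
        # same endpoints, different inclusivity, e.g. "(]" or "[)".
        if bounds not in ("[]", "[)", "(]", "()"):
            raise ValueError("bad bounds spec")
        r = ToyRange([] if self.empty else [self.lower, self.upper])
        r.bounds = bounds
        return r

    def contains(self, x):
        if self.empty:
            return False
        lo_ok = x > self.lower if self.bounds[0] == "(" else x >= self.lower
        hi_ok = x < self.upper if self.bounds[1] == ")" else x <= self.upper
        return lo_ok and hi_ok
```

In these terms, ARRAY[1,2]::int8range is ToyRange([1, 2]) -- the set
{1,2} -- and range_bounds(..., '(]') yields (1,2], the set {2}, exactly
as described above.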
Re: [HACKERS] Review of patch Bugfix for XPATH() if expression returns a scalar value
On Jun30, 2011, at 10:33 , Radosław Smogura wrote:
> You may manually enter invalid xml too, you don't need xpath for
> this.

Please give an example, instead of merely asserting it. I get

postgres=# select '<'::xml;
ERROR:  invalid XML content at character 8

> Much more, in PostgreSQL 9.0.1, compiled by Visual C++ build 1500,
> 64-bit, I executed
> SELECT XMLELEMENT(name "a", XMLATTRIBUTES(''::xml as v))
> and it looks like I got correct output, and when I looked with a text
> editor into the table files I saw the same value.

I fail to see your point. XMLATTRIBUTES currently escapes the value
unconditionally, which happens to work as expected in this case. But
try

SELECT XMLELEMENT(name "a", XMLATTRIBUTES('<'::xml as v))

which returns output that I guess isn't what one would expect,
otherwise one wouldn't have added the cast to xml. I believe the
latter example should probably return the singly-escaped attribute
value instead, just as XMLELEMENT(name "a", '<'::xml) escapes the
element content.

But again, this has little to do with the patch at hand, and should
therefore be discussed separately. The fact that XPATH's failure to
escape its return value happens to fit XMLATTRIBUTES's failure to
*not* escape its input value doesn't prove that both functions are
correct. Especially not if other functions like XMLELEMENT *don't*
escape their input values if they're already of type XML.

> But indeed, now PostgreSQL is not type safe against XML type. See
> SELECT XMLELEMENT(name "root", '<', (xpath('//text()', '<'))[1])).

That is *exactly* what my other patch fixes, the one you deem
undesirable. With that other patch applied, I get

 xmlelement
------------
 <<

which is correct. So here too I fail to see your point.

> You found some problem with escaping, but the solution is bad. I
> think the problem lies with the XML type, which may be used to hold
> strings and proper xml.

But it's not *supposed* to hold arbitrary strings. If it was, the
input function wouldn't check whether the value was valid.
Do you agree that, for any type X, "SELECT value_of_type_X::text::X"
should never throw an error? Because that, IMHO quite basic, guarantee
is broken without proper escaping of XML values. Try

SELECT (XPATH('/text()', '<'::xml))[1]::text::xml

on unpatched 9.0. It fails, which IMNSHO is very clearly a bug.

> For example, the above query can't distinguish if
> (xpath('//text()', '<'))[1] is xml text, or an xml element.

Text *is* a kind of node in XML. There are different types of nodes,
one of which is the "text node". The fact that it's not surrounded by
<..> doesn't make a difference, and doesn't change the quoting rules
one bit.

> For me adding support for scalar is really good and needed, but
> escaping is not.

So far, I remain unconvinced by your arguments. What you argue for
might be sensible if we *didn't* have an XML type and instead simply
had stuff like XPATH() for type TEXT. But we *do* have a type XML,
which *does* validate its input, so I fail to see how allowing values
of type XML which *don't* pass that input validation can be anything
but a bug.

Compare this to the textual types. We've gone through some pain in the
past to guarantee that textual types never hold values which aren't
valid text in the database's encoding (e.g., we prevent textual values
from containing byte sequences which aren't valid UTF-8 if the
database encoding is UTF-8). It follows that we should do the same for
XML types.

best regards, Florian Pflug
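The round-trip invariant Florian argues for -- serializing an XML text
node and reading the result back must never fail -- hinges on '<' being
emitted as '&lt;'. The standard escaping rules can be demonstrated with
Python's stdlib (an illustration of the XML rules only, unrelated to
the backend code under discussion):

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

# A text node containing '<' must be escaped when serialized...
serialized = "<a>%s</a>" % escape("<")
# ...so that feeding the serialized form back to a parser recovers the
# original value; unescaped, "<a><</a>" would simply fail to parse,
# which is the text::xml cast failure described above.
assert ET.fromstring(serialized).text == "<"
assert unescape(escape("a < b & c")) == "a < b & c"
```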
Re: [HACKERS] Bug in SQL/MED?
(2011/06/29 21:23), Albe Laurenz wrote:
> If you invoke any of the SQL/MED CREATE or ALTER commands, the
> validator function is only called if an option list was given.
>
> That means that you cannot enforce required options at object
> creation time, because the validator function is not always called.
> I consider that unexpected and undesirable behaviour.
>
> Example:
> CREATE EXTENSION file_fdw;
> CREATE FOREIGN DATA WRAPPER file HANDLER file_fdw_handler VALIDATOR file_fdw_validator;
> CREATE SERVER file FOREIGN DATA WRAPPER file;
> CREATE FOREIGN TABLE flat (id integer) SERVER file OPTIONS (format 'csv');
> SELECT * FROM flat;
> ERROR:  filename is required for file_fdw foreign tables
>
> Now file_fdw does not try to enforce the "filename" option, but it
> would be nice to be able to do so.
>
> The attached patch would change the behaviour so that the validator
> function is always called.
>
> If that change meets with approval, should file_fdw be changed so
> that it requires "filename" at table creation time?

I think this proposal is reasonable. IMHO this fix is trivial enough
to be merged into the 9.1 release.

I attached a patch which fixes file_fdw to check the required option
(filename) in its validator function. I think that such a requirement
should be checked again in PlanForeignScan(), as it has been so far.
Note that this patch requires fdw.patch to have been applied.

With this patch, file_fdw rejects commands which would eliminate the
filename option as a result. Please see the attached sample.txt for
details.

BTW, I noticed that the current document says just "the validator
function is called for CREATE FDW/SERVER/FOREIGN TABLE", and doesn't
mention ALTER commands or USER MAPPING. I'll post another patch for
this issue later.
Regards,
-- Shigeru Hanada

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 466c015..57e522f 100644
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
*** file_fdw_validator(PG_FUNCTION_ARGS)
*** 215,220 ****
--- 215,230 ----
  	 */
  	ProcessCopyOptions(NULL, true, other_options);
  
+ 	/*
+ 	 * filename is a required option.  Validity of other options including
+ 	 * relative ones have been checked in ProcessCopyOptions().
+ 	 * Note: We don't care its value even though it might be empty, because
+ 	 * COPY comand doesn't.
+ 	 */
+ 	if (catalog == ForeignTableRelationId && filename == NULL)
+ 		ereport(ERROR,
+ 				(errmsg("filename is required for file_fdw foreign tables")));
+ 
  	PG_RETURN_VOID();
  }

*** fileGetOptions(Oid foreigntableid,
*** 286,295 ****
  		}
  		prev = lc;
  	}
- 	if (*filename == NULL)
- 		ereport(ERROR,
- 				(errcode(ERRCODE_FDW_UNABLE_TO_CREATE_REPLY),
- 				 errmsg("filename is required for file_fdw foreign tables")));
  	*other_options = options;
  }
--- 296,301 ----

*** filePlanForeignScan(Oid foreigntableid,
*** 308,313 ****
--- 314,323 ----
  	/* Fetch options --- we only need filename at this point */
  	fileGetOptions(foreigntableid, &filename, &options);
+ 	if (filename == NULL)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FDW_UNABLE_TO_CREATE_REPLY),
+ 				 errmsg("filename is required for file_fdw foreign tables")));
  
  	/* Construct FdwPlan with cost estimates */
  	fdwplan = makeNode(FdwPlan);

diff --git a/contrib/file_fdw/input/file_fdw.source b/contrib/file_fdw/input/file_fdw.source
index 9ff7235..8d6dfa3 100644
*** a/contrib/file_fdw/input/file_fdw.source
--- b/contrib/file_fdw/input/file_fdw.source
*** CREATE FOREIGN TABLE tbl () SERVER file_
*** 59,64 ****
--- 59,65 ----
  ');                         -- ERROR
  CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null '
  ');                         -- ERROR
+ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv');  -- ERROR
  CREATE FOREIGN TABLE agg_text (
  	a	int2,

diff --git a/contrib/file_fdw/output/file_fdw.source b/contrib/file_fdw/output/file_fdw.source
index 2ba36c9..6cc6746 100644
*** a/contrib/file_fdw/output/file_fdw.source
--- b/contrib/file_fdw/output/file_fdw.source
*** ERROR:  COPY delimiter cannot be newline
*** 75,80 ****
--- 75,82 ----
  CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null '
  ');                         -- ERROR
  ERROR:  COPY null representation cannot use newline or carriage return
+ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv');  -- ERROR
+ ERROR:  filename is required for file_fdw foreign tables
  CREATE FOREIGN TABLE agg_text (
  	a	int2,
  	b	float4

postgres=# CREATE FOREIGN TABLE foo () SERVER file;
ERROR:  filename is required for file_fdw for
Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix
On 30 June 2011 08:58, Heikki Linnakangas wrote:
> Here's a WIP patch with some mostly cosmetic changes I've done this
> far. I haven't tested the Windows code at all yet. It seems that
> no-one is objecting to removing silent_mode altogether, so I'm going
> to do that before committing this patch.

I'm mostly happy with the changes you've made, but I note:

Fujii is of course correct in pointing out that
InitPostmasterDeathWatchHandle() should be a static function.

s/the implementation, as described in unix_latch.c/the implementation/
- This is my mistake. I see no reason to mention the .c file. Use
ctags.

Minor niggle, but there is a little errant whitespace at the top of
fork_process.c.

(wakeEvents & WL_TIMEOUT) != 0 -- I was going to note that this was
redundant, but then I remembered that stupid MSVC warning, which I
wouldn't have seen here because I didn't use it for my Windows build
due to an infuriating issue with winflex (our own Cygwin-built version
of flex for Windows). I wouldn't have expected that when it was set to
build C though. I'm not sure exactly why it isn't necessary in other
places where we're (arguably) doing the same thing.

On 30 June 2011 07:36, Fujii Masao wrote:
> ReleasePostmasterDeathWatchHandle() can call ereport(FATAL) before
> StartChildProcess() or BackendStartup() calls on_exit_reset() and
> resets MyProcPid. This looks unsafe. If that ereport(FATAL) is
> unfortunately called, a process other than postmaster would perform
> the postmaster's proc-exit handlers. And that ereport(FATAL) would
> report a wrong pid when %p is specified in log_line_prefix. What
> about closing the pipe in ClosePostmasterPorts() and removing
> ReleasePostmasterDeathWatchHandle()?

Hmm. That is a valid criticism. I'd rather move the
ReleasePostmasterDeathWatchHandle() call into ClosePostmasterPorts()
though.
> http://developer.postgresql.org/pgdocs/postgres/error-style-guide.html
> According to the error style guide, I think that it's better to change
> the following messages:

I don't think that the way I've phrased my error messages is inconsistent with that style guide, except perhaps the pipe() reference, but if you feel it's important to try and use "could not", I have no objections.

-- 
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] time-delayed standbys
On Thu, Jun 30, 2011 at 2:56 AM, Robert Haas wrote:
> On Wed, Jun 29, 2011 at 9:54 PM, Josh Berkus wrote:
>>> I am not sure exactly how walreceiver handles it if the disk is full.
>>> I assume it craps out and eventually retries, so probably what will
>>> happen is that, after the standby's pg_xlog directory fills up,
>>> walreceiver will sit there and error out until replay advances enough
>>> to remove a WAL file and thus permit some more data to be streamed.
>>
>> Nope, it gets stuck and stops there. Replay doesn't advance unless you
>> can somehow clear out some space manually; if the disk is full, the disk
>> is full, and PostgreSQL doesn't remove WAL files without being able to
>> write files first.
>>
>> Manual (or scripted) intervention is always necessary if you reach disk
>> 100% full.
>
> Wow, that's a pretty crappy failure mode... but I don't think we need
> to fix it just on account of this patch. It would be nice to fix, of
> course.

How is that different to running out of space in the main database? If I try to pour a pint of milk into a small cup, I don't blame the cup.

-- 
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/6/30 Yeb Havinga :
> On 2011-06-29 19:22, Hitoshi Harada wrote:
>>
>> Other things are all good points. Thanks for elaborate review!
>> More than anything, I'm going to fix the 6) issue, at least to find the
>> cause.
>>
> Some more questions:
> 8) why are cheapest start path and cheapest total path in
> best_inner_subqueryscan the same?

Because best_inner_indexscan has the two. Actually, one of them is enough so far; I aligned it with the existing interface, but they could be collapsed into one.

> 10) I have a hard time imagining use cases that will actually result in an
> alternative plan, especially since not all subqueries are allowed to have
> quals pushed down into them, and, like Simon Riggs pointed out, many users
> will write queries like this with the subqueries pulled up. If it is the
> case that the subqueries that can't be pulled up have a large overlap with
> the ones that are not pushdown safe (limit, set operations etc), there
> might be few actual use cases for this patch.

I have seen many cases where this planner hack would help significantly, and they were difficult to rewrite. Why were they difficult to rewrite? Because the quals on size_m (and they had quals on size_l as well, in fact) are usually very complicated (5-10 op clauses), and the join+agg part is itself a kind of subquery inside another big query. Of course there are workarounds, like splitting the statement in two: filter size_m first, then aggregate size_l by the result of the first statement. However, that goes against instinct. The reason a planner exists in an RDBMS is to let users write statements as simply as they need. I don't know if the example I raise here is common or not, but I believe it represents a "one to many" relation in a simple form, so there should be many users who are currently just living with the slow query performance.

> I think the most important thing for this patch to go forward is to have a
> few examples, from which it's clear that the patch is beneficial.
What would be good examples to show the benefit of the patch? I guess the test case of size_m/size_l shows it. What is lacking in that case, do you think?

Regards,

-- 
Hitoshi Harada
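To make the size_m/size_l pattern concrete, here is a minimal sketch (SQLite via Python is used purely as a harness; the table names follow the thread, but the columns and data are invented for illustration). It shows the manual "push the qual down yourself" rewrite that the planner patch aims to make unnecessary: both formulations return the same answer, but in the second one the big table is only aggregated for rows that can actually join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE size_m (id INTEGER PRIMARY KEY, attr TEXT);
    CREATE TABLE size_l (m_id INTEGER, v INTEGER);
""")
# Small "master" table: 100 rows, 10 of which match the filter.
cur.executemany("INSERT INTO size_m VALUES (?, ?)",
                [(i, "hot" if i % 10 == 0 else "cold") for i in range(100)])
# Large "detail" table: 1000 rows, 10 per master row.
cur.executemany("INSERT INTO size_l VALUES (?, ?)",
                [(i % 100, i) for i in range(1000)])

# Natural formulation: aggregate everything, then join and filter.
natural = cur.execute("""
    SELECT m.id, agg.s
    FROM size_m m
    JOIN (SELECT m_id, sum(v) AS s FROM size_l GROUP BY m_id) agg
      ON agg.m_id = m.id
    WHERE m.attr = 'hot'
    ORDER BY m.id
""").fetchall()

# Manual workaround: push the filtering join into the subquery, so only
# matching groups are aggregated at all.
pushed = cur.execute("""
    SELECT m.id, agg.s
    FROM size_m m
    JOIN (SELECT l.m_id, sum(l.v) AS s
          FROM size_l l
          JOIN size_m m2 ON m2.id = l.m_id
          WHERE m2.attr = 'hot'
          GROUP BY l.m_id) agg
      ON agg.m_id = m.id
    ORDER BY m.id
""").fetchall()

assert natural == pushed  # same result; the patch wants the cheaper
                          # plan without this manual rewrite
```

The point of the patch is that a planner with parameterized subquery scans could reach the second plan shape automatically from the first, "natural" SQL.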
[HACKERS] Re: Review of patch Bugfix for XPATH() if expression returns a scalar value
On Wed, 29 Jun 2011 22:26:39 +0200, Florian Pflug wrote:
> On Jun29, 2011, at 19:57 , Radosław Smogura wrote:
>> This is review of patch
>> https://commitfest.postgresql.org/action/patch_view?id=565
>> "Bugfix for XPATH() if expression returns a scalar value"
>>
>> SELECT XMLELEMENT(name root, XMLATTRIBUTES(foo.namespace AS sth))
>> FROM (SELECT (XPATH('namespace-uri(/*)', x))[1] AS namespace
>> FROM (VALUES (XMLELEMENT(name "root", XMLATTRIBUTES('foo [...]
>> It's clearly visible that value from attribute is [...] in attribute
>> values.
>
> So if XMLATTRIBUTES, like XMLELEMENT, never escaped values which are
> already of type XML, the following piece of code
>   XMLELEMENT(name "a", XMLATTRIBUTES(''::xml as v))
> would produce output which, alas, is not well-formed :-(
>
> The correct solution, I believe, is to teach XMLATTRIBUTES to only escape
> those characters which are invalid in attribute values but valid in XML
> (probably only '<' and '>', but I'll check). If there are no objections,
> I'll prepare a separate patch for that. I don't want to include it in
> this patch because it's really a distinct issue (it even modifies a
> different function).
>
> best regards, Florian Pflug

You may manually enter invalid xml too; you don't need xpath for this. What's more, in PostgreSQL 9.0.1, compiled by Visual C++ build 1500, 64-bit, I executed

    SELECT XMLELEMENT(name "a", XMLATTRIBUTES(''::xml as v))

and it looks like I got correct output; when I looked into the table files with a text editor I saw the same value. I will check on the latest master whether I can reproduce this.

But indeed, PostgreSQL is currently not type safe with respect to the XML type. See

    SELECT XMLELEMENT(name "root", '<', (xpath('//text()', '<'))[1])

You found a real problem with escaping, but the solution is bad, I think; the problem lies with the XML type, which may be used to hold both plain strings and proper xml. For example, the above query can't distinguish whether (xpath('//text()', '<'))[1] is xml text or an xml element. For me, adding support for scalars is really good and needed, but the escaping change is not.
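As a neutral check of the well-formedness point under discussion (a quick sketch in Python, not PostgreSQL code): a raw '<' inside an attribute value makes the document ill-formed, while the entity-escaped form parses and round-trips to the same value. This is exactly why an attribute serializer must escape '<' even in values that are "already XML".

```python
import xml.etree.ElementTree as ET

bad = '<a v="<"/>'       # raw '<' inside an attribute value
good = '<a v="&lt;"/>'   # the same value, entity-escaped

try:
    ET.fromstring(bad)
    bad_is_wellformed = True
except ET.ParseError:
    bad_is_wellformed = False

elem = ET.fromstring(good)

assert not bad_is_wellformed     # raw '<' => not well-formed
assert elem.attrib["v"] == "<"   # escaped form round-trips to '<'
```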
Regards,
Radosław Smogura
Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix
On 30.06.2011 09:36, Fujii Masao wrote:
> On Sat, Jun 25, 2011 at 10:41 AM, Peter Geoghegan wrote:
>> Attached is patch that addresses Fujii's third and most recent set of
>> concerns.
>
> Thanks for updating the patch!
>
>> I think that Heikki is currently taking another look at my work, because
>> he indicated in a message to the list a short time ago that, while
>> reviewing my patch, he realised that there may be an independent problem
>> with silent_mode. I will wait for his remarks before producing another
>> version of the patch that incorporates those two small changes.
>
> Yes, we should wait for the comments from Heikki. But I have some other
> comments;

Here's a WIP patch with some mostly cosmetic changes I've done this far. I haven't tested the Windows code at all yet. It seems that no-one is objecting to removing silent_mode altogether, so I'm going to do that before committing this patch.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a7f5373..155acea 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10165,7 +10165,7 @@ retry:
 			/*
 			 * Wait for more WAL to arrive, or timeout to be reached
 			 */
-			WaitLatch(&XLogCtl->recoveryWakeupLatch, 500L);
+			WaitLatch(&XLogCtl->recoveryWakeupLatch, WL_LATCH_SET | WL_TIMEOUT, 500L);
 			ResetLatch(&XLogCtl->recoveryWakeupLatch);
 		}
 		else
diff --git a/src/backend/port/unix_latch.c b/src/backend/port/unix_latch.c
index 6dae7c9..1a2e141 100644
--- a/src/backend/port/unix_latch.c
+++ b/src/backend/port/unix_latch.c
@@ -93,6 +93,7 @@
 #endif
 
 #include "miscadmin.h"
+#include "postmaster/postmaster.h"
 #include "storage/latch.h"
 #include "storage/shmem.h"
 
@@ -179,31 +180,32 @@ DisownLatch(volatile Latch *latch)
  * Wait for given latch to be set or until timeout is exceeded.
  * If the latch is already set, the function returns immediately.
  *
- * The 'timeout' is given in microseconds, and -1 means wait forever.
- * On some platforms, signals cause the timeout to be restarted, so beware
- * that the function can sleep for several times longer than the specified
- * timeout.
+ * The 'timeout' is given in microseconds. It must be >= 0 if WL_TIMEOUT
+ * event is given, otherwise it is ignored. On some platforms, signals cause
+ * the timeout to be restarted, so beware that the function can sleep for
+ * several times longer than the specified timeout.
  *
  * The latch must be owned by the current process, ie. it must be a
  * backend-local latch initialized with InitLatch, or a shared latch
  * associated with the current process by calling OwnLatch.
  *
- * Returns 'true' if the latch was set, or 'false' if timeout was reached.
+ * Returns bit field indicating which condition(s) caused the wake-up. Note
+ * that if multiple wake-up conditions are true, there is no guarantee that
+ * we return all of them in one call, but we will return at least one.
  */
-bool
-WaitLatch(volatile Latch *latch, long timeout)
+int
+WaitLatch(volatile Latch *latch, int wakeEvents, long timeout)
 {
-	return WaitLatchOrSocket(latch, PGINVALID_SOCKET, false, false, timeout) > 0;
+	return WaitLatchOrSocket(latch, wakeEvents, PGINVALID_SOCKET, timeout);
 }
 
 /*
- * Like WaitLatch, but will also return when there's data available in
- * 'sock' for reading or writing. Returns 0 if timeout was reached,
- * 1 if the latch was set, 2 if the socket became readable or writable.
+ * Like WaitLatch, but with an extra socket argument for WL_SOCKET_*
+ * conditions.
  */
 int
-WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
-				  bool forWrite, long timeout)
+WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
+				  long timeout)
 {
 	struct timeval tv,
 			   *tvp = NULL;
@@ -212,19 +214,26 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 	int			rc;
 	int			result = 0;
 
+	Assert(wakeEvents != 0);
+
+	/* Ignore WL_SOCKET_* events if no valid socket is given */
+	if (sock == PGINVALID_SOCKET)
+		wakeEvents &= ~(WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE);
+
 	if (latch->owner_pid != MyProcPid)
 		elog(ERROR, "cannot wait on a latch owned by another process");
 
 	/* Initialize timeout */
-	if (timeout >= 0)
+	if (wakeEvents & WL_TIMEOUT)
 	{
+		Assert(timeout >= 0);
 		tv.tv_sec = timeout / 1000000L;
 		tv.tv_usec = timeout % 1000000L;
 		tvp = &tv;
 	}
 
 	waiting = true;
-	for (;;)
+	do
 	{
 		int			hifd;
@@ -235,16 +244,28 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 		 * do that), and the select() will return immediately.
 		 */
 		drainSelfPipe();
-		if (latch->is_set)
+		if (latch->is_set && (wakeEvents & WL_LATCH_SET))
 		{
-			result = 1;
+			result |= WL_LATCH_SET;
+
+			/*
+			 * Leave loop immediately, avoid blocking again. We don't attempt
+			 * to report any other events that might also be satisfied.
+			 */
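The calling convention the patch above introduces — the caller passes a set of WL_* event flags, and WaitLatch() returns a bitfield of whichever requested events actually caused the wake-up — can be illustrated with a toy sketch (this is not PostgreSQL code; `simulated_wait` is an invented stand-in, and only the flag arithmetic mirrors the patch):

```python
# Event flags, mirroring the patch's WL_* bits.
WL_LATCH_SET        = 1 << 0
WL_TIMEOUT          = 1 << 1
WL_SOCKET_READABLE  = 1 << 2
WL_SOCKET_WRITEABLE = 1 << 3

def simulated_wait(wake_events: int, occurred: int) -> int:
    """Toy stand-in for WaitLatch: report only the *requested* events
    that occurred; unrequested events never wake the caller."""
    result = occurred & wake_events
    # With the new API, at least one requested event must cause the wake-up.
    assert result != 0, "spurious wake-up"
    return result

# Caller asks for latch-set or timeout; the latch is set and, separately,
# a socket (which the caller did not ask about) happens to be readable.
rc = simulated_wait(WL_LATCH_SET | WL_TIMEOUT,
                    WL_LATCH_SET | WL_SOCKET_READABLE)

assert rc & WL_LATCH_SET               # the latch woke us...
assert not (rc & WL_SOCKET_READABLE)   # ...unrequested events are masked out
```

This is also why the patch masks out WL_SOCKET_* when no valid socket is given: events the caller cannot meaningfully receive are simply removed from the requested set up front.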
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
On 2011-06-29 19:22, Hitoshi Harada wrote:
> Other things are all good points. Thanks for elaborate review!
> More than anything, I'm going to fix the 6) issue, at least to find the
> cause.

Some more questions:

8) why are cheapest start path and cheapest total path in best_inner_subqueryscan the same?

9) as remarked in a different thread, more than one clause could be pushed down a subquery. The current patch only considers alternative paths that have at most one clause pushed down. Is this because of the call site of best_inner_subqueryscan, i.e. under one join rel? Would it be an idea to consider, for each subquery, only a single alternative plan that tries to have all suitable clauses pushed down?

10) I have a hard time imagining use cases that will actually result in an alternative plan, especially since not all subqueries are allowed to have quals pushed down into them, and, like Simon Riggs pointed out, many users will write queries like this with the subqueries pulled up. If it is the case that the subqueries that can't be pulled up have a large overlap with the ones that are not pushdown safe (limit, set operations etc), there might be few actual use cases for this patch.

I think the most important thing for this patch to go forward is to have a few examples, from which it's clear that the patch is beneficial.

regards,

-- 
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data
Re: [HACKERS] Bug in SQL/MED?
I wrote:
> If you invoke any of the SQL/MED CREATE or ALTER commands,
> the validator function is only called if an option list was given.
[...]
> Example: [...]

The example is misleading. Here is a better one:

    CREATE SERVER myoradb FOREIGN DATA WRAPPER oracle_fdw OPTIONS (foo 'bar');
    ERROR:  invalid option "foo"
    HINT:  Valid options in this context are: dbserver, user, password

but:

    CREATE SERVER myoradb FOREIGN DATA WRAPPER oracle_fdw;

gives no error.

Yours,
Laurenz Albe
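The consequence of the reported behaviour can be sketched abstractly (Python pseudo-logic, not PostgreSQL or oracle_fdw code; the valid-option names are borrowed from the example above, and the "required" set is hypothetical): if the validator only runs when an OPTIONS clause is present, a check for *required* options can never fire on a bare CREATE SERVER, so the missing-option error is silently skipped.

```python
# Illustrative option sets; "dbserver" as a required option is an assumption.
VALID_OPTIONS = {"dbserver", "user", "password"}
REQUIRED_OPTIONS = {"dbserver"}

def validate(options: dict) -> None:
    """What a validator would like to enforce, including required options."""
    for opt in options:
        if opt not in VALID_OPTIONS:
            raise ValueError(f'invalid option "{opt}"')
    missing = REQUIRED_OPTIONS - options.keys()
    if missing:
        raise ValueError(f"missing required option(s): {sorted(missing)}")

# A well-formed option list passes.
validate({"dbserver": "dbserver.example.com"})
```

With the behaviour Laurenz describes, the `validate({})` case — exactly the one that should raise "missing required option" — is never reached, because the validator is not invoked at all when no option list is given.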
Re: [HACKERS] Range Types, constructors, and the type system
On Jun30, 2011, at 09:05 , Peter Eisentraut wrote:
> On tor, 2011-06-30 at 08:45 +0200, Florian Pflug wrote:
>> I don't think it will - as it stands, there isn't a single collatable
>> type RANGE but instead one *distinct* type per combination of base
>> type, btree opclass and collation. The reasons for that were discussed
>> at length - the basic argument for doing it that way was to make a
>> range represent a fixed set of values.
>
> How would the system catalogs be initialized under that theory: surely
> you're not going to seed (nr. of types) * (nr. of collations) * (nr. of
> opclasses) range types in initdb?

There's CREATE RANGE. By default, no range types would exist, I believe.

best regards,
Florian Pflug
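Florian's design point — a range type's identity being the full triple of base type, btree opclass, and collation, so each user-created combination is a distinct type rather than one collatable RANGE — can be modeled in a few lines (a sketch only; nothing here is PostgreSQL's actual representation):

```python
from dataclasses import dataclass

# A range type is identified by (subtype, opclass, collation); under the
# design discussed, CREATE RANGE mints one such identity at a time, and
# initdb seeds none of them.
@dataclass(frozen=True)
class RangeType:
    subtype: str
    opclass: str
    collation: str

text_c = RangeType("text", "text_ops", "C")
text_enus = RangeType("text", "text_ops", "en_US")

# Same subtype and opclass, different collation => a *different* type,
# so each range value belongs to a fixed, unambiguous set of values.
assert text_c != text_enus
```

This is why the cross-product seeding Peter worries about never arises: types exist only when explicitly created.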
Re: [HACKERS] Range Types, constructors, and the type system
On tor, 2011-06-30 at 08:45 +0200, Florian Pflug wrote:
> I don't think it will - as it stands, there isn't a single collatable
> type RANGE but instead one *distinct* type per combination of base
> type, btree opclass and collation. The reasons for that were discussed
> at length - the basic argument for doing it that way was to make a
> range represent a fixed set of values.

How would the system catalogs be initialized under that theory: surely you're not going to seed (nr. of types) * (nr. of collations) * (nr. of opclasses) range types in initdb?
Re: [HACKERS] Adding Japanese README
On tor, 2011-06-30 at 09:42 +0900, Tatsuo Ishii wrote:
> Why don't you worry about message translations then? Translation is a
> human process and there's no way to guarantee the translation is
> perfect.

At least for message translations we have a process and sophisticated tools that ensure that the translation has been done, and when, and by whom. If you create additional translation maintenance effort, then if you could integrate that into the existing PO-based system, I'm sure the objections would be less (or different).

And another important point: the PO-based system never shows outdated translations to the user. But if we're adding another step to the release preparation where someone has to chase down half a dozen people to update some file, that creates a lot more burden and bottlenecks, including the possibility of holding up minor releases or having to remove files in minor releases.