Re: [HACKERS] nested transactions
On Thu, 28 Nov 2002 21:46:09 -0500, Tom Lane [EMAIL PROTECTED] wrote: Manfred suggested a separate log file (pg_subclog or some such) but I really don't see any operational advantage to that. You still end up with 4 bytes per transaction, you're just assuming that putting them in a different file makes it better. I don't see how. There are two points: 1) If your site/instance/application/whatever... does not use nested transactions or does use them only occasionally, you don't have to pay the additional I/O cost. 2) If we update a subtransaction's pg_clog bits as soon as the status of the main transaction is known, pg_subtrans is only visited once per subtransaction, while pg_clog has to be looked up once per tuple. Things might look different however, if we wrap every command into a subtransaction... Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Is current_user a function ?
On 28 Nov 2002 11:34:49 -0500 Rod Taylor [EMAIL PROTECTED] wrote: Force the system to use it as a function. select current_user(); On Thu, 28 Nov 2002 17:20:59 -0500 Tom Lane [EMAIL PROTECTED] wrote: As for some current_*** functions, select current_user; seems to work, but select current_user(); doesn't. Complain to the SQL spec authors --- they mandated this peculiar keyword syntax for what is really a function call. Since current_user() could be used in 7.2, I thought it would work in 7.3 too. I now understand that it no longer works -- and neither do session_user(), user(), current_date(), current_time(), current_timestamp(), etc. Thank you, Rod and Tom. Regards, Masaru Sugawara ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Auto Vacuum Daemon (again...)
On Thursday 28 November 2002 23:26, Shridhar Daithankar wrote: On 28 Nov 2002 at 10:45, Tom Lane wrote: Matthew T. O'Connor [EMAIL PROTECTED] writes: interesting thought. I think this boils down to how many knobs we need to put on this system. It might make sense to allow up to X concurrent vacuums; a 4-processor system might handle 4 concurrent vacuums very well. This is almost certainly a bad idea. vacuum is not very processor-intensive, but it is disk-intensive. Multiple vacuums running at once will suck more disk bandwidth than is appropriate for a background operation, no matter how sexy your CPU is. I can't see any reason to allow more than one auto-scheduled vacuum at a time. Hmm.. We would need to take care of that as well.. Not sure what you mean by that, but it sounds like the behaviour of my AVD (having it block until the vacuum command completes) is fine, and perhaps preferable. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Auto Vacuum Daemon (again...)
On 29 Nov 2002 at 7:59, Matthew T. O'Connor wrote: On Thursday 28 November 2002 23:26, Shridhar Daithankar wrote: On 28 Nov 2002 at 10:45, Tom Lane wrote: This is almost certainly a bad idea. vacuum is not very processor-intensive, but it is disk-intensive. Multiple vacuums running at once will suck more disk bandwidth than is appropriate for a background operation, no matter how sexy your CPU is. I can't see any reason to allow more than one auto-scheduled vacuum at a time. Hmm.. We would need to take care of that as well.. Not sure what you mean by that, but it sounds like the behaviour of my AVD (having it block until the vacuum command completes) is fine, and perhaps preferable. Right.. But I will still keep the option open for parallel vacuum, which is most useful for reusing tuples in shared buffers.. And stale updated tuples are what causes performance drop in my experience.. You know.. just enough rope to hang themselves..;-) Bye Shridhar -- Auction: A gyp off the old block. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] nested transactions
On Friday 29 November 2002 00:56, Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: But we already have a recycling mechanism for pg_clog. AFAICS, creating a parallel log file with a separate recycling mechanism is a study in wasted effort. But that recycling requires the vacuum of every database in the system. Do people do that frequently enough? Once the auto vacuum code is in there, they won't have any choice ;-) OK, I know postgres needs to be vacuumed every so often (I think it's to guarantee safe XID wraparound?), and I think the AVD should do something to guarantee this is happening. Since I am working on AVD, what are the criteria for this? From the above I assume it also pertains to pg_clog recycling (which is related to XID wraparound?), but I know nothing about that. Right now AVD only performs vacuum analyze on specific tables as it deems they need it; it does not perform vacuum on entire databases at any point yet. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] nested transactions
Manfred Koizar [EMAIL PROTECTED] writes: 1) If your site/instance/application/whatever... does not use nested transactions or does use them only occasionally, you don't have to pay the additional I/O cost. As I already said to Bruce, designing this facility on the assumption that it will be seldom-used is a recipe for failure. Everybody and his brother wants commands that don't abort the whole transaction. As soon as this facility exists, you can bet that the standard mode of operation will become one subtransaction per interactive command. If you don't design it to support that load, you may as well not bother to build it at all. 2) If we update a subtransaction's pg_clog bits as soon as the status of the main transaction is known, pg_subtrans is only visited once per subtransaction, while pg_clog has to be looked up once per tuple. How you figure that? It seems to me the visit rate is exactly the same, you've just divided it into two files. Having to touch two files instead of one seems if anything worse. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] nested transactions
Matthew T. O'Connor [EMAIL PROTECTED] writes: Right now AVD only performs vacuum analyze on specific tables as it deems they need it, it does not perform vacuum on entire databases at any point yet. See http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/routine-vacuuming.html However I think that only talks about XID wraparound and datfrozenxid. pg_clog recycling is driven off the oldest datvacuumxid in pg_database; the AVD should think about launching a database-wide vacuum whenever age(datvacuumxid) exceeds a million or two. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] nested transactions
Manfred Koizar wrote: One more argument for pg_subtrans being visible to all backends: If an UPDATE is about to change a tuple touched by another active transaction, it waits for the other transaction to commit or abort. We must always wait for the main transaction, not the subtrans. This issue kills the idea that we can get away with providing lookup to the other backends _only_ while we are twiddling the clog bits. Other transactions are going to need to know if the XID they see on the tuple is owned by an active backend. This means we have to provide child/master xid lookup during the transaction, meaning we may as well use pg_clog or separate file, especially if we can get autovacuum for 7.4. It kills the idea that somehow locking would work. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Query performance. 7.2.3 Vs. 7.3
At 09:58 PM 11/28/02 -0500, you wrote: Hm. Are we sure that both versions were built with the same optimization level, etc? (My private bet is that Wade's 7.2 didn't have multibyte or locale support --- but that's a long shot when we don't know the datatypes of the columns being joined on...) Also, the cardinality estimates for the join steps aren't very accurate, right? (estimated: 19 rows, actual: 765 rows) OK, I've updated the link http://arch.wavefire.com/72v73a.txt to include the table schema for those involved in the query. As far as locale support et al., I can tell you that both are built using a straight, out-of-the-box ./configure. Well, it'd be nice to do better --- I was hoping Wade would look into why the row estimates were off so much. I'd love to :). But where to start? Can you point me at a thread where a similar procedure was explained to someone else? -Wade regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Tightening selection of default sort/group operators
I noticed that the system is really pretty shaky about how it chooses the datatype-specific operators to implement sorting and grouping. In the GROUP BY case, for example, the parser looks up an operator named '<' for the column datatype, and then sometime later the executor looks up an operator named '=' for that datatype, and we blithely assume that these operators play together and have the expected semantics. This seems dangerous in a world of user-definable operators. (I think it's already broken by the standard datatype tinterval, in fact, because tinterval's = operator doesn't have the semantics of full equality.) What I'm thinking of doing instead is always looking up the '=' operator by name, and accepting this as actually being equality if it is marked mergejoinable or hashjoinable or has eqsel() as its restriction selectivity estimator (oprrest). If we are looking for a '<' operator to implement sorting/grouping, then we require '=' to be mergejoinable, and we use its lsortop operator (regardless of name). The only standard datatypes for which this would change the behavior are tinterval, path, lseg, and line --- none of which could be sorted/grouped correctly with the available operators, anyhow. User-defined datatypes would stop working as sort/group columns unless the author were careful to mark the equality operator as mergejoinable, but that's a simple addition to the operator definition. Comments, objections? regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] How to compile postgres source code in VC++
You can't, at least not until 7.4. You can compile the interfaces like libpq and the psql binary. See win32.mak for that. --- Prasanna Phadke wrote: Can anybody explain to me how to compile the postgres source code in VC++? -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Aren't lseg_eq and lseg_ne broken?
By chance I just noticed that lseg equality is coded as

    Datum
    lseg_eq(PG_FUNCTION_ARGS)
    {
        LSEG *l1 = PG_GETARG_LSEG_P(0);
        LSEG *l2 = PG_GETARG_LSEG_P(1);

        PG_RETURN_BOOL(FPeq(l1->p[0].x, l2->p[0].x) &&
                       FPeq(l1->p[1].y, l2->p[1].y) &&
                       FPeq(l1->p[0].x, l2->p[0].x) &&
                       FPeq(l1->p[1].y, l2->p[1].y));
    }

Surely this should be

    PG_RETURN_BOOL(FPeq(l1->p[0].x, l2->p[0].x) &&
                   FPeq(l1->p[0].y, l2->p[0].y) &&
                   FPeq(l1->p[1].x, l2->p[1].x) &&
                   FPeq(l1->p[1].y, l2->p[1].y));

since I don't think I like this result:

    regression=# select '[(0, 0), (1, 1)]'::lseg = '[(0, 42), (2, 1)]'::lseg;
     ?column?
    ----------
     t
    (1 row)

lseg_ne has the identical bug. Checking the CVS archives, I see that this error dates back to the original Berkeley code, so I'm a bit hesitant to just change it. Is there any possibility that it really should work this way? regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] 7.4 Wishlist
Hi guys, Just out of interest, if someone was going to pay you to hack on Postgres for 6 months, what would you like to code for 7.4? My ones are: * Compliant ADD COLUMN * Integrated full text indexes * pg_dump dependency ordering What would you guys do? Even if it isn't feasible right now... Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
[HACKERS] eWeek Article
Looks like the eWeek article has been published: http://www.eweek.com/article2/0,3959,732789,00.asp Sorry for sounding like such a dork :) Chris ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] nested transactions
On Thu, 28 Nov 2002 12:59:21 -0500 (EST), Bruce Momjian [EMAIL PROTECTED] wrote: Yes, locking is one possible solution, but no one likes that. One hack lock idea would be to create a subtransaction-only lock, [...] [...] without having to touch the xids in the tuple headers. Yes, you could do that, but we can easily just set the clog bits atomically, From what I read above I don't think we can *easily* set more than one transaction's bits atomically. and it will not be needed --- the tuple bits really don't help us, I think. Yes, this is what I said, or at least tried to say. I just wanted to make clear how this new approach (use the fourth status) differs from older proposals (replace subtransaction ids in tuple headers). OK, we put it in a file. And how do we efficiently clean it up? Remember, it is only to be used for a _brief_ period of time. I think a file system solution is doable if we can figure out a way not to create a file for every xid. I don't want to create one file for every transaction, but rather a huge (sparse) array of parent xids. This array is divided into manageable chunks, represented by files, pg_subtrans_. These files are only created when necessary. At any time only a tiny part of the whole array is kept in shared buffers. This concept is similar or almost equal to pg_clog, which is an array of two-bit status entries. Maybe we write the xids to a file in a special directory in sorted order, and backends can do a btree search of each file in that directory looking for the xid, and then knowing the master xid, look up that status, and once all the children xids are updated, you delete the file. Yes, dense arrays or btrees are other possible implementations. But for simplicity I'd do it pg_clog style. Yes, but again, the xid status of subtransactions is only updated just before commit of the main transaction, so there is little value to having those visible. Having them visible solves the atomicity problem without requiring long locks.
Updating the status of a single (main or sub) transaction is atomic, just like it is now. Here is what is to be done for some operations:

BEGIN main transaction:
    Get a new xid (no change to current behaviour).
    pg_clog[xid] is still 00, meaning active.
    pg_subtrans[xid] is still 0, meaning no parent.

BEGIN subtransaction:
    Push current transaction info onto local stack.
    Get a new xid.
    Record parent xid in pg_subtrans[xid].
    pg_clog[xid] is still 00.

ROLLBACK subtransaction:
    Set pg_clog[xid] to 10 (aborted).
    Optionally set clog bits for subsubtransactions to 10.
    Pop transaction info from stack.

COMMIT subtransaction:
    Set pg_clog[xid] to 11 (committed subtrans).
    Don't touch clog bits for subsubtransactions!
    Pop transaction info from stack.

ROLLBACK main transaction:
    Set pg_clog[xid] to 10 (aborted).
    Optionally set clog bits for subtransactions to 10.

COMMIT main transaction:
    Set pg_clog[xid] to 01 (committed).
    Optionally set clog bits for subtransactions from 11 to 01.
    Don't touch clog bits for aborted subtransactions!

Visibility check by other transactions:
    If a tuple is visited and its XMIN/XMAX_IS_COMMITTED/ABORTED flags
    are not yet set, pg_clog has to be consulted to find out the status
    of the inserting/deleting transaction xid.  If pg_clog[xid] is ...
    00: transaction still active
    10: aborted
    01: committed
    11: committed subtransaction, have to check parent
    Only in this last case do we have to get parentxid from pg_subtrans.
    Now we look at pg_clog[parentxid].  If we find ...
    00: parent still active, so xid is considered active, too
    10: parent aborted, so xid is considered aborted; optionally set
        pg_clog[xid] = 10
    01: parent committed, so xid is considered committed; optionally set
        pg_clog[xid] = 01
    11: recursively check grandparent(s) ...

For brevity the following operations are not covered in detail:
. Visibility checks for tuples inserted/deleted by a (sub)transaction
  belonging to the current transaction tree (have to check the local
  transaction stack whenever we look at a xid or switch to a parent xid)
. HeapTupleSatisfiesUpdate (sometimes has to wait for the parent transaction)

The trick here is that subtransaction status is immediately updated in pg_clog on commit/abort. Main transaction commit is atomic (just set its commit bit). Status 11 is short-lived; it is replaced with the final status by one or more of
- COMMIT/ROLLBACK of the main transaction
- a later visibility check (as a side effect)
- VACUUM

pg_subtrans cleanup: A pg_subtrans_ file covers a known range of transaction ids. As soon as none of these transactions has a pg_clog status of 11, the pg_subtrans_ file can be removed. VACUUM can do this, and it won't even have to
Re: [HACKERS] Aren't lseg_eq and lseg_ne broken?
Tom Lane wrote:

    By chance I just noticed that lseg equality is coded as

    Datum
    lseg_eq(PG_FUNCTION_ARGS)
    {
        LSEG *l1 = PG_GETARG_LSEG_P(0);
        LSEG *l2 = PG_GETARG_LSEG_P(1);

        PG_RETURN_BOOL(FPeq(l1->p[0].x, l2->p[0].x) &&
                       FPeq(l1->p[1].y, l2->p[1].y) &&
                       FPeq(l1->p[0].x, l2->p[0].x) &&
                       FPeq(l1->p[1].y, l2->p[1].y));
    }

    Surely this should be

    PG_RETURN_BOOL(FPeq(l1->p[0].x, l2->p[0].x) &&
                   FPeq(l1->p[0].y, l2->p[0].y) &&
                   FPeq(l1->p[1].x, l2->p[1].x) &&
                   FPeq(l1->p[1].y, l2->p[1].y));

Yep, there could be no possible reason to double-test something like the original code does. It must be wrong. -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Postgres 7.3 announcement on postgresql.org
There isn't one! Chris ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] nested transactions
Manfred Koizar [EMAIL PROTECTED] writes: Visibility check by other transactions: If a tuple is visited and its XMIN/XMAX_IS_COMMITTED/ABORTED flags are not yet set, pg_clog has to be consulted to find out the status of the inserting/deleting transaction xid. If pg_clog[xid] is ... 00: transaction still active 10: aborted 01: committed 11: committed subtransaction, have to check parent Only in this last case do we have to get parentxid from pg_subtrans. Unfortunately this discussion is wrong. User-level visibility checks will usually have to fetch the parentxid in case 01 as well, because even if the parent is committed, it might not be visible in our snapshot. Snapshots will record only topmost-parent XIDs (because that's what we can find in the PG_PROC array, and anything else would create atomicity problems anyway). So we must chase to the topmost parent before testing visibility. This means that the parentxid will need to be fetched in enough cases that it's quite dubious that pushing it to a different file saves I/O. Also, using a 11 state doubles the amount of pg_clog I/O needed to commit a collection of subtransactions. You have to write 11 as the state of each commitable subtransaction, then commit the parent (write 01 as its state), then go back and change the state of each subtransaction to 01. (Whether this last bit is done as part of parent transaction commit, or during later inspections of the state of the subtransaction, doesn't change the argument.) I think it would be preferable to use only three states: active, aborted, committed. The parent commit protocol is (1) write 10 as state of each aborted subtransaction (this should be done as soon as the subtransaction is known aborted, rather than delaying to parent commit); (2) write 01 as state of parent (this is the atomic commit); (3) write 01 as state of each committed subtransaction. 
Readers who see 00 must check the parent state; if the parent is committed then they have to go back and recheck the child state (to see if it became aborted after they looked). This halves the write traffic during a commit, at the cost of additional read traffic when subtransaction state is checked in a narrow window after the time of parent transaction commit. I believe it nets out to be faster. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] nested transactions
On Fri, 29 Nov 2002 13:33:28 -0500, Tom Lane [EMAIL PROTECTED] wrote: Unfortunately this discussion is wrong. User-level visibility checks will usually have to fetch the parentxid in case 01 as well, because even if the parent is committed, it might not be visible in our snapshot. Or we don't allow a subtransaction's status to be updated from 11 to 01 until we know that the main transaction is visible to all active transactions. Didn't check whether this is expensive to find out. At least it should be doable by VACUUM. Snapshots will record only topmost-parent XIDs (because that's what we can find in the PG_PROC array, and anything else would create atomicity problems anyway). So we must chase to the topmost parent before testing visibility. BTW, I think this *forces* us to replace the sub xid with the respective main xid in a tuple header when we set XMIN/MAX_IS_COMMITTED. Otherwise we'd have to look for the main xid whenever a tuple is touched. Also, using a 11 state doubles the amount of pg_clog I/O needed to commit a collection of subtransactions. Is a pg_clog page written out to disk each time a bit is changed? I'd expect some locality. I think it would be preferable to use only three states: active, aborted, committed. The parent commit protocol is (1) write 10 as state of each aborted subtransaction (this should be done as soon as the subtransaction is known aborted, rather than delaying to parent commit); (2) write 01 as state of parent (this is the atomic commit); (3) write 01 as state of each committed subtransaction. Readers who see 00 must check the parent state; if the parent is committed then they have to go back and recheck the child state (to see if it became aborted after they looked). Nice idea! This saves the fourth status for future uses (for example, Firebird uses it for two-phase commit). OTOH, for reasons you mentioned above, there's no chance to save parent xid lookups if we go this way.
This halves the write traffic during a commit, at the cost of additional read traffic when subtransaction state is checked in a narrow window after the time of parent transaction commit. I believe it nets out to be faster. Maybe. The whole point of my approach is: If we can limit the active range of transactions requiring parent xid lookups to a small fraction of the range needing pg_clog lookups, then it makes sense to store status bits and parent xids in different files. Otherwise keeping them together in one file clearly is faster. Servus Manfred ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] nested transactions
Manfred Koizar wrote: On Fri, 29 Nov 2002 13:33:28 -0500, Tom Lane [EMAIL PROTECTED] wrote: Unfortunately this discussion is wrong. User-level visibility checks will usually have to fetch the parentxid in case 01 as well, because even if the parent is committed, it might not be visible in our snapshot. Or we don't allow a subtransaction's status to be updated from 11 to 01 until we know that the main transaction is visible to all active transactions. Didn't check whether this is expensive to find out. At least it should be doable by VACUUM. Snapshots will record only topmost-parent XIDs (because that's what we can find in the PG_PROC array, and anything else would create atomicity problems anyway). So we must chase to the topmost parent before testing visibility. BTW, I think this *forces* us to replace the sub xid with the respective main xid in a tuple header when we set XMIN/MAX_IS_COMMITTED. Otherwise we'd have to look for the main xid whenever a tuple is touched. Sorry, I don't follow this. As far as I know, we will set the subxid on the tuple so we can independently mark the xact as aborted without revisiting all the tuples. Once it is committed/rolled back, I see no need to look up the parent, and in fact we could clear the clog parent xid offset so there is no way to access the parent anymore. -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Thinking about IN/EXISTS optimization
Tom, I'm just curious, will your proposed in/exists optimizations help for queries like:

db=# explain delete from dns_expired_domains where domain_id in (select domain_id from dns_expired_domains group by domain_id having count(*)=14 );
NOTICE:  QUERY PLAN:

Seq Scan on dns_expired_domains  (cost=0.00..55448724329.92 rows=324754 width=6)
  SubPlan
    ->  Materialize  (cost=85370.33..85370.33 rows=64951 width=4)
          ->  Aggregate  (cost=82122.79..85370.33 rows=64951 width=4)
                ->  Group  (cost=82122.79..83746.56 rows=649508 width=4)
                      ->  Sort  (cost=82122.79..82122.79 rows=649508 width=4)
                            ->  Seq Scan on dns_expired_domains  (cost=0.00..10316.08 rows=649508 width=4)

EXPLAIN

I usually end up having to make a little script that runs the subquery, splits the domain_ids up into chunks of 1000 or so, then executes several queries similar to: delete from dns_expired_domains where domain_id in (1,2,3,4,5,6,7,8,9,10...) This method seems to work fairly well and executes in a reasonable amount of time, unlike the original query with an estimated cost of 55,448,724,329.92. I attempted to use EXISTS in the same query, but it seemed it wanted to delete all the rows in the table; I wasn't able to get it to delete only the ones that occurred 14 times in the table. I may have overlooked something, though. In any case, it would definitely be nice if a query like this worked efficiently. Thanks, and congrats to all the people involved with the 7.3 release; all your hard work is greatly appreciated. On Tue, 2002-10-22 at 16:18, Tom Lane wrote: I've been thinking about how to convert x IN (subselect) and EXISTS constructs into join-like processing, and I've run into a small problem in getting the planner to do it nicely. The issue is that I need to take the subselect and push it into the jointree -- essentially, make it look like a subselect-in-FROM -- so that the join planner can deal with it. Basically, I need to rearrange SELECT ... FROM ... WHERE ... AND x IN (SELECT y FROM ...) into SELECT ...
FROM ..., (SELECT y FROM ...) ss WHERE ... AND x =* ss.y where =* represents some specially-marked RestrictInfo node. (NOT IN is the same except that the RestrictInfo node will be marked differently.) The difficulty is that there's no good place to do this in subquery_planner(). We should push the subselect into FROM before we run the pull_up_subqueries() and preprocess_jointree() operations; if we don't pull up the subselect into the main query then we won't have accomplished very much. But the WHERE clause isn't simplified into a form that makes it easy to spot top-level IN() expressions until after that. We can't simply switch the order of the subselect and WHERE-clause processing, because pulling up subqueries typically adds conditions to the WHERE clause. I haven't been able to think of a solution to this that doesn't involve wasting a lot of cycles by repeating some of these processing steps, or missing some optimization possibilities. (For example, if we pull up a subquery that came from a view, it might contain an IN where-clause, which ideally we'd want to be able to optimize. It almost seems like we need to be able to loop around the whole operation; but most of the time this will just waste cycles.) Anyone see a nice way to do this? regards, tom lane --(end of broadcast)-- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- Best Regards, Mike Benoit NetNation Communication Inc. Systems Engineer Tel: 604-684-6892 or 888-983-6600 -- Disclaimer: Opinions expressed here are my own and not necessarily those of my employer ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Christopher Kings-Lynne wrote: There isn't one! Chris No kidding. I propose changes to the following pages:

http://www14.us.postgresql.org/: Prominent mention of the latest stable and unstable releases (just stable now, of course). But since this seems to be kind of a repeat of the news page, maybe an announcement also.

http://www14.us.postgresql.org/news.html: Announcement of each of the betas, release candidates, and the final release now.

http://developer.postgresql.org/index.php: Mention that 7.3 has been released, where it says it has entered release candidate phase 2 now.

http://developer.postgresql.org/beta.php: page marked as obsolete until the 7.4 cycle begins. (Instead of just broken, as it is now.)

http://www.postgresql.org/idocs/: Announcement that it will be switching over to the 7.3 docs soon, and later switching, like was done with 7.1-7.2 IIRC.

Is the website in CVS somewhere? I might put together a patch if so... I mentioned this around RC1 time and got no response. Really, I think that people would have tested it more if they had known about it. And the web page is the first place I go to find out about a piece of software. The mailing lists are not enough IMNSHO. Thanks for a great piece of software... but please tell people about it! Scott ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Thinking about IN/EXISTS optimization
Mike Benoit [EMAIL PROTECTED] writes: I'm just curious, will your proposed in/exists optimizations help for queries like: db=# explain delete from dns_expired_domains where domain_id in (select domain_id from dns_expired_domains group by domain_id having count(*)=14 ); Probably, but I'm more than a tad curious about why you're concerned about the efficiency of this particular example. Why would count=14 be an interesting condition for deleting groups?

Seq Scan on dns_expired_domains  (cost=0.00..55448724329.92 rows=324754 width=6)
  SubPlan
    ->  Materialize  (cost=85370.33..85370.33 rows=64951 width=4)
          ->  Aggregate  (cost=82122.79..85370.33 rows=64951 width=4)
                ->  Group  (cost=82122.79..83746.56 rows=649508 width=4)
                      ->  Sort  (cost=82122.79..82122.79 rows=649508 width=4)
                            ->  Seq Scan on dns_expired_domains  (cost=0.00..10316.08 rows=649508 width=4)

What are the *actual*, not estimated, row counts here --- ie, how many rows in the table, and how many distinct domain_ids are you typically deleting? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] nested transactions
Manfred Koizar [EMAIL PROTECTED] writes: Maybe. The whole point of my approach is: If we can limit the active range of transactions requiring parent xid lookups to a small fraction of the range needing pg_clog lookups, then it makes sense to store status bits and parent xids in different files. Otherwise keeping them together in one file clearly is faster. Hmm ... I'm not sure that that's possible. But wait a moment. The child xid is by definition always greater than (newer than) its parent. So if we consult pg_clog and find the transaction marked committed, *and* the xid is before the window of XIDs in our snapshot, then even if it's not a top-level xid, the parent must be before our window too. Therefore we can conclude the transaction is visible in our snapshot. So indeed there is a good-size range of xids for which we'll never need to chase the parent link: everything before the RecentGlobalXmin computed by GetSnapshotData. (We do have to set subtransactions to committed during parent commit to make this true; we can't update them lazily. But I think that's okay.) Maybe you're right --- we could probably truncate pg_subtrans faster than pg_clog, and we could definitely expect to keep less of it in memory than pg_clog. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
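The shortcut Tom describes — a child xid is always greater than its parent's, so a committed xid below RecentGlobalXmin never needs a parent lookup — can be sketched in miniature. This is a hypothetical toy model, not the backend API: `clog_status[]` stands in for pg_clog and `subtrans_parent[]` for the proposed pg_subtrans (0 meaning top-level).

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Xid;

enum { XSTAT_IN_PROGRESS, XSTAT_COMMITTED, XSTAT_ABORTED };

#define MAX_XID 100
static int clog_status[MAX_XID];     /* toy pg_clog: commit status per xid */
static Xid subtrans_parent[MAX_XID]; /* toy pg_subtrans: parent links, 0 = top level */

/* Is this committed xid visible to a snapshot whose oldest running
 * xid is recent_global_xmin? */
static bool
xid_committed_visible(Xid xid, Xid recent_global_xmin)
{
    if (clog_status[xid] != XSTAT_COMMITTED)
        return false;

    /* Children are always newer than their parents, so if a committed
     * xid precedes RecentGlobalXmin its whole parent chain must too:
     * no pg_subtrans lookup is needed at all. */
    if (xid < recent_global_xmin)
        return true;

    /* Inside the snapshot window we may still have to chase parents. */
    while (subtrans_parent[xid] != 0)
    {
        xid = subtrans_parent[xid];
        if (clog_status[xid] != XSTAT_COMMITTED)
            return false;
    }
    return true;
}
```

The point of the sketch is that only xids newer than RecentGlobalXmin ever reach the parent-chasing loop, which is why pg_subtrans can stay much smaller (and be truncated much sooner) than pg_clog.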
Re: [HACKERS] nested transactions
I am concerned this is getting beyond my capabilities for 7.4 --- anyone want to help? --- Tom Lane wrote: Manfred Koizar [EMAIL PROTECTED] writes: Maybe. The whole point of my approach is: If we can limit the active range of transactions requiring parent xid lookups to a small fraction of the range needing pg_clog lookups, then it makes sense to store status bits and parent xids in different files. Otherwise keeping them together in one file clearly is faster. Hmm ... I'm not sure that that's possible. But wait a moment. The child xid is by definition always greater than (newer than) its parent. So if we consult pg_clog and find the transaction marked committed, *and* the xid is before the window of XIDs in our snapshot, then even if it's not a top-level xid, the parent must be before our window too. Therefore we can conclude the transaction is visible in our snapshot. So indeed there is a good-size range of xids for which we'll never need to chase the parent link: everything before the RecentGlobalXmin computed by GetSnapshotData. (We do have to set subtransactions to committed during parent commit to make this true; we can't update them lazily. But I think that's okay.) Maybe you're right --- we could probably truncate pg_subtrans faster than pg_clog, and we could definitely expect to keep less of it in memory than pg_clog. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Thinking about IN/EXISTS optimization
On Fri, 2002-11-29 at 13:22, Tom Lane wrote: Mike Benoit [EMAIL PROTECTED] writes: I'm just curious, will your proposed in/exists optimizations help for queries like: db=# explain delete from dns_expired_domains where domain_id in (select domain_id from dns_expired_domains group by domain_id having count(*)=14 ); Probably, but I'm more than a tad curious about why you're concerned about the efficiency of this particular example. Why would count=14 be an interesting condition for deleting groups? The count=14 isn't really that significant; basically I'm just looking for faster execution of queries like: (delete|select) from table where id in (select id from large_table2) for cases where EXISTS won't work properly, and large_table2 has more than ~50,000 rows.
Seq Scan on dns_expired_domains  (cost=0.00..55448724329.92 rows=324754 width=6)
  SubPlan
    ->  Materialize  (cost=85370.33..85370.33 rows=64951 width=4)
          ->  Aggregate  (cost=82122.79..85370.33 rows=64951 width=4)
                ->  Group  (cost=82122.79..83746.56 rows=649508 width=4)
                      ->  Sort  (cost=82122.79..82122.79 rows=649508 width=4)
                            ->  Seq Scan on dns_expired_domains  (cost=0.00..10316.08 rows=649508 width=4)
What are the *actual*, not estimated, row counts here --- ie, how many rows in the table, and how many distinct domain_ids are you typically deleting? 650,000 actual rows in the table. 40,000 or so are returned by the subquery. About 500,000 rows should end up being deleted. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html -- Best Regards, Mike Benoit NetNation Communication Inc. Systems Engineer Tel: 604-684-6892 or 888-983-6600 --- Disclaimer: Opinions expressed here are my own and not necessarily those of my employer ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
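The numbers in that plan (650,000 outer rows, each testing membership against a 40,000-row materialized subplan) are exactly the pattern that hashing the subquery result fixes: one build pass over the subquery output, then a constant-time probe per outer row, instead of a rescan per row. A toy, self-contained illustration of that trade-off — not PostgreSQL code; all names here are invented:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBUCKETS 8192
#define POOLSIZE 100000

typedef struct HashEntry
{
    int id;
    struct HashEntry *next;
} HashEntry;

static HashEntry *buckets[NBUCKETS];
static HashEntry pool[POOLSIZE];
static int pool_used;

static void
hash_insert(int id)
{
    unsigned h = (unsigned) id % NBUCKETS;
    HashEntry *e = &pool[pool_used++];

    e->id = id;
    e->next = buckets[h];
    buckets[h] = e;
}

static bool
hash_probe(int id)
{
    for (HashEntry *e = buckets[(unsigned) id % NBUCKETS]; e; e = e->next)
        if (e->id == id)
            return true;
    return false;
}

/* Count how many outer rows satisfy "id IN (subquery result)":
 * hash the subquery ids once, then probe once per outer row,
 * giving O(n + m) work instead of O(n * m). */
static int
count_in_matches(const int *outer_ids, int n, const int *sub_ids, int m)
{
    int hits = 0;

    memset(buckets, 0, sizeof(buckets));
    pool_used = 0;
    for (int i = 0; i < m; i++)  /* one pass over the subquery result */
        hash_insert(sub_ids[i]);
    for (int i = 0; i < n; i++)  /* one cheap probe per outer row */
        if (hash_probe(outer_ids[i]))
            hits++;
    return hits;
}
```

With n = 650,000 and m = 40,000 the difference between n probes and n rescans of m rows is what separates the 55-billion-unit cost estimate above from something tractable.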
Re: [HACKERS] 7.4 Wishlist
Christopher Kings-Lynne wrote: Hi guys, Just out of interest, if someone was going to pay you to hack on Postgres for 6 months, what would you like to code for 7.4? Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Well, jokes apart, I think this is one of the most needed features to me. Currently I'm using strange voodoo to replicate some tables on other machines in order to spread load and resiliency. Compared to what I am doing now, a good master-to-slave replication would be heaven. I understand that good replication is painful, but in my experience, if you start by integrating some rude, experimental implementation into mainstream PostgreSQL, the rest will come by itself. For example, RI was something I wouldn't consider production level in 7.2, but it was a start; now in 7.3 it is much, much better, probably complete in the most important parts. Other wishes (not as important as the replication issue) are: - Better granularity of security and access control, like in mysql. - Ability to reset the state of an open backend, including aborting open transactions, to allow for better connection pooling and reusing, maybe giving the client the ability to switch between users... Bye! -- Daniele Orlandi Planet Srl ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] 7.4 Wishlist
Christopher Kings-Lynne kirjutas R, 29.11.2002 kell 23:51: Hi guys, Just out of interest, if someone was going to pay you to hack on Postgres for 6 months, what would you like to code for 7.4? My ones are: * Compliant ADD COLUMN * Integrated full text indexes * pg_dump dependency ordering What would you guys do? Even if it isn't feasible right now... As I don't have a permanent job starting next year (my main employer went bust), I'm planning to do more on postgreSQL anyway (at least until I run out of money ;) I have done some (or sometimes a lot of) brain-twitching on items in the following list, but very little actual useful coding on most. My personal todo list is:
Application server support
* better XML integration
  - XML(*) aggregate function returning XML representation of subquery
  - XML input/output to/from tables
  - XML searchable/indexable in fields
* Overhaul of OO features (moving closer to SQL99)
  - type/table inheritance: table inheritance would be done using SQL99's UNDER and would be single inheritance, stored in a single logical table, possibly subdivided in physical tables reusing our current huge-table 1GB split mechanisms; type inheritance would be done using SQL99's LIKE, would be multiple inheritance, and would reuse as much as possible the current code for ADD/DROP/RENAME column
  - check constraints would apply to both type and table inheritance
  - pk/fk constraints would apply only to table inheritance
  - types as base of tables
  - study feasibility of reference types
  - dynamic invocation of table functions on queries over hierarchies
* WITH (as part of query/view)
* WITH RECURSIVE for recursive queries
* better NOTIFY (with optional argument, using shared memory instead of tables)
General stuff
* making array types btree-indexable in a general way
* study feasibility of using SQL99's ARRAY syntax for arrays
Data warehousing
* bitmap indexes
  - using bitmap indexes internally for star joins
  - real bitmap indexes
  - clustered multiple bitmap indexes, especially clustering on groups of bitmap indexes
* clustering in general
  - specifying pages to be filled only to a certain percentage in clustered tables, so that updated tuples can be placed near the original ones if needed, and parallel vacuum can then reclaim the space and keep the table clustered with less shuffling
* OLAP features
  - WINDOW clause, PARTITION BY
  - GROUPING SETS, ROLLUP, CUBE, ()
WAL-based master-slave replication
* if someone is not doing it (which I hope is not true ;)
UNICODE / Localization
* UTEXT, UCHAR, UVARCHAR types using IBM's ICU, stored in UTF-16 or SCSU
* fast LIKE, ILIKE, REGEX code for UTF-16, possibly lifted from python2
* field-level localization, again using ICU
FE/BE protocol -- all can be worked on independently
* try to find a better wire protocol from existing ones (the X Window System's seems simple enough; perhaps DB2's DRDA), or fix the existing one for high performance (mainly make sure that chunks as big as possible have a preceding length), and make it easy to send out-of-band/optional data (notifications, info on actual query performance (so one can visualize it for the user), ...)
* standardize a wire-level binary protocol for field types (currently whatever is stored is sent)
* work on making python use this protocol and port some postgres datatypes (initially timestamp/date/time and varbit) to python
Really Dark Arts
* making backend internals available to a scripting language (for me it means python ;) for making more parts (especially planner/optimizer) more easily hackable
* using the stuff from previous point ;)
And that's all ;) Hannu Krosing ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 7.4 Wishlist
pg_dump — our upgrade process is painful enough, having to do a dump and reload. I think we should be able to guarantee (or at least get much closer to it) that the process works in all cases. Personally, pg_upgrade would be even nicer. - Original Message - From: Christopher Kings-Lynne [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Friday, November 29, 2002 1:51 PM Subject: [HACKERS] 7.4 Wishlist Hi guys, Just out of interest, if someone was going to pay you to hack on Postgres for 6 months, what would you like to code for 7.4? My ones are: * Compliant ADD COLUMN * Integrated full text indexes * pg_dump dependency ordering What would you guys do? Even if it isn't feasible right now... Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 7.4 Wishlist
At 10:51 AM 29/11/2002 -0800, Christopher Kings-Lynne wrote: * pg_dump dependency ordering I've actually started working on pg_dump in the background, but if you want to do it let me know. In terms of things I would like to see: - background/integrated vacuum (not just an overwriting storage manager) - insert/update...returning - function result caches -- assuming I can demonstrate that they are a Good Thing. - COPY TO/FROM with a list of columns (maybe we have it?) - it's useful for making metadata changes then reloading data (Inserts are much slower). Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 03 5330 3172 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 7.4 Wishlist
At 04:56 PM 29/11/2002 -0800, Christopher Kings-Lynne wrote: - COPY TO/FROM with a list of columns (maybe we have it?) - it's useful for making metadata changes then reloading data (Inserts are much slower). We do already have it in 7.3: Excellent. Then I just need to add support in pg_dump. Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 03 5330 3172 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 7.4 Wishlist
Wow Hannu - your list puts mine to shame! Application server support * better XML integration - XML(*) aggregate function returning XML representation of subquery - XML input/output to/from tables - XML searchable/indexable in fields) I've had thoughts about XML too. Since XML is hierarchical, imagine being able to index xml using contrib/ltree or something! ie. We create a new 'xml' column type. We create a new indexing scheme for it based on ltree gist. You index the xml column. Then you can do sort of XPath queries: SELECT * FROM requests WHERE xml_xpath('/request/owner/name', datafield) = 'Bob'; And it would be indexed. Imagine being able to pull up all XML documents that had certain properties, etc. MS-SQL has a SELECT ... FOR XML clause, but we could always just create a function called xml_select() or something now that we can return recordsets. Chris ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Christopher Kings-Lynne writes: There isn't one! Has there been a release? It certainly hasn't been announced in the usual places that I monitor. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Locale-dependent case conversion in {identifier}
Comment in {identifier} section in src/backend/parser/scan.l states: [...] * Note: here we use a locale-dependent case conversion, * which seems appropriate under SQL99 rules, whereas * the keyword comparison was NOT locale-dependent. */ And in ScanKeywordLookup() in src/backend/parser/keywords.c: /* * Apply an ASCII-only downcasing. We must not use tolower() since it * may produce the wrong translation in some locales (eg, Turkish), * and we don't trust isupper() very much either. In an ASCII-based * encoding the tests against A and Z are sufficient, but we also * check isupper() so that we will work correctly under EBCDIC. The * actual case conversion step should work for either ASCII or EBCDIC. */ And I happen to have bad luck to use PostgreSQL with Turkish locale. And, as you may know, our "I" is not your "I": pgsql=# create table a(x char(1)); CREATE TABLE pgsql=# grant SELECT ON a to PUBLIC; ERROR: user public does not exist pgsql=# Oracle, the second best database I have, does seem to convert relation names in locale-dependent fashion: SQL alter session set NLS_LANGUAGE='TURKISH'; Session altered. SQL create table a(x char(1)); Table created. SQL grant select on a to PUBLIC; Grant succeeded. Further, if I try to create a table in oracle using Turkish-specific characters, it creates it alright, without trying to make them upper-case as it usually does. So I have changed the lower-case conversion code in scan.l to make it purely ASCII-based, as in keywords.c. A mini-patch is given below. Please bear in mind that it is my first attempt at hacking PostgreSQL code, so there can be some mistakes.
Regards, Nick
diff -Nur src/backend/parser/scan.l.orig src/backend/parser/scan.l
--- src/backend/parser/scan.l.orig	Sat Nov 30 02:54:06 2002
+++ src/backend/parser/scan.l	Sat Nov 30 02:57:45 2002
@@ -551,9 +551,12 @@
 			ident = pstrdup(yytext);
 			for (i = 0; ident[i]; i++)
 			{
-				if (isupper((unsigned char) ident[i]))
-					ident[i] = tolower((unsigned char) ident[i]);
+				char	ch = ident[i];
+				if (ch >= 'A' && ch <= 'Z' && isupper((unsigned char) ch))
+					ch += 'a' - 'A';
+				ident[i] = ch;
 			}
+			ident[i] = '\0';
 			if (i >= NAMEDATALEN)
 			{
 				int		len;
---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
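The conversion Nick's patch switches to can be exercised standalone. This is a hypothetical extraction for illustration (the function name is invented); it assumes an ASCII-based encoding and therefore drops the extra isupper() call that the real code keeps for EBCDIC. The point is that the range test never touches locale tables, so Turkish 'I'/dotless-i rules cannot interfere:

```c
#include <assert.h>
#include <string.h>

/* ASCII-only downcasing, as in ScanKeywordLookup(): 'A'..'Z' map to
 * 'a'..'z' by fixed offset; everything else, including any
 * locale-specific letters, passes through unchanged. */
static void
downcase_identifier(char *ident)
{
    for (int i = 0; ident[i]; i++)
    {
        char ch = ident[i];

        if (ch >= 'A' && ch <= 'Z')  /* pure range test, no tolower() */
            ch += 'a' - 'A';
        ident[i] = ch;
    }
}
```

Under a Turkish locale, tolower('I') yields dotless i, which is how "PUBLIC" failed to match "public" in the psql session above; the range-based version always yields plain 'i'.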
Re: [HACKERS] 7.4 Wishlist
At 05:33 PM 29/11/2002 -0800, Christopher Kings-Lynne wrote: Hmmm. I could have sworn that someone (Neil?) already did that? Not AFAICT - at least based on looking at the manual. I'll check the code. Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 03 5330 3172 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
On Fri, 29 Nov 2002, Christopher Kings-Lynne wrote: Glad you liked it. But that doesn't change the fact that it obscured the release to the point that many people didn't even know it was released. I found out by folks complaining about broken links. Hrm - the subject said it all, plus what about the first 2 paragraphs? PostgreSQL Global Development Group Announces Version 7.3 The PostgreSQL Global Development Group proudly announces the release of version 7.3 of the PostgreSQL object-relational database management system (ORDBMS). PostgreSQL, the world's most advanced open source database, provides solutions for many of the most demanding applications in use today, saving businesses and governments millions of dollars each year. Maybe the user comments can be moved until after the 7.3 feature list? First things first. In pine, the announcement looked like this in the index: 17096 Nov 28 PostgreSQL Public (6733) [GENERAL] PostgreSQL Global Developm I see nothing about 7.3 there. When skimming the mailbox, that's what I see. I didn't see the actual message until AFTER went looking for it. When I did find it, this is what I saw when I opened it: - For Immediate Release November 28th, 2002 Contacts: Justin Clift [EMAIL PROTECTED] +61.3 9363 1313 (Australia) Marc Fournier [EMAIL PROTECTED] +1.902 542 0713 (Canada) - Yep, it's gotta be the best one yet. Vince. -- http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
- Original Message - From: Peter Eisentraut [EMAIL PROTECTED] To: Christopher Kings-Lynne [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Friday, November 29, 2002 4:16 PM Subject: Re: [HACKERS] Postgres 7.3 announcement on postgresql.org Christopher Kings-Lynne writes: There isn't one! Has there been a release? It certainly hasn't been announced in the usual places that I monitor. It has been announced on the announce mailing list and in the eweek article... It's downloadable from the advocacy site... Chris ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Tightening selection of default sort/group operators
Tom Lane writes: What I'm thinking of doing instead is always looking up the = operator by name, and accepting this as actually being equality if it is marked mergejoinable or hashjoinable or has eqsel() as its restriction selectivity estimator (oprrest). If we are looking for a '<' operator to implement sorting/grouping, then we require '=' to be mergejoinable, and we use its lsortop operator (regardless of name). My first thought is that this seems to be an awfully backwards way to define operator semantic metadata. I think we either have to flag operators explicitly (this is the less-than operator), or we just require that operators named '<' and '=' have certain semantics. I could be happy with both. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Christopher Kings-Lynne writes: It has been announced on the announce mailing list and in the eweek article... It's downloadable from the advocacy site... Yeah, that'll let the world know. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
On 29 Nov 2002, Ryan Mahoney wrote: Scroll down and read the rest of it; it's an excellent announcement! If the website and mirrors made mention of the release (as of 8:19PM CST they don't!) and the message was sent to all the mailing lists, there would probably be less confusion. When I read the announcement I was very impressed and went straight to www.postgresql.org. Once I got there, I wondered if maybe the announcement had been sent by accident! When I'm looking for content in a message, if I don't find it in the first few lines or even on the first page, I move on. But like I said, had you really bothered to read it, it did not resemble a traditional release announcement. I don't intend to debate this any further. I've just about filtered the junk out of the announcement and should have it on the website in a few mins. Vince. -- http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Peter Eisentraut [EMAIL PROTECTED] writes: Has there been a release? It certainly hasn't been announced in the usual places that I monitor. Marc claimed he'd put out the announcement on pgsql-announce, but that copy of the message never arrived here (it did show up on pgsql-general though). Evidently you and Vince never got it either ... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Peter Eisentraut [EMAIL PROTECTED] writes: Has there been a release? It certainly hasn't been announced in the usual places that I monitor. Marc claimed he'd put out the announcement on pgsql-announce, but that copy of the message never arrived here (it did show up on pgsql-general though). Evidently you and Vince never got it either ... You're right - I only got it thru -general. Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Tightening selection of default sort/group operators
Peter Eisentraut [EMAIL PROTECTED] writes: My first thought is that this seems to be an awfully backwards way to define operator semantic metadata. Why? The property we are interested in is that two operators '<' and '=' will work for grouping --- ie, if you order by '<' and then combine adjacent values for which '=' succeeds, you will get sane results. A link between the two pg_operator entries seems a perfectly sensible way to represent that. The problem I've got is that the code doesn't (or didn't, till this afternoon) make use of the available information. I think we either have to flag operators explicitly (this is the less-than operator), or we just require that operators named '<' and '=' have certain semantics. I could be happy with both. I'm not totally thrilled with assuming that '=' is the name of the equality operator. It would be cleaner, probably, to add a column to pg_type to point to the datatype's equality operator. However, doing that would pretty much break every existing user-defined type (since they'd not know they need to specify this additional info) and there are some circularity problems as well (operator won't exist yet when you do CREATE TYPE). Given those problems, I'm willing to stick with the existing assumption that '=' names an equality operator for grouping. The main point of this change is to avoid getting burnt by using unrelated '=' and '<' operators in a context where they need to play together. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
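The order-by-'<'-then-merge-adjacent-'=' contract Tom describes is easy to see in a toy sketch. Plain ints stand in for a datatype, with the C comparisons playing the roles of the datatype's sort and equality operators; the grouping only yields sane results because these two operators agree with each other, which is exactly the pairing the pg_operator link is meant to guarantee:

```c
#include <assert.h>
#include <stdlib.h>

/* The '<' operator's job: a total order for qsort(). */
static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* GROUP BY in miniature: sort by '<', then start a new group
 * whenever the '=' test on adjacent values fails. */
static int
count_groups(int *vals, int n)
{
    int groups;

    qsort(vals, n, sizeof(int), cmp_int);
    groups = (n > 0);
    for (int i = 1; i < n; i++)
        if (!(vals[i] == vals[i - 1]))  /* adjacent '=' failed: new group */
            groups++;
    return groups;
}
```

If '=' were some unrelated operator that didn't match the sort order (say, equality modulo 10 while sorting plain ints), equal values could land non-adjacently and the count would be wrong — which is the failure mode the tightened operator selection is meant to rule out.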
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
On Fri, 29 Nov 2002, Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Has there been a release? It certainly hasn't been announced in the usual places that I monitor. Marc claimed he'd put out the announcement on pgsql-announce, but that copy of the message never arrived here (it did show up on pgsql-general though). Evidently you and Vince never got it either ... After alot of searching I did find it. It wasn't exactly what one would expect a PostgreSQL release announcement to look like. Vince. -- http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Marc claimed he'd put out the announcement on pgsql-announce, but that copy of the message never arrived here (it did show up on pgsql-general though). Evidently you and Vince never got it either ... After alot of searching I did find it. It wasn't exactly what one would expect a PostgreSQL release announcement to look like. Huh? I thought it was the best one yet! The quotes, the example cases and large users, links to advocacy and HISTORY. It was excellent. A better emphasis on marketing as well as technical improvements. Chris ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
On Fri, 29 Nov 2002, Christopher Kings-Lynne wrote: Marc claimed he'd put out the announcement on pgsql-announce, but that copy of the message never arrived here (it did show up on pgsql-general though). Evidently you and Vince never got it either ... After alot of searching I did find it. It wasn't exactly what one would expect a PostgreSQL release announcement to look like. Huh? I thought it was the best one yet! The quotes, the example cases and large users, links to advocacy and HISTORY. It was excellent. A better emphasis on marketing as well as technical improvements. Glad you liked it. But that doesn't change the fact that it obscured the release to the point that many people didn't even know it was released. I found out by folks complaining about broken links. Vince. -- http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Glad you liked it. But that doesn't change the fact that it obscured the release to the point that many people didn't even know it was released. I found out by folks complaining about broken links. Hrm - the subject said it all, plus what about the first 2 paragraphs? PostgreSQL Global Development Group Announces Version 7.3 The PostgreSQL Global Development Group proudly announces the release of version 7.3 of the PostgreSQL object-relational database management system (ORDBMS). PostgreSQL, the world's most advanced open source database, provides solutions for many of the most demanding applications in use today, saving businesses and governments millions of dollars each year. Maybe the user comments can be moved until after the 7.3 feature list? Chris ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
FYI, Vince, I started reading all my email (using elm) in a special 120-column wide, 38-row xterm. There was just too much detail in those subjects that I was missing. --- Vince Vielhaber wrote: On Fri, 29 Nov 2002, Christopher Kings-Lynne wrote: Glad you liked it. But that doesn't change the fact that it obscured the release to the point that many people didn't even know it was released. I found out by folks complaining about broken links. Hrm - the subject said it all, plus what about the first 2 paragraphs? PostgreSQL Global Development Group Announces Version 7.3 The PostgreSQL Global Development Group proudly announces the release of version 7.3 of the PostgreSQL object-relational database management system (ORDBMS). PostgreSQL, the world's most advanced open source database, provides solutions for many of the most demanding applications in use today, saving businesses and governments millions of dollars each year. Maybe the user comments can be moved until after the 7.3 feature list? First things first. In pine, the announcement looked like this in the index: 17096 Nov 28 PostgreSQL Public (6733) [GENERAL] PostgreSQL Global Developm I see nothing about 7.3 there. When skimming the mailbox, that's what I see. I didn't see the actual message until AFTER went looking for it. When I did find it, this is what I saw when I opened it: - For Immediate Release November 28th, 2002 Contacts: Justin Clift [EMAIL PROTECTED] +61.3 9363 1313 (Australia) Marc Fournier [EMAIL PROTECTED] +1.902 542 0713 (Canada) - Yep, it's gotta be the best one yet. Vince. -- http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio.
---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
[HACKERS] Archive links slightly redundant
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 The archives for the following mailing lists appear to have two identical links to November 2002: admin advocacy announce bugs cygwin Here's one of them: http://archives.postgresql.org/pgsql-announce/ Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200211292251 -BEGIN PGP SIGNATURE- Comment: http://www.turnstep.com/pgp.html iD8DBQE96DXCvJuQZxSWSsgRAqtoAKCP3JZLjQj/32w4O76uQh2cMPYqWQCgnB7z p8nreyfDwT04TRiEr5KZhK8= =tEk0 -END PGP SIGNATURE- ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
On Fri, 29 Nov 2002, Bruce Momjian wrote: FYI, Vince, I started reading all my email (using elm) in a special 120 column wide, 38 row xterm. There was just too much detail in those subjects i was missing. Doesn't do me much good if too often I don't have the luxury of a large screen 'cuze I'm reading from a remote site with horrible resolution or just an 80x25 screen. --- Vince Vielhaber wrote: On Fri, 29 Nov 2002, Christopher Kings-Lynne wrote: Glad you liked it. But that doesn't change the fact that it obscured the release to the point that many people didn't even know it was released. I found out by folks complaining about broken links. Hrm - the subject said it all, plus what about the first 2 paragraphs? PostgreSQL Global Development Group Announces Version 7.3 The PostgreSQL Global Development Group proudly announces the release of version 7.3 of the PostgreSQL object-relational database management system (ORDBMS). PostgreSQL, the world's most advanced open source database, provides solutions for many of the most demanding applications in use today, saving businesses and governments millions of dollars each year. Maybe the user comments can be moved until after the 7.3 feature list? First things first. In pine, the announcement looked like this in the index: 17096 Nov 28 PostgreSQL Public (6733) [GENERAL] PostgreSQL Global Developm I see nothing about 7.3 there. When skimming the mailbox, that's what I see. I didn't see the actual message until AFTER went looking for it. When I did find it, this is what I saw when I opened it: - For Immediate Release November 28th, 2002 Contacts: Justin Clift [EMAIL PROTECTED] +61.3 9363 1313 (Australia) Marc Fournier [EMAIL PROTECTED] +1.902 542 0713 (Canada) - Yep, it's gotta be the best one yet. Vince. -- http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio. 
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Vince Vielhaber wrote: On Fri, 29 Nov 2002, Bruce Momjian wrote: FYI, Vince, I started reading all my email (using elm) in a special 120 column wide, 38 row xterm. There was just too much detail in those subjects I was missing. Doesn't do me much good if too often I don't have the luxury of a large screen 'cuze I'm reading from a remote site with horrible resolution or just an 80x25 screen. Would a better subject line, fitting in the smaller default width, have been something like: PostgreSQL 7.3 Released! by the PostgreSQL Global Development Group So hopefully it would look something like: 17096 Nov 28 PostgreSQL Public (6733) [GENERAL] PostgreSQL 7.3 Released! b Am thinking that regardless of the wording of the release, it doesn't hurt us to do simple things like re-arranging the Subject line to make things a bit more obvious. ? Regards and best wishes, Justin Clift -- My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there. - Indira Gandhi
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
On Sat, 30 Nov 2002, Justin Clift wrote: Vince Vielhaber wrote: On Fri, 29 Nov 2002, Bruce Momjian wrote: FYI, Vince, I started reading all my email (using elm) in a special 120 column wide, 38 row xterm. There was just too much detail in those subjects I was missing. Doesn't do me much good if too often I don't have the luxury of a large screen 'cuze I'm reading from a remote site with horrible resolution or just an 80x25 screen. Would a better subject line, fitting in the smaller default width, have been something like: PostgreSQL 7.3 Released! by the PostgreSQL Global Development Group So hopefully it would look something like: 17096 Nov 28 PostgreSQL Public (6733) [GENERAL] PostgreSQL 7.3 Released! b Am thinking that regardless of the wording of the release, it doesn't hurt us to do simple things like re-arranging the Subject line to make things a bit more obvious. Yes it would. But while on the subject, why did you only mention its availability being on the advocacy site? Are the ftp and website mirrors now irrelevant to you? Vince. -- http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio.
Re: [HACKERS] Locale-dependent case conversion in {identifier}
Nicolai Tufar [EMAIL PROTECTED] writes: So I have changed the lower-case conversion code in scan.l to make it purely ASCII-based, as in keywords.c. Mini-patch is given below. Rather than offering a patch, you need to convince us why our reading of the SQL standard is wrong. ("Oracle does it that way" is not an argument that will carry a lot of weight.) SQL99 states that identifier case conversions are done on the basis of the Unicode upper/lower case equivalences, so it seems clear that they intend more than ASCII-only conversion for identifiers. Locale-based conversion might not be an exact implementation of the spec, but it's surely closer than ASCII-only. regards, tom lane
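The distinction Tom draws can be sketched in Python. Here `ascii_downcase` is a hypothetical stand-in for the keywords.c-style ASCII-only folding in Nicolai's mini-patch (the name is invented for illustration), while Python's own str.lower() follows the Unicode default case mapping that SQL99 points toward:

```python
def ascii_downcase(ident):
    """Fold only A-Z to a-z, leaving every non-ASCII letter untouched,
    in the spirit of the ASCII-only folding discussed above."""
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in ident
    )

# For plain ASCII identifiers the two behaviors agree:
assert ascii_downcase("MyTable") == "mytable" == "MyTable".lower()

# But an identifier containing Turkish dotted capital I (U+0130) is
# left alone by the ASCII-only fold, while a Unicode/locale-aware fold
# would map it to some form of lowercase i:
assert ascii_downcase("GRUP_ID\u0130") == "grup_id\u0130"  # İ unchanged
```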
Re: [HACKERS] Planning for improved versions of IN/NOT IN
Tom Lane wrote: I've been thinking about how to improve the performance of queries using WHERE x IN (subselect) and WHERE x NOT IN (subselect). In the existing implementation, the subquery result is rescanned to look for a match to x each time the WHERE clause is executed; this essentially makes it work like a nestloop join of the stupidest variety. (We do stick a Materialize node atop the subselect if it looks complicated, but that's not a big help in typical cases.) I've thought of three alternative implementations that would perform better in various scenarios. Each would be relatively simple to implement; the problem I'm having is figuring out how to get the planner to choose the best one. The alternatives are basically: [abbreviated] 1. Add FROM item 2. Hash-based 3. Inner indexscan 4. the existing implementation The difficulty is that it's not clear how to choose one of these four ways, short of planning the *entire* query from scratch all four ways :-(. This seems pretty grim. Approaches #2 and #3 could be handled as local transformations of the WHERE clause, but we couldn't choose which to use very well if we don't yet know how many outer rows the WHERE clause will be executed for. Approach #1 is really planning a completely different query --- one with an extra FROM-item --- and there's not going to be all that much commonality in the computations, unless we restrict where the added FROM-item can be joined to the others, which'd more or less defeat the purpose. Anyone see a way around this difficulty? How about starting with a rule-based method to make the choice? 1. If uncorrelated: use the hash-based approach - ISTM this might address a large percentage of the problem cases -- it could even handle the IN (list-of-scalars) case. Could it fall back to a tuplesort/binary-search for the too-many-to-hash-in-memory case? 2. If correlated: use an inner indexscan 3. If you come up with a pattern where none of the approaches produces a correct answer, use the existing implementation You could always get fancier later if needed, but something along these lines would be a great start. Joe
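Joe's uncorrelated hash-based case can be sketched like so (a Python sketch with illustrative names, not PostgreSQL executor code): the subselect runs once, its result goes into a hash table, and each outer row becomes a single probe instead of a full rescan:

```python
def in_nestloop(outer_rows, run_subselect):
    # Current behavior: the subquery result is effectively rescanned
    # for every outer row the WHERE clause is evaluated against.
    return [x for x in outer_rows if any(x == s for s in run_subselect())]

def in_hashed(outer_rows, run_subselect):
    # Proposed for the uncorrelated case: execute the subselect once,
    # build a hash table, then probe it per outer row in O(1).
    probe = set(run_subselect())
    return [x for x in outer_rows if x in probe]

outer = [1, 3, 5, 7]
sub = lambda: iter([3, 7, 9])   # stand-in for running the subselect
assert in_nestloop(outer, sub) == in_hashed(outer, sub) == [3, 7]
```

Both give the same rows; the hashed version simply trades one pass over the subquery result for per-row probes.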
Re: [HACKERS] Postgres 7.3 announcement on postgresql.org
Vince Vielhaber wrote: snip Yes it would. But while on the subject, why did you only mention its availability being on the advocacy site? *We* mentioned its availability being on the Advocacy site because it gives people a single place to go that has both PostgreSQL itself *and* a site that's dedicated to giving a clear list of features, advantages, case studies, etc. This Press Release was created to give a clear path to PostgreSQL usage for new users (i.e. initial interest - place to find out about it - advantages, case studies, etc). Are the ftp and website mirrors now irrelevant to you? Not sure what you mean here. Regards and best wishes, Justin Clift -- My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there. - Indira Gandhi
Re: [HACKERS] Planning for improved versions of IN/NOT IN
Joe Conway wrote: Tom Lane wrote: I've been thinking about how to improve the performance of queries using WHERE x IN (subselect) and WHERE x NOT IN (subselect). How about starting with a rule-based method to make the choice? 1. If uncorrelated: use the hash-based approach - ISTM this might address a large percentage of the problem cases -- it could even handle the IN (list-of-scalars) case. Could it fall back to a tuplesort/binary-search for the too-many-to-hash-in-memory case? 2. If correlated: use an inner indexscan 3. If you come up with a pattern where none of the approaches produces a correct answer, use the existing implementation You could always get fancier later if needed, but something along these lines would be a great start. I'm curious whether any of the rewriting of EXISTS and NOT EXISTS would address the problem described by Date: http://www.firstsql.com/iexist.htm Mike Mascari [EMAIL PROTECTED]
Re: [HACKERS] Planning for improved versions of IN/NOT IN
Mike Mascari [EMAIL PROTECTED] writes: I'm curious whether any of the rewriting of EXISTS and NOT EXISTS would address the problem described by Date: http://www.firstsql.com/iexist.htm We are not here to redefine the SQL spec ... and especially not to eliminate its concept of NULL, which is what Date would really like ;-) The above-quoted screed is based on a claimed logical equivalence between NOT EXISTS() and NOT IN() that is just plain wrong when you consider the possibility of NULLs. Rather than "FirstSQL correctly processes this query", you should read "FirstSQL deliberately violates the SQL spec". (There may be grounds to argue that the spec behavior could be improved, but that's an argument to be making to the standards committee, not here.) regards, tom lane
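Tom's point that the claimed equivalence breaks down under NULLs can be made concrete with SQL's three-valued logic. In this sketch Python's None stands in for SQL NULL and the function names are illustrative:

```python
# x NOT IN (S) is NOT (x = s1 OR x = s2 OR ...). A NULL in S makes a
# comparison UNKNOWN, so NOT IN can never evaluate to TRUE for an
# unmatched x, while NOT EXISTS simply checks for matching rows.

NULL = None

def eq3(a, b):
    # SQL '=': UNKNOWN (None) if either operand is NULL
    return None if a is NULL or b is NULL else a == b

def in_pred(x, s):
    saw_unknown = False
    for v in s:
        e = eq3(x, v)
        if e is True:
            return True
        if e is None:
            saw_unknown = True
    return None if saw_unknown else False

def not_in(x, s):
    r = in_pred(x, s)
    return None if r is None else not r

def not_exists_match(x, s):
    # NOT EXISTS (SELECT ... WHERE s.v = x): UNKNOWN rows don't count
    return not any(eq3(x, v) is True for v in s)

sub = [1, 2, NULL]
assert not_in(3, sub) is None            # UNKNOWN: the row is filtered out
assert not_exists_match(3, sub) is True  # the row survives
```

With the NULL removed from the subselect the two predicates agree again, which is exactly why the equivalence looks plausible until NULLs enter the picture.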
Re: [HACKERS] 7.4 Wishlist
On Fri, Nov 29, 2002 at 10:51:26AM -0800, Christopher Kings-Lynne wrote: Just out of interest, if someone was going to pay you to hack on Postgres for 6 months, what would you like to code for 7.4? Well, nobody is paying me, but I want to: - fix the btree problem of leaking unused pages (I think I'm getting near; I just haven't had free time during the last month). This one is a must for me. - try different regexp algorithms and compare efficiency: both Henry Spencer's new code for Tcl, and the Baeza-Navarro shift-or approach (which can be much faster than traditional regex engines). (Do people care about allowing search with errors, similar to what agrep and nrgrep do?) -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) "I will never trust a traitor. Not even if I created the traitor myself" (Baron Vladimir Harkonnen)
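For reference, the shift-or family Alvaro mentions encodes the match automaton in the bits of a machine word. A minimal exact-match sketch follows (this is not the Tcl or nrgrep code, and it omits the search-with-errors extension):

```python
def shift_or_find(text, pattern):
    """Exact substring search via shift-or (bitap); returns the first
    match offset or -1. Bit i of the state is 0 whenever pattern[:i+1]
    matches a suffix of the text scanned so far. Python ints are
    arbitrary precision, so -1 serves as an all-ones word."""
    m = len(pattern)
    # Per-character masks: bit i is cleared where pattern[i] == c
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, -1) & ~(1 << i)
    state = -1  # all ones: nothing matched yet
    for j, c in enumerate(text):
        state = (state << 1) | masks.get(c, -1)
        if state & (1 << (m - 1)) == 0:
            return j - m + 1
    return -1

assert shift_or_find("nrgrep and agrep", "agrep") == 11
assert shift_or_find("abc", "zzz") == -1
```

Each input character costs one shift, one OR, and one test, independent of how many automaton states are active, which is where the speed over backtracking engines comes from.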
Re: [HACKERS] Locale-dependent case conversion in {identifier}
By no means would I try to convince you that your reading of the SQL standards is wrong. What I am trying to say is that the Turkish alphabet is broken beyond repair, and since there is absolutely no way to change our alphabet, we may have to code a workaround. So I do not claim that your code is wrong; it is behaving according to the specification. But unfortunately the folks behind SQL99 were probably not aware of the woes of the Turkish I. The very special case of the letter I in Turkish is not only PostgreSQL's problem: many Java programs have failed miserably trying to open files with I's in their pathnames. So basically, there are two letters I in Turkish. One has a dot on top and the other does not. The one with the dot always has it, and the one without never does. Simple. The problem is with the standard Latin I: why does small i have a dot when capital I does not? The standard Turkish conversion is Lower: I -> ı and İ -> i; Upper: ı -> I and i -> İ. (These characters may not be displayed correctly in your mail reader.) Historically, programs that operate in the Turkish locale have chosen to hardcode the capitalisation of i in system messages and identifier names like this: Lower: I -> i and İ -> i; Upper: ı -> I and i -> I. With this, no matter which kind of I you use in names, it always ends up a valid ASCII character. Would it be acceptable if I submit a patch that applies this special logic in src/backend/parser/scan.l when the locale is tr_TR? For many folks, setting the locale to Turkish would render their database unusable: god forbid your SQL has a column name written in capitals that includes an I; it will not work. So I deeply believe the PostgreSQL community has to provide a workaround for this problem. What should I do? Best regards, Nick Tom Lane wrote: Nicolai Tufar [EMAIL PROTECTED] writes: So I have changed the lower-case conversion code in scan.l to make it purely ASCII-based, as in keywords.c. Mini-patch is given below.
Rather than offering a patch, you need to convince us why our reading of the SQL standard is wrong. ("Oracle does it that way" is not an argument that will carry a lot of weight.) SQL99 states that identifier case conversions are done on the basis of the Unicode upper/lower case equivalences, so it seems clear that they intend more than ASCII-only conversion for identifiers. Locale-based conversion might not be an exact implementation of the spec, but it's surely closer than ASCII-only. regards, tom lane
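The hardcoded tr_TR mapping Nicolai describes could be sketched as follows (an illustration of the proposed folding only, not the actual scan.l patch; function names are invented):

```python
# Nicolai's proposed tr_TR workaround: fold every variant of I to plain
# ASCII so identifier names stay valid regardless of which I was typed.
# Lower: I -> i and İ -> i;  Upper: ı -> I and i -> I.

TR_LOWER = {"I": "i", "\u0130": "i"}  # İ (dotted capital I)
TR_UPPER = {"\u0131": "I", "i": "I"}  # ı (dotless small i)

def tr_identifier_lower(ident):
    return "".join(TR_LOWER.get(c, c.lower() if c.isascii() else c)
                   for c in ident)

def tr_identifier_upper(ident):
    return "".join(TR_UPPER.get(c, c.upper() if c.isascii() else c)
                   for c in ident)

assert tr_identifier_lower("MUSTER\u0130") == "musteri"  # İ folds to i
assert tr_identifier_upper("m\u0131d") == "MID"          # ı folds to I
```

Every variant of I collapses to ASCII i or I, which is exactly the "always ends up a valid ASCII character" property the message asks for.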