Re: [HACKERS] Current CVS parallel test lock
Do you have a way to revert to the old installation to check whether the checks fail again? It might be useful to track down exactly what happened. It seems wrong that a currently-installed version should have adverse effects on a just-built version's regression tests. No :) ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Current CVS parallel test lock
On Wed, Mar 09, 2005 at 03:43:38PM +0800, Christopher Kings-Lynne wrote: Hrm, I just did a gmake install; gmake installcheck - that worked fine. Then I did gmake check again and now that works fine... It must have been picking up something from my previously installed pgsql. Do you have a way to revert to the old installation to check whether the checks fail again? It might be useful to track down exactly what happened. It seems wrong that a currently-installed version should have adverse effects on a just-built version's regression tests. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
[HACKERS] Cost of XLogInsert CRC calculations
On Mon, Mar 07, 2005 at 11:53:59PM +, Simon Riggs wrote: On Mon, 2005-03-07 at 09:39 -0500, Tom Lane wrote: Mark Cave-Ayland [EMAIL PROTECTED] writes: [skipped] Well, we're using the CRC in 3 separate places... (1) for xlog records (2) for complete blocks copied to xlog (3) for control files For (1), records are so short that probably CRC16 would be sufficient without increasing the error rate noticeably. I think I'd like to keep (3) at CRC64... it's just too important. Plus that's slightly less code to change. My money is on (2) being the source of most of that run-time anyway, since when we enclose a whole block it takes a lot longer to CRC64 all BLCKSZ bytes than it would to CRC a single record in (1). But of course, longer stretches of data need better error detection rates. Well, if there is no need for error recovery, then what about using a simpler algorithm, such as a plain checksum? Perhaps it could even be attached to one of the required memcpy calls.
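To make the trade-off above concrete, here is a minimal sketch (not PostgreSQL's actual code — the xlog code uses a table-driven CRC64, and `sum_checksum`/`crc32_simple` are names made up for this example) contrasting a plain additive checksum with a bitwise CRC-32:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Simple additive checksum: very cheap, but blind to byte reordering. */
static uint32_t sum_checksum(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), illustrative only. */
static uint32_t crc32_simple(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}
```

Transposing two bytes in a buffer leaves the additive checksum unchanged but changes the CRC, which is exactly the kind of error-detection gap that matters more for whole BLCKSZ-sized blocks than for short records.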
Re: [HACKERS] About b-tree usage
On Tue, 2005-03-08 at 15:30 +0200, Ioannis Theoharis wrote: let me, i have turned enable_seqscan to off, in order to discourage the optimizer from choosing seq_scan whenever an index_scan can be used. But in this case, why doesn't the optimizer choose seq_scan (discouraging is different from preventing)? As Michael Paesold pointed out, enable_seqscan=off effectively means that it will never choose a seqscan when there exists an alternative. In many cases I need only a small fraction of rows to be retrieved. But this extreme case is a real scenario (not the most frequent, but real). Returning all the rows in a large table is a real scenario? I would expect you to at least use a cursor. Keep in mind that libpq has to store all those rows in local client memory, so a cursor is a good idea if that is the case. And regardless, if you are returning all the rows in a table, the absolute fastest way possible is a seq scan. An index is much slower because it does way more work, and the disk accesses are random rather than sequential. I try to find a way to achieve good performance even for the extreme case. Is there any way? The extreme case you provided is a SELECT without a WHERE. We already know that PostgreSQL is executing that as efficiently as possible when enable_seqscan=on. Why not send me a simple script that generates some data similar to what you want to access, and then the query you'd like to perform on that data. We might find that PostgreSQL is already executing the query as efficiently as possible; in fact I suspect that's the case (although perhaps some tuning would help). In short, we shouldn't try to use indexes for all situations; they are inherently worse for some situations. That's why PostgreSQL has both seqscans and indexes, and an intelligent planner. ps. In the bibliography, there is a different alternative for indices. Besides the simple approach of (attr_val, rid), there is the alternative (attr_val, set of rids).
The second attaches to each discrete attr_val the set of rids of all rows with the same attr_val. Is this alternative taken into account in postgres? I don't entirely understand what you're asking. It seems like you're talking about a relatively minor implementation issue, and if it does make a difference, it would probably not be very much. Perhaps rephrase your question? On Mon, 7 Mar 2005, Jeff Davis wrote: In that case, sequential scan is faster, but perhaps the planner doesn't know that ahead of time. Try turning on more statistics if you haven't already, and then run ANALYZE again. If the planner sees a range, perhaps it assumes that it is a highly selective range, when in fact it consists of all of the tuples. Also, make sure enable_seqscan is true (in case you turned it off for testing or something and forgot). A seqscan is usually faster when a large proportion of the tuples are returned because: (1) It uses sequential I/O, whereas an index might access tuples in a random order. (2) It doesn't have to read the index's disk pages at all. I suspect you don't need to return all the tuples in the table. If you include the details of a real scenario perhaps the people on the list could be more helpful.
Re: [HACKERS] fool-toleranced optimizer
On Wed, 2005-03-09 at 11:02 +1100, Neil Conway wrote: Oleg Bartunov wrote: I just noticed a little optimizer problem - in the second query there is an unused 'tycho t2' table alias which gets the backend buried. It's not an unused table alias, it is specifying the cartesian product of `tycho' with itself. I don't see how this is an optimizer problem: it's a perfectly legitimate query, albeit one that is unlikely to execute very quickly. Turn this thought around a bit and the request makes sense. Oleg is saying that the optimizer doesn't protect against foolish SQL requests. His query is an example of a foolishly written query. It is a reasonably common SQL mistake to inadvertently request a cartesian product join when that was not actually desired. This is mostly prevalent in Data Warehouse situations where people are attempting to request complex result sets. It seems reasonable that there might be a GUC such as enable_cartesian = on (by default) If an admin felt that this was a problem, they could disable cartesian joins for their novice users only, or perhaps across the whole system. If enable_cartesian = off, then queries with cartesian product joins would be made to fail, which should be easy to detect in early stages of optimization. So, Oleg, for me, the request makes sense, though somebody would need to code it... Best Regards, Simon Riggs
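For what it's worth, the "easy to detect in early stages of optimization" claim can be sketched as a join-graph connectivity check: treat relations as nodes and join quals as edges, and flag the query when the graph is disconnected. This is a hypothetical standalone sketch (the function and representation are invented for illustration, not PostgreSQL planner code):

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RELS 32

static int parent[MAX_RELS];        /* union-find forest over relations */

static int find(int x)
{
    while (parent[x] != x)
        x = parent[x] = parent[parent[x]];  /* path halving */
    return x;
}

/* Relations are numbered 0..nrels-1; each join qual links a pair of
 * relations.  Returns true if some relation can only be combined with
 * the rest via a cartesian product (i.e. the join graph is disconnected). */
static bool has_cartesian_join(int nrels, int (*quals)[2], int nquals)
{
    for (int i = 0; i < nrels; i++)
        parent[i] = i;
    for (int i = 0; i < nquals; i++)
        parent[find(quals[i][0])] = find(quals[i][1]);
    for (int i = 1; i < nrels; i++)
        if (find(i) != find(0))
            return true;
    return false;
}
```

A query like Oleg's, where `tycho t2` appears in FROM but in no join clause, shows up as a relation node with no edges, so the check fires before any plan is built.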
Re: [HACKERS] fool-toleranced optimizer
Simon Riggs wrote: Oleg is saying that the optimizer doesn't protect against foolish SQL requests. His query is an example of a foolishly written query. IMHO calling this a foolishly written query is completely arbitrary. I can imagine plenty of applications for which a cartesian join makes sense. In this case the user didn't write the query they meant to write -- but it is surely hopeless to prevent that in the general case :) It seems reasonable that there might be a GUC such as enable_cartesian = on (by default) I think the bar for adding a new GUC ought to be significantly higher than that. In any case, when this problem does occur, it is obvious to the user that something is wrong, and no harm is done. Given a complex SQL query, it might take a bit of examination to determine which join clause is missing -- but the proper way to fix that is better query visualization tools (perhaps similar to RH's Visual Explain, for example). This would solve the general problem (the user didn't write the query they intended to write) rather than a very narrow subset (the user forgot a join clause and accidentally computed a cartesian product). -Neil
Re: [HACKERS] fool-toleranced optimizer
Oleg, this idea doesn't seem destined for greatness, so it might be worth adding that you can avoid the general-case problem of an incorrectly-specified-but-long-running query by using statement_timeout... On Wed, 2005-03-09 at 22:38 +1100, Neil Conway wrote: Simon Riggs wrote: Oleg is saying that the optimizer doesn't protect against foolish SQL requests. His query is an example of a foolishly written query. IMHO calling this a foolishly written query is completely arbitrary. Well, in this case foolish is defined by the person that wrote the query, as an expression of regret. I can imagine plenty of applications for which a cartesian join makes sense. Yes, which is why I discussed using a GUC, set only by those people who want to be protected *from themselves*. It's a safety harness that you could choose to put on if you wished. In this case the user didn't write the query they meant to write -- but it is surely hopeless to prevent that in the general case :) It seems reasonable that there might be a GUC such as enable_cartesian = on (by default) I think the bar for adding a new GUC ought to be significantly higher than that. Well, the point is moot until somebody writes the rest of the code anyhow. So, add it to the ideas shelf... In any case, when this problem does occur, it is obvious to the user that something is wrong, and no harm is done. Given a complex SQL query, it might take a bit of examination to determine which join clause is missing -- but the proper way to fix that is better query visualization tools (perhaps similar to RH's Visual Explain, for example). This would solve the general problem: the user didn't write the query they intended to write, rather than a very narrow subset (the user forgot a join clause and accidentally computed a cartesian product). This issue only occurs when using SQL as the user interface language, which is common when using a database in iterative or exploratory mode e.g. Data Warehousing.
If you are using more advanced BI tools then they seldom get the SQL wrong. This is not useful in a situation where people are writing SQL for a more static application. Best Regards, Simon Riggs
Re: [HACKERS] fool-toleranced optimizer
On Wed, 9 Mar 2005, Simon Riggs wrote: Oleg, this idea doesn't seem destined for greatness, so it might be worth adding that you can avoid the general-case problem of an incorrectly-specified-but-long-running query by using statement_timeout... I have no problem with that! I just wanted to make a note of such easily-made errors. On Wed, 2005-03-09 at 22:38 +1100, Neil Conway wrote: Simon Riggs wrote: Oleg is saying that the optimizer doesn't protect against foolish SQL requests. His query is an example of a foolishly written query. IMHO calling this a foolishly written query is completely arbitrary. Well, in this case foolish is defined by the person that wrote the query, as an expression of regret. I can imagine plenty of applications for which a cartesian join makes sense. Yes, which is why I discussed using a GUC, set only by those people who want to be protected *from themselves*. It's a safety harness that you could choose to put on if you wished. In this case the user didn't write the query they meant to write -- but it is surely hopeless to prevent that in the general case :) It seems reasonable that there might be a GUC such as enable_cartesian = on (by default) I think the bar for adding a new GUC ought to be significantly higher than that. Well, the point is moot until somebody writes the rest of the code anyhow. So, add it to the ideas shelf... In any case, when this problem does occur, it is obvious to the user that something is wrong, and no harm is done. Given a complex SQL query, it might take a bit of examination to determine which join clause is missing -- but the proper way to fix that is better query visualization tools (perhaps similar to RH's Visual Explain, for example). This would solve the general problem: the user didn't write the query they intended to write, rather than a very narrow subset (the user forgot a join clause and accidentally computed a cartesian product).
This issue only occurs when using SQL as the user interface language, which is common when using a database in iterative or exploratory mode e.g. Data Warehousing. If you are using more advanced BI tools then they seldom get the SQL wrong. This is not useful in a situation where people are writing SQL for a more static application. Best Regards, Simon Riggs Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression tests to fail
Dear all, After struggling for one week to integrate FreeBSD's vfprintf.c into PostgreSQL I finally gave up. It is too dependent on underlying FreeBSD system functions. To incorporate it into PostgreSQL we would need to move the vfprintf.c file itself, two dozen files from gdtoa, and half a dozen __XXtoa.c files scattered in apparently random fashion all around the FreeBSD source tree. Instead I researched some other implementations of snprintf on the web released under a license compatible with PostgreSQL's. The most suitable one I have come upon is Trio [http://daniel.haxx.se/projects/trio/]. It is distributed under an MIT-like license which, I think, will be compatible with ours. What do you think about it? Shall I abandon FreeBSD and go ahead incorporating Trio? And by the way, what is the conclusion of the snprintf() vs. pg_snprintf() and UNIX libraries discussion a week ago? Which one shall I implement? Regards, Nicolai Tufar
Re: [HACKERS] Current CVS parallel test lock
Michael Fuhr [EMAIL PROTECTED] writes: Do you have a way to revert to the old installation to check whether the checks fail again? It might be useful to track down exactly what happened. It seems wrong that a currently-installed version should have adverse effects on a just-built version's regression tests. We've seen that happen many times with shared library dependencies ... but libpq hasn't changed in the last few days AFAIR. regards, tom lane
Re: [HACKERS] Current CVS parallel test lock
Do you have a way to revert to the old installation to check whether the checks fail again? It might be useful to track down exactly what happened. It seems wrong that a currently-installed version should have adverse effects on a just-built version's regression tests. We've seen that happen many times with shared library dependencies ... but libpq hasn't changed in the last few days AFAIR. I hadn't done a gmake install for months due to being overseas... Chris
Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression
From what I recall from the conversation, I would say rename the vsnprintf and the snprintf functions in postgres to pq_vsnprintf and pq_snprintf. Define a couple of macros (in some common header, pqprintf.h?):

#define snprintf pq_snprintf
#define vsnprintf pq_vsnprintf

Then just maintain the postgres forms of printf, which have seemed to be OK except that on Win32 vsnprintf, although in the same object file, was not being used. Dear all, After struggling for one week to integrate FreeBSD's vfprintf.c into PostgreSQL I finally gave up. It is too dependent on underlying FreeBSD system functions. To incorporate it into PostgreSQL we would need to move the vfprintf.c file itself, two dozen files from gdtoa, and half a dozen __XXtoa.c files scattered in apparently random fashion all around the FreeBSD source tree. Instead I researched some other implementations of snprintf on the web released under a license compatible with PostgreSQL's. The most suitable one I have come upon is Trio [http://daniel.haxx.se/projects/trio/]. It is distributed under an MIT-like license which, I think, will be compatible with ours. What do you think about it? Shall I abandon FreeBSD and go ahead incorporating Trio? And by the way, what is the conclusion of the snprintf() vs. pg_snprintf() and UNIX libraries discussion a week ago? Which one shall I implement? Regards, Nicolai Tufar
Re: [HACKERS] fool-toleranced optimizer
Neil Conway [EMAIL PROTECTED] writes: In any case, when this problem does occur, it is obvious to the user that something is wrong, and no harm is done. I don't see why you say that. The whole complaint here is that it's *not* obvious something is wrong and there *is* damage until it's realized. If I run a query like this on a busy database backing a web site it could easily kill the web site. Or if I start this query and expect it to take an hour, then after 2-3 hours when I finally get suspicious I've just wasted 2-3 hours... Or if I add it to the list of nightly jobs I could lose all the other jobs that night that are preempted by this heavy query running for too long. Given a complex SQL query, it might take a bit of examination to determine which join clause is missing -- but the proper way to fix that is better query visualization tools (perhaps similar to RH's Visual Explain, for example). This would solve the general problem: the user didn't write the query they intended to write, rather than a very narrow subset (the user forgot a join clause and accidentally computed a cartesian product). I'm unconvinced any tool can make humans infallible. -- greg
Re: [HACKERS] One vacuum full is not enough.
On a fine day (Tuesday, 8 March 2005, 00:52 +0100), Gaetano Mendola wrote: Hi all, running a 7.4.5 it happened to me with another table that a single vacuum full was not freeing enough pages; here is the verbose vacuum full, as you can see only at the end: truncated 8504 to 621 pages. I use pg_autovacuum and it's not enough. I'll schedule again a nightly vacuum full. You may have too few fsm pages, so new inserts/updates don't use all the pages freed by vacuums. -- Hannu Krosing [EMAIL PROTECTED]
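Hannu's point is that when the free space map is too small, pages freed by plain VACUUM are simply forgotten, so the table keeps growing until a VACUUM FULL. A sketch of the relevant postgresql.conf knobs from that era (the values shown are illustrative, not recommendations — the right numbers depend on database size):

```
# postgresql.conf -- free space map sizing (values are illustrative)
max_fsm_pages = 100000      # must exceed the total pages with reclaimable
                            # space; VACUUM VERBOSE output hints at the need
max_fsm_relations = 1000    # number of tables/indexes tracked in the FSM
```

Both settings require a server restart to take effect in this release series.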
[HACKERS] pgpool question
I'm experimenting with pgpool 2.51 on my Linux box running two postgresql backends: pg74:5432 and pg801:5433. I configured pgpool to use pg74:5432 as the primary backend and pg801:5433 as the second one. Pgpool is running on the default port () and I configured my web application to use it, so I could start/stop backends without disturbing the client (web browser). When I stop the primary backend (pg74:5432) pgpool switches to the backend: failover from (5432) to (5433) done but when I start the primary and stop the secondary backend, pgpool never switches back to the primary backend as expected! I see a bogus message like: starting failover from (5433) to (5433) What am I doing wrong? Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Re: [HACKERS] pgpool question
Oleg Bartunov wrote: I'm experimenting with pgpool 2.51 on my Linux box running two postgresql backends: pg74:5432 and pg801:5433. I configured pgpool to use pg74:5432 as the primary backend and pg801:5433 as the second one. Pgpool is running on the default port () and I configured my web application to use it, so I could start/stop backends without disturbing the client (web browser). When I stop the primary backend (pg74:5432) pgpool switches to the backend: failover from (5432) to (5433) done but when I start the primary and stop the secondary backend, pgpool never switches back to the primary backend as expected! I see a bogus message like: starting failover from (5433) to (5433) What am I doing wrong? I don't think anything. I could be wrong, but my understanding is that if the primary goes down, you have to restart pgpool after the primary comes back up. It doesn't toggle back and forth between primary and secondary when necessary; it only goes primary to secondary. I played with pgpool for a while and came up with effectively the same confused question. -- Jeff Hoffmann [EMAIL PROTECTED]
Re: [HACKERS] pgpool question
On Wed, 9 Mar 2005, Jeff Hoffmann wrote: Oleg Bartunov wrote: I'm experimenting with pgpool 2.51 on my Linux box running two postgresql backends: pg74:5432 and pg801:5433. I configured pgpool to use pg74:5432 as the primary backend and pg801:5433 as the second one. Pgpool is running on the default port () and I configured my web application to use it, so I could start/stop backends without disturbing the client (web browser). When I stop the primary backend (pg74:5432) pgpool switches to the backend: failover from (5432) to (5433) done but when I start the primary and stop the secondary backend, pgpool never switches back to the primary backend as expected! I see a bogus message like: starting failover from (5433) to (5433) What am I doing wrong? I don't think anything. I could be wrong, but my understanding is that if the primary goes down, you have to restart pgpool after the primary comes back up. It doesn't toggle back and forth between primary and secondary when necessary; it only goes primary to secondary. I played with pgpool for a while and came up with effectively the same confused question. Seems like limited functionality. But then I don't understand the switchover options (the [-s {m[aster]|s[econdary]}] switch). What's '-s m' for? Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Re: [HACKERS] fool-toleranced optimizer
Simon, Neil, all: IMHO calling this a foolishly written query is completely arbitrary. I can imagine plenty of applications for which a cartesian join makes sense. In this case the user didn't write the query they meant to write -- but it is surely hopeless to prevent that in the general case :) Hey, this reminds me, it's about time for us to set ADD_MISSING_FROM=FALSE as the default, for 8.1, yes? When we added the option in 7.4, it was with the expectation of changing the default. The reason this is relevant is that a missing FROM clause is a frequent cause of cartesian joins, because a user mixed up their table aliases. -- Josh Berkus Aglio Database Solutions San Francisco
Re: [HACKERS] pgpool question
Oleg Bartunov wrote: On Wed, 9 Mar 2005, Jeff Hoffmann wrote: Oleg Bartunov wrote: I'm experimenting with pgpool 2.51 on my Linux box running two postgresql backends: pg74:5432 and pg801:5433. I configured pgpool to use pg74:5432 as the primary backend and pg801:5433 as the second one. Pgpool is running on the default port () and I configured my web application to use it, so I could start/stop backends without disturbing the client (web browser). When I stop the primary backend (pg74:5432) pgpool switches to the backend: failover from (5432) to (5433) done but when I start the primary and stop the secondary backend, pgpool never switches back to the primary backend as expected! I see a bogus message like: starting failover from (5433) to (5433) What am I doing wrong? I don't think anything. I could be wrong, but my understanding is that if the primary goes down, you have to restart pgpool after the primary comes back up. It doesn't toggle back and forth between primary and secondary when necessary; it only goes primary to secondary. I played with pgpool for a while and came up with effectively the same confused question. Seems like limited functionality. But then I don't understand the switchover options (the [-s {m[aster]|s[econdary]}] switch). What's '-s m' for? That was exactly my question so I sent a message on the pgpool mailing list. I was trying to set up a managed-downtime system where I would switch from master to secondary, update the master, switch back to master, then update the secondary. My plan was to use the -s switch to do that, but I could only switch from master to secondary; I couldn't switch back to master using that switch. I was told that I'd have to restart pgpool to get back to the master, which is effectively what your question was about. I'm assuming the same mechanism is at work in both cases. -- Jeff Hoffmann [EMAIL PROTECTED]
[HACKERS] Runtime accepting build discrepancies
Scenario: A user downloads a pre-built PostgreSQL 7.4.7 from somewhere and a pre-built PL/Java distro from gborg. He gets everything running but suddenly encounters problems with the timetz type. PL/Java apparently returns the time as zero. The problem is caused by the postgresql binary being built using --enable-integer-datetimes whereas the PL/Java binary is not. The dynamic loader doesn't detect this, and I bet there's a ton of combinations that will link just fine but perhaps crash (badly) at runtime. I would like to detect discrepancies like this during runtime somehow. I feel that it's either that or stop providing pre-built binaries altogether. I realize that I can't be the only one with this problem. How is this normally handled? Regards, Thomas Hallgren
Re: [HACKERS] Runtime accepting build discrepancies
Thomas Hallgren [EMAIL PROTECTED] writes: A user downloads a pre-built PostgreSQL 7.4.7 from somewhere and a pre-built PL/Java distro from gborg. He gets everything running but suddenly encounters problems with the timetz type. PL/Java apparently returns the time as zero. The problem is caused by the postgresql binary being built using --enable-integer-datetimes whereas the PL/Java binary is not. Why is PL/Java dependent on the internal representation of any particular datatype? Seems like this is a symptom of bad PL design more than anything else. The dynamic loader doesn't detect this, and I bet there's a ton of combinations that will link just fine but perhaps crash (badly) at runtime. I would like to detect discrepancies like this during runtime somehow. I feel that it's either that or stop providing pre-built binaries altogether. I realize that I can't be the only one with this problem. How is this normally handled? If you want you can look into pg_control to see how the database is set up. regards, tom lane
Re: [HACKERS] Runtime accepting build discrepancies
Tom Lane wrote: Why is PL/Java dependent on the internal representation of any particular datatype? Seems like this is a symptom of bad PL design more than anything else. I didn't see any other way of doing it short of using string conversions. That doesn't seem very optimal. Java's internal representation of time is milliseconds, so I have code in place that looks like this (t in this case is a TimeADT):

#ifdef HAVE_INT64_Time
    mSecs = t / 1000;                        /* Convert to millisecs */
    if(tzAdjust)
        mSecs += Timestamp_getCurrentTimeZone() * 1000; /* Adjust from local time to UTC */
#else
    if(tzAdjust)
        t += Timestamp_getCurrentTimeZone(); /* Adjust from local time to UTC */
    t *= 1000.0;                             /* Convert to millisecs */
    mSecs = (jlong)floor(t);
#endif

I'm of course interested in improving it. Especially if you consider this bad PL design. What do you suggest I do instead? The dynamic loader doesn't detect this and I bet there's a ton of combinations that will link just fine but perhaps crash (badly) in runtime. I would like to detect discrepancies like this during runtime somehow. I feel that it's either that or stop providing pre-built binaries altogether. I realize that I can't be the only one with this problem. How is this normally handled? If you want you can look into pg_control to see how the database is set up. That would cover this. Thanks (I'd still appreciate an alternative suggestion on the above though). Regards, Thomas Hallgren
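The two branches boil down to a unit conversion that differs by build: the integer-datetimes build stores the time as a 64-bit microsecond count, the float build as a double counting seconds. A standalone sketch of the two conversions (function names invented here; the tz adjustment is omitted, and `jlong` is replaced by `int64_t`):

```c
#include <assert.h>
#include <stdint.h>

/* Integer-datetimes build: TimeADT is an int64 count of microseconds. */
static int64_t time_to_millis_int(int64_t t)
{
    return t / 1000;                 /* microseconds -> milliseconds */
}

/* Float-datetimes build: TimeADT is a double count of seconds. */
static int64_t time_to_millis_float(double t)
{
    /* the original uses floor(); plain truncation matches floor()
     * for the non-negative values a time-of-day takes */
    return (int64_t) (t * 1000.0);   /* seconds -> milliseconds */
}
```

If the PL binary picks the wrong branch for the server it is loaded into, it reinterprets the other representation's bits, which is exactly the silent corruption described in the scenario above.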
Re: [HACKERS] BUG #1528: Rows returned that should be excluded by WHERE clause
I wrote: I think the problem can be expressed as:

regression=# select 2 as id, max(b) from t2 having 2 = 1;
 id | max
----+-----
  2 |
(1 row)

the issue is clearly that the known-false HAVING clause is pushed down inside the aggregation, as though it were WHERE. The existing code pushes down HAVING to WHERE if the clause contains no aggregates, but evidently this is too simplistic. What are the correct conditions for pushing down HAVING clauses to WHERE? After reading the spec a little, I think that we have oversimplified our handling of aggregate-free HAVING clauses. If you look in planner.c you'll find that such a clause is converted into a WHERE clause, but this is not what the spec says to do, and you can tell the difference in cases like the above. What the spec actually says, or at least implies, is that a HAVING clause is to be evaluated only once per group --- where the group is the whole table if there's no GROUP BY clause. The group is to be discarded if the HAVING clause doesn't return true. SQL92 7.8:

1) Let T be the result of the preceding from clause, where clause, or group by clause. If that clause is not a group by clause, then T consists of a single group and does not have a grouping column.

2) The search condition is applied to each group of T. The result of the having clause is a grouped table of those groups of T for which the result of the search condition is true.

So it's clear that what the above case should return is a grouped table having no groups ... ie, no rows out. What we are actually returning is one group containing no rows, which is visibly different because of the presence of the aggregate function in the SELECT list. There are really four cases to think about, depending on whether the query has GROUP BY and on whether it has any aggregates outside the HAVING clause:

1. No GROUP BY, no aggregates

Per spec, the HAVING clause should be evaluated once and either we return the whole input or none of it.
Since there are no grouped columns and (by assumption) no aggregates in the HAVING clause, the HAVING clause must in fact be variable-free, ie, it's a pseudoconstant clause. (Only pseudoconstant, because it might contain outer-level variables or volatile functions.) I think the correct implementation in this case is to generate a gating Result node with the HAVING clause as a one-time filter, so that we don't evaluate any of the query if the HAVING is false. The current code gets this almost right: it will make a variable-free WHERE clause into a Result gating condition *if it contains no volatile functions*. So it's wrong for the volatile function case but right otherwise. 2. GROUP BY, no aggregates In this case the HAVING clause might contain references to the grouping columns. It is legitimate to push down the HAVING to become WHERE, but *only* if it doesn't contain any volatile functions --- otherwise it might be possible to tell that the HAVING clause was executed more than once. It would be useful to push down the HAVING if, for example, it could become an indexscan qualifier. However if the HAVING condition is expensive to compute (eg it contains a subselect) we'd probably be better off not to push it into WHERE, but to arrange to evaluate it only once per group. Right now the executor cannot support testing such a condition, but I think it would be easy enough to improve nodeGroup.c to allow testing a qual condition for each group. 3. No GROUP BY, has aggregates As in case 1, the HAVING clause must be variable-free, so the best implementation would be to put it into a gating Result node. It would be correct to treat it the same way as we do for a HAVING clause containing aggregates (ie, attach it as a qual condition to the Agg plan node) --- but that would mean computing and throwing away the aggregate result when the HAVING fails, when we could skip computing it altogether. 4. 
GROUP BY and has aggregates This is really the same as case 2: we could push down the HAVING condition if it contains no volatile functions, but unless it is cheap to evaluate we are probably best off to attach it as a qual condition to the Agg node, ie, evaluate it only once per group. The only difference is that we don't need an executor fix to support this, since Agg does quals already. So, aside from the originally reported bug, there are two other problems in this logic: it isn't ensuring that volatile functions will be evaluated only once per group, and it isn't considering evaluation cost in deciding whether a clause that could be converted to WHERE should be or not. I haven't yet tried to make a patch that fixes all of these things. It'll likely come out complex enough that we don't want to back-patch it into 8.0 or before. If so, I'll try to make a simpler variant that fixes the semantic bugs but doesn't try to be smart about evaluation cost. Comments?
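The zero-groups-vs-one-empty-group distinction above is easy to miss, so here is a minimal sketch of it in plain Python (illustrative only, not PostgreSQL internals; all helper names are invented). It contrasts evaluating HAVING once per group with the buggy push-down into WHERE, for the no-GROUP-BY query from the bug report:

```python
# Illustrative sketch only (plain Python, not PostgreSQL internals; all
# names invented): the difference between evaluating HAVING once per
# group and pushing it down into WHERE, for the no-GROUP-BY query
# "select 2 as id, max(b) from t2 having 2 = 1".

def having_per_group(rows, having):
    # Spec behavior (SQL92 7.8): the whole input is one group; a false
    # HAVING discards the group, so zero rows come out.
    group = list(rows)
    if not having(group):
        return []
    return [(2, max(group, default=None))]   # SELECT 2 AS id, max(b)

def having_pushed_to_where(rows, clause):
    # Buggy pushdown: the clause filters rows like WHERE, leaving one
    # empty group that still produces a row with a NULL max.
    filtered = [r for r in rows if clause(r)]
    return [(2, max(filtered, default=None))]

false_having = lambda *_: (2 == 1)           # the known-false HAVING 2 = 1
print(having_per_group([1, 5, 3], false_having))        # []
print(having_pushed_to_where([1, 5, 3], false_having))  # [(2, None)]
```

The two results are observably different precisely because the aggregate in the SELECT list manufactures a row even for an empty group, which is Tom's point about case 3.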
Re: [HACKERS] fool-toleranced optimizer
Neil Conway wrote: Simon Riggs wrote: Oleg is saying that the optimizer doesn't protect against foolish SQL requests. His query is an example of a foolishly written query. IMHO calling this a foolishly written query is completely arbitrary. I can imagine plenty of applications for which a cartesian join makes sense. In this case the user didn't write the query they meant to write -- but it is surely hopeless to prevent that in the general case :) Sure, but this case, at least, is (hopefully) easily detectable (as such things go), has a high cost when it occurs, and is *usually* not what the user intended. Hence, it makes sense to go ahead and run the query, but issue a warning at the very beginning, e.g. WARNING: query JOINs tables <list of tables> without otherwise referencing or making use of those tables. This may cause excessively poor performance of the query. That said, the real question is whether or not it's worth putting in the effort to detect this condition and issue the warning. I'd say probably not, but if this is a big enough itch for someone then why should we discourage them from coding up a fix? -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
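Kevin's proposed warning amounts to a connectivity check over the FROM list. A hedged sketch of one way such a detector could work, assuming we are handed the list of tables and the table pairs linked by join predicates (this representation is invented for illustration and is not how the PostgreSQL planner models a query):

```python
# Illustrative sketch: flag FROM-list tables that share no join predicate
# with the rest. More than one connected component implies a cartesian
# product somewhere in the plan. Not actual planner code.

def join_groups(tables, join_pairs):
    """Union-find over tables connected by join predicates; returns the
    connected components."""
    parent = {t: t for t in tables}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    for a, b in join_pairs:
        parent[find(a)] = find(b)

    comps = {}
    for t in tables:
        comps.setdefault(find(t), []).append(t)
    return list(comps.values())

# "a JOIN b" with no predicate touching c => two components => warn
groups = join_groups(["a", "b", "c"], [("a", "b")])
if len(groups) > 1:
    print("WARNING: query joins table groups without connecting them:", groups)
```

This also shows why a warning (rather than an error) is the natural default: the check identifies the *shape* of a cartesian join but cannot know whether it was intended.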
Re: [HACKERS] pgpool question
I'm experimenting with pgpool 2.51 on my Linux box running two postgresql backends: pg74:5432 and pg801:5433 I configured pgpool to use pg74:5432 as primary backend and pg801:5433 as second one. Pgpool is running on default port () and I configured my web application to use it, so I could start/stop backends without disturbing client (web browser). When I stop primary backend (pg74:5432) pgpool switched to backend failover from (5432) to (5433) done but when I start primary and stopped secondary backend pgpool never switched back to primary backend as expected ! I see bogus message like: starting failover from (5433) to (5433) What am I doing wrong ? That's an intended behavior. Or at least a side effect of failover design. If we allow unlimited switching between the master and the secondary, pgpool could repeat switching forever if we have an unreliable network or hardware. However it would be easy to modify pgpool to allow automatic switch back (with a risk of unwanted repeating switching, of course). Is this what you want? -- Tatsuo Ishii ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
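Tatsuo's flip-flop worry can be made concrete with a toy state machine. This is an illustrative sketch, not pgpool's actual failover logic; the function and flag names are invented:

```python
# Illustrative sketch (not pgpool code): with a flaky master health
# check, unconditional automatic switch-back ping-pongs between
# backends, while one-way failover switches exactly once and stays put.

def next_backend(current, master_up, auto_switchback):
    if current == "master" and not master_up:
        return "secondary"                 # normal failover
    if current == "secondary" and master_up and auto_switchback:
        return "master"                    # optional automatic switch-back
    return current

def simulate(health_samples, auto_switchback):
    current, switches = "master", 0
    for master_up in health_samples:
        nxt = next_backend(current, master_up, auto_switchback)
        if nxt != current:
            switches += 1
        current = nxt
    return switches

flaky_master = [True, False] * 5           # master flapping up/down
print(simulate(flaky_master, auto_switchback=False))  # 1
print(simulate(flaky_master, auto_switchback=True))   # 9
```

With switch-back disabled the system settles after one failover; with it enabled, every flap of the master triggers another switch, which is exactly the repeated-switching risk Tatsuo describes.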
Re: [HACKERS] could not read, could not write, could not fsync, Windows 2000, PostgreSQL 8.0.1
I encountered a similar problem in make check (win2k-mingw, 8.0.1). The regression test could randomly fail due to a could not write block (Invalid argument) problem or a could not remove file problem. Regards, Qingqing p.s. I believe this could be a potentially serious problem, so I forward it to pgsql-hackers. --- Jean-Pierre Pelletier [EMAIL PROTECTED] We are running PostgreSQL 8.0.1 since last week and have these messages in our PostgreSQL log file: 2005-02-10 10:27:19 FATAL: could not read block 38 of relation 1663/17269/16676: Invalid argument 2005-02-10 10:27:19 FATAL: could not read block 46 of relation 1663/17269/16676: Invalid argument 2005-02-10 10:27:19 FATAL: could not read block 50 of relation 1663/17269/16676: Invalid argument 16676 is table pg_depend 2005-02-14 12:19:46 FATAL: could not read block 7 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 20 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 22 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 14 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 18 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 24 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 8 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 19 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 11 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 21 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 25 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 23 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 13 of relation
1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 9 of relation 1663/17269/1247: Invalid argument 2005-02-14 12:19:46 FATAL: could not read block 12 of relation 1663/17269/1247: Invalid argument 1247 is table pg_type 2005-02-16 10:48:26 ERROR: could not write block 61 of relation 1663/17269/16676: Invalid argument 2005-02-16 10:48:26 CONTEXT: writing block 61 of relation 1663/17269/16676 16676 is table pg_depend 2005-02-16 12:47:03 ERROR: could not write block 3 of relation 1663/17269/1614690: Invalid argument 2005-02-16 12:47:03 CONTEXT: writing block 3 of relation 1663/17269/1614690 We couldn't find what 1614690 is. 2005-02-18 05:32:06 LOG: could not fsync segment 0 of relation 1663/17269/1677179: Permission denied 2005-02-18 05:32:06 ERROR: storage sync failed on magnetic disk: Permission denied ... 2005-02-18 07:58:28 ERROR: storage sync failed on magnetic disk: Permission denied 2005-02-18 07:58:29 LOG: could not fsync segment 0 of relation 1663/17269/1677179: Permission denied These two messages are repeated every seconds for almost 2.5 hours. Again, we couldn't find what 1677179 is. We are on Windows 2000 Server, Service Pack 4 and were successfully running PostgreSQL 7.4.1 before that. We have done a vacuum, analyze and reindex on pg_depend and pg_type and restarted PostgreSQL a few times; we had no problems doing that but the error messages are still there. Is this normal and if not, how do we fix that? Thanks Jean-Pierre Pelletier p.s.: We also have messages FATAL: could not read from statistics collector pipe approx. twice an hour. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] We are not following the spec for HAVING without GROUP BY
I wrote in reference to bug #1528: What the spec actually says, or at least implies, is that a HAVING clause is to be evaluated only once per group --- where the group is the whole table if there's no GROUP BY clause. In fact, reading the spec more closely, it is clear that the presence of HAVING turns the query into a grouped query even if there is no GROUP BY. I quote SQL92 7.8 again:

7.8 <having clause>

Function

Specify a grouped table derived by the elimination of groups from the result of the previously specified clause that do not meet the <search condition>. ...

1) Let T be the result of the preceding <from clause>, <where clause>, or <group by clause>. If that clause is not a <group by clause>, then T consists of a single group and does not have a grouping column.

2) The <search condition> is applied to each group of T. The result of the <having clause> is a grouped table of those groups of T for which the result of the <search condition> is true.

This is quite clear that the output of a HAVING clause is a grouped table no matter whether the query uses GROUP BY or aggregates or not. What that means is that neither the HAVING clause nor the targetlist can use any ungrouped columns except within aggregate calls; that is,

select col from tab having 2>1

is in fact illegal per SQL spec, because col isn't a grouping column (there are no grouping columns in this query). What we are currently doing with this construct is pretending that it means

select col from tab where 2>1

but it does not mean that according to the spec. As I look into this, I find that several warty special cases in the parser and planner arise from our misunderstanding of this point, and could be eliminated if we enforced the spec's interpretation. In particular this whole business of moving HAVING into WHERE is wrong and should go away. Comments? Can anyone confirm whether DB2 or other databases allow ungrouped column references with HAVING?
regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] pgpool question
On Mar 9, 2005, at 7:25 PM, Tatsuo Ishii wrote: That's an intended behavior. Or at least a side effect of failover design. If we allow unlimited switching between the master and the secondary, pgpool could repeat switching forever if we have an unreliable network or hardware. I didn't really think of it that way, I had just expected it to toggle back and forth for some reason. At first I thought it was just me, but apparently Oleg got the same impression as I did. After you explained it, though, it makes sense why someone would want it to work that way. However it would be easy to modify pgpool to allow automatic switch back (with a risk of unwanted repeating switching, of course). Is this what you want? How about making it a switch at run-time? Like --cycle for the automatic fail-over toggling. It seems that there are valid reasons for both options. What makes the most sense to me would be to make the -s switch always be able to switch to the server specified in the command line or toggle between the two if you don't specify either master or secondary. That way an administrator can always have control of which server is being used, then either leave the automatic behavior as is or create a cycle switch in case the user preferred that behavior. -- Jeff Hoffmann [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] We are not following the spec for HAVING without GROUP BY
Tom Lane wrote: What that means is that neither the HAVING clause nor the targetlist can use any ungrouped columns except within aggregate calls; that is, select col from tab having 2>1 is in fact illegal per SQL spec, because col isn't a grouping column (there are no grouping columns in this query). [...] Comments? Can anyone confirm whether DB2 or other databases allow ungrouped column references with HAVING? Oracle does not allow such references. It issues ORA-00979: not a GROUP BY expression when you try to hand it such a reference. MS SQL Server does not allow such references either, yielding columnname is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause. Can't comment about DB2. -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] We are not following the spec for HAVING without GROUP BY
Tom Lane [EMAIL PROTECTED] writes: In particular this whole business of moving HAVING into WHERE is wrong and should go away. It sort of seems like select aggregate(col) from tab with no GROUP BY clause is a bit of a special case. The consistent thing to do would be to return no records. It's only due to the special case that SQL returns a single record for this case. It seems like this special case is the only way to expose this difference between a WHERE clause and a HAVING clause with an aggregate-free expression. It seems like all that's needed is a simple flag on the Aggregate node that says whether to output a single record if there are no input records or to output no records. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
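Greg's proposed flag might look like this in miniature. A hedged Python sketch (the executor's Aggregate node is of course C, and these names are invented for illustration):

```python
# Illustrative sketch of Greg's idea: an Aggregate node carrying a flag
# that says whether empty input still produces one "grand total" row
# (today's behavior for aggregate queries) or produces no rows at all
# (what a false HAVING over an empty group requires). Not executor code.

def agg_node(rows, aggfns, row_for_empty_input):
    if not rows and not row_for_empty_input:
        return []                              # grouped table with no groups
    return [tuple(f(rows) for f in aggfns)]    # one row for the single group

print(agg_node([], [len], row_for_empty_input=True))    # [(0,)]
print(agg_node([], [len], row_for_empty_input=False))   # []
print(agg_node([3, 1, 2], [len, max], True))            # [(3, 3)]
```

As Tom notes in his reply, the catch is that such a flag only helps when an Aggregate node exists at all, which is not the case for an aggregate-free HAVING.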
Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression tests
Nicolai Tufar wrote: Dear all, After struggling for one week to integrate FreeBSD's vfprintf.c into PostgreSQL I finally gave up. It is too dependent on underlying FreeBSD system functions. To incorporate it into PostgreSQL we need to move the vfprintf.c file itself, two dozen files from gdtoa, and half a dozen __XXtoa.c files scattered in apparently random fashion all around the FreeBSD source tree. Instead I researched some other implementations of snprintf on the web released under a license compatible with PostgreSQL's. The most suitable one I have come upon is Trio [http://daniel.haxx.se/projects/trio/]. It is distributed under a MIT-like license which, I think, will be compatible with us. What do you think about it? Shall I abandon FreeBSD and go ahead incorporating Trio? Yes, maybe just add the proper %n$ handling from Trio to what we have now. And by the way, what is the conclusion of the snprintf() vs. pg_snprintf() and UNIX libraries discussion a week ago? Which one shall I implement? I think the proper direction is not to export snprintf() from libpq so that user applications will use the native snprintf, but our code can use our custom version. I will work on a patch now. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
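For readers unfamiliar with the %n$ positional specifiers under discussion (they let translated format strings reorder arguments), one common implementation trick is a rewrite pass that permutes the arguments so a plain formatter can finish the job. A hedged sketch of that idea in Python, under invented names; note that real printf positional rules are stricter (a format must be all-positional or all-plain, and this toy regex covers only simple conversions:

```python
import re

# Illustrative sketch: rewrite "%2$s %1$d"-style positional conversion
# specs into plain "%s %d" specs and permute the argument list to match,
# so an ordinary snprintf-style formatter can handle the rest.

def rewrite_positional(fmt, args):
    order = []

    def repl(m):
        order.append(int(m.group(1)) - 1)   # "%2$s" consumes args[1]
        return "%" + m.group(2)

    plain_fmt = re.sub(r"%(\d+)\$([a-z])", repl, fmt)
    return plain_fmt, tuple(args[i] for i in order)

fmt, perm = rewrite_positional("%2$s: %1$d bytes", (42, "file"))
print(fmt % perm)    # file: 42 bytes
```

The appeal of this approach for the patch being discussed is that the existing non-positional snprintf code can stay untouched; only a preprocessing pass is added.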
Re: [HACKERS] fool-toleranced optimizer
Kevin Brown [EMAIL PROTECTED] writes: Hence, it makes sense to go ahead and run the query, but issue a warning at the very beginning, e.g. WARNING: query JOINs tables <list of tables> without otherwise referencing or making use of those tables. This may cause excessively poor performance of the query. Well the problem with a warning is what if it *is* intentional? It's not ok to fill my logs up with warnings for every time the query is executed. That just forces me to turn off warnings. It would be ok to have an option to block cartesian joins entirely. I might even choose to run with that enabled normally. I can always disable it for queries I know need cartesian joins. But outputting a warning and then continuing on to destroy performance just gets the worst of both worlds. For that matter, I wonder whether it's time to consider an option to disable implicit (ie, pre-ansi join syntax) joins entirely. It seems like lots of shops are likely imposing coding standards that require ansi join syntax anyways. In environments like that you would expect a CROSS JOIN b not just select * from a,b anyways. Shops like that might appreciate the ability to enforce a blanket coding standard on that point and get protection from accidental cartesian joins as a side benefit. -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] We are not following the spec for HAVING without GROUP
Comments? Can anyone confirm whether DB2 or other databases allow ungrouped column references with HAVING? Oracle does not allow such references. It issues ORA-00979: not a GROUP BY expression when you try to hand it such a reference. MS SQL Server does not allow such references either, yielding columnname is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause. Can't comment about DB2. MySQL allows it:

mysql> create table tab (col integer);
Query OK, 0 rows affected (0.01 sec)

mysql> select col from tab having 2 > 1;
Empty set (0.00 sec)

mysql> insert into tab values (1);
Query OK, 1 row affected (0.00 sec)

mysql> select col from tab having 2 > 1;
+------+
| col  |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

Of course, that's not saying much! Chris ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Information schema tweak?
The current _pg_keypositions function generates the numbers from 1 to 32 using a massive union, shouldn't it just use generate_series instead (to make it faster/simpler)? Chris ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] pgpool question
On Thu, 10 Mar 2005, Tatsuo Ishii wrote: I'm experimenting with pgpool 2.51 on my Linux box running two postgresql backends: pg74:5432 and pg801:5433 I configured pgpool to use pg74:5432 as primary backend and pg801:5433 as second one. Pgpool is running on default port () and I configured my web application to use it, so I could start/stop backends without disturbing client (web browser). When I stop primary backend (pg74:5432) pgpool switched to backend failover from (5432) to (5433) done but when I start primary and stopped secondary backend pgpool never switched back to primary backend as expected ! I see bogus message like: starting failover from (5433) to (5433) What am I doing wrong ? That's an intended behavior. Or at least a side effect of failover design. If we allow unlimited switching between the master and the secondary, pgpool could repeat switching forever if we have an unreliable network or hardware. You may recognize manual action of DBA, that is why '-s m switch' is needed ! However it would be easy to modify pgpool to allow automatic switch back (with a risk of unwanted repeating switching, of course). Is this what you want? No, your arguments are important. I don't want too much intelligence, but when I restart the primary backend by hand I don't see any problem to switch pgpool back using the -s m switch (also by hand). For the brave, probably, something like --auto would be fine. -- Tatsuo Ishii Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] pgpool question
On Thu, 10 Mar 2005, Tatsuo Ishii wrote: I'm experimenting with pgpool 2.51 on my Linux box running two postgresql backends: pg74:5432 and pg801:5433 I configured pgpool to use pg74:5432 as primary backend and pg801:5433 as second one. Pgpool is running on default port () and I configured my web application to use it, so I could start/stop backends without disturbing client (web browser). When I stop primary backend (pg74:5432) pgpool switched to backend failover from (5432) to (5433) done but when I start primary and stopped secondary backend pgpool never switched back to primary backend as expected ! I see bogus message like: starting failover from (5433) to (5433) What am I doing wrong ? That's an intended behavior. Or at least a side effect of failover design. If we allow unlimited switching between the master and the secondary, pgpool could repeat switching forever if we have an unreliable network or hardware. You may recognize manual action of DBA, that is why '-s m switch' is needed ! However it would be easy to modify pgpool to allow automatic switch back (with a risk of unwanted repeating switching, of course). Is this what you want? No, your arguments are important. I don't want too much intelligence, but when I restart the primary backend by hand I don't see any problem to switch pgpool back using the -s m switch (also by hand). For the brave, probably, something like --auto would be fine. Problem is, the communication infrastructure between the pgpool master process and another pgpool process which you started by hand to switch is very poor. Currently a signal is used to tell the pgpool master process to switch/fail over. Apparently we need more info (for example, that the signal was sent by the DBA, not because of a failure detected by a pgpool child process). To fix the problem we need shared memory, a pipe, a socket or whatever to pass info, and it will not be a trivial fix so will take some time... -- Tatsuo Ishii ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [HACKERS] pgpool question
On Thu, 10 Mar 2005, Tatsuo Ishii wrote: However it would be easy to modify pgpool to allow automatic switch back (with a risk of unwanted repeating switching, of course). Is this what you want? No, your arguments are important. I don't want too much intelligence, but when I restart the primary backend by hand I don't see any problem to switch pgpool back using the -s m switch (also by hand). For the brave, probably, something like --auto would be fine. Problem is, the communication infrastructure between the pgpool master process and another pgpool process which you started by hand to switch is very poor. Currently a signal is used to tell the pgpool master process to switch/fail over. Apparently we need more info (for example, that the signal was sent by the DBA, not because of a failure detected by a pgpool child process). To fix the problem we need shared memory, a pipe, a socket or whatever to pass info, and it will not be a trivial fix so will take some time... maybe a telnet interface? -- Tatsuo Ishii Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] We are not following the spec for HAVING without GROUP
Christopher Kings-Lynne [EMAIL PROTECTED] writes: Comments? Can anyone confirm whether DB2 or other databases allow ungrouped column references with HAVING? MySQL allows it: A slightly tighter experiment shows that they treat HAVING like WHERE in this case:

mysql> create table tab(col int);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into tab values(1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into tab values(2);
Query OK, 1 row affected (0.01 sec)

mysql> select col from tab having col > 1;
+------+
| col  |
+------+
|    2 |
+------+
1 row in set (0.00 sec)

I think it's fairly likely that they copied our misinterpretation ... regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Information schema tweak?
Christopher Kings-Lynne [EMAIL PROTECTED] writes: The current _pg_keypositions function generates the numbers from 1 to 32 using a massive union, Only for very small values of current ... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] We are not following the spec for HAVING without GROUP BY
Greg Stark [EMAIL PROTECTED] writes: It sort of seems like select aggregate(col) from tab with no GROUP BY clause is a bit of a special case. The consistent thing to do would be to return no records. I don't think so. SQL99 defines this stuff in a way that might make you feel better: it says that the presence of either HAVING or any aggregate functions in the target list implies GROUP BY (), which is the case that they identify as grand total in the group by clause syntax. Basically this legitimizes the concept of turning the whole input table into one group, which is what's really going on here. We get this right in the case where it's driven by the appearance of aggregate functions, but not when it's just driven by HAVING. It seems like all that's needed is a simple flag on the Aggregate node that says whether to output a single record if there are no input records or to output no records. The implementation problem is that there *is* no aggregate node if there are no aggregates. The definitional problem is that we are allowing cases that are illegal per spec and are going to be difficult to continue to support if we handle all the spec-required cases properly. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] We are not following the spec for HAVING without GROUP
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Tom Lane Sent: Wednesday, March 09, 2005 8:45 PM To: Christopher Kings-Lynne Cc: Kevin Brown; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] We are not following the spec for HAVING without GROUP Christopher Kings-Lynne [EMAIL PROTECTED] writes: Comments? Can anyone confirm whether DB2 or other databases allow ungrouped column references with HAVING? DB2 does not like it. This runs and returns data:

SELECT INFO5FILES.APAMT.DEBAMT FROM INFO5FILES.APAMT

SELECT statement run complete. This fails to prepare:

SELECT INFO5FILES.APAMT.DEBAMT FROM INFO5FILES.APAMT having 1 > 2

Column DEBAMT or function specified in SELECT list not valid. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Information schema tweak?
Only for very small values of current ... Argh! Forgot it was a 7.4 server :) Oops. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] NIST Test Suite
Are we able to run more NIST tests now? http://www.itl.nist.gov/div897/ctg/sql_form.htm Chris ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] NIST Test Suite
Christopher Kings-Lynne [EMAIL PROTECTED] writes: Are we able to run more NIST tests now? http://www.itl.nist.gov/div897/ctg/sql_form.htm I thought we'd extracted all the interesting juice from the NIST tests a couple years ago. Specifically I recall this fix came out of NIST testing done by Red Hat: 2003-06-06 11:04 tgl Implement outer-level aggregates to conform to the SQL spec, with extensions to support our historical behavior. An aggregate belongs to the closest query level of any of the variables in its argument, or the current query level if there are no variables (e.g., COUNT(*)). The implementation involves adding an agglevelsup field to Aggref, and treating outer aggregates like outer variables at planning time. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] NIST Test Suite
Christopher Kings-Lynne [EMAIL PROTECTED] writes: I thought we'd extracted all the interesting juice from the NIST tests a couple years ago. I was just chatting with Neil C on IRC and he mentioned that back when they were using it at RedHat, PostgreSQL didn't have schemas so most stuff failed. I think he's remembering an earlier iteration. The mid-2003 patch I cited is well past 7.3's release date, so it was surely inspired by testing with a schema-aware PG. My own recollection is that the other stuff we found in that test cycle had to do with locale/collation features (eg per-column collation options), and so there's no point in retesting until some progress is made on that front ... regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] NIST Test Suite
I thought we'd extracted all the interesting juice from the NIST tests a couple years ago. Specifically I recall this fix came out of NIST testing done by Red Hat: 2003-06-06 11:04 tgl Implement outer-level aggregates to conform to the SQL spec, with extensions to support our historical behavior. An aggregate belongs to the closest query level of any of the variables in its argument, or the current query level if there are no variables (e.g., COUNT(*)). The implementation involves adding an agglevelsup field to Aggref, and treating outer aggregates like outer variables at planning time. I was just chatting with Neil C on IRC and he mentioned that back when they were using it at RedHat, PostgreSQL didn't have schemas so most stuff failed. Was just wondering if that's true and we need to re-run them or something. Chris ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster