Re: [HACKERS] Patch for Win32 blocking problem
time. I find the proposed patch in pgwin32_waitforsinglesocket to be a pretty ugly kluge though. Are you sure it's needed given the other fix? Loop in pgwin32_send() doesn't prevent from infinite sleeping in WaitForMultipleObjectEx in pgwin32_waitforsinglesocket. I'm not a Windows guru at all, and I'm not like that part of patch too. I can't find better solution... May be that way (untested): if ( isUDP && (what & FP_WRITE) ) for(;;) { r = WaitForMultipleObjects(100 ms); if ( r == WAIT_TIMEOUT ) { r == WSASend( sero packet ); /* see comments in pgwin32_select() */ [ analyze result of WSASend: * if success then return 1 * WSAEWOULDBLOCK - continue loop * SOCKET_ERROR - return 0 ] } else break; } I'm not sure that is more clean way... -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] GROUP BY on a large table -- an idea
Recently I've been playing with quite a big table (over 50mln rows), and did some SELECT ... sum(...) WHERE ... GROUP BY ... queries. The usual plan for these is to sort the entries according to GROUP BY specification, then to run aggregates one by one. If the data to be sorted is large enough, PostgreSQL has no other option than to spill to disk, which well, Isn't the fastest... Then I thought, why not skip the sorting, and do something like this, say a table is: kind tetx, sumkind text, cnt int, size int foo, bar, 2, 10 blah, argh, 23, 3 foo, baz, 1, 20 blah, argh, 23, 3 and the query would be: SELECT kind,subkind,sum(cnt),sum(size) FROM x GROUP BY kind,subkind; Instead of sorting, we would create an empty temporary state variable tree, looked up "foo, bar" in that tree -- if not found, enter there a new initialized state variables for sum(cnt) and sum(size). looked up blah, argh -- create the state variables looked up foo, baz -- create the state variables looked up blah,argh -- update the state variables there. And finally dump the whole tree as results of our query: foo, bar, 2, 10 foo, baz, 1, 20 blah, argh, 46,6 Of course first thing you'll notice is that the "looking up" part will probably eat all benefits from not spilling, and if group by columns have large cardinality we'd have to spill anyway. But then I thought, maybe a hybrid approach could be benefitial, and its' the resason I'm sending this message. The hybrid approach means: sort as much as you can without spilling to disk, then aggregate and store aggregate state variables in safe place (like a "tree" above), get more tuples from the table, sort them, update aggregate state variables, lather, rince, repeat. This should avoid the need to spill to disk. The cost of such operation depends on cardinality of GROUP BY part (and their correlation, doh), so it might be wise to try this approach for promising data only. I have yet almost no knowledge od PostgreSQL's internals, but I think the idea is feasible therefore I post it here. If it's been proposed before, forgive me. Regards, Dawid ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Hints WAS: Index Tuning Features
On Wed, 2006-10-11 at 19:18 -0400, Mark Woodward wrote: > > > > Since you're the one who wants hints, that's kind of up to you to define. > > Write a specification and make a proposal. > > > > What is the point of writing a proposal if there is a threat of "will be > rejected" if one of the people who would do the rejection doesn't at least > outline what would be acceptable? The general theme of other discussions has been that the best approach is to provide additional information in a general declarative form. Further details on that have not yet been proposed. A hint touches a single SQL statement, so decorating 1000s of statements with exact tips about what to do is both time consuming and eventually inaccurate. Yet after all that work, the planner still doesn't know why you thought the hint was the right thing to do and so the 1001st query will perform poorly. AFAICS hints are a legacy code compatibility issue, not something truly desirable in the long run. Once you introduce them you must honour them across 10+ years of releases and then you remove any chance of improved optimisations speeding up applications in the future. Support for such tricks is possibly a different issue from encouraging their use; if we did support them I would welcome the day when enable_hints = off is the default and would discourage their general use where possible. We may be following other products in some ways, so that gives us an opportunity to learn from both the useful lessons and the mistakes. Deciding which is which is the hard part, IMHO. The *right* place, IMHO, for planner information is to decorate the tables, columns and relationships so that *every* SQL statement can pick that up. If the world changes, you make one change and all your SQL benefits. As the analyzers improve, you may be able to just remove those declarations entirely but generally I imagine the DB designer will for many years know things that cannot be determined by an analyzer. Some might say this is a EndUserDeveloper v DBA v InternalsHacker issue and I might agree, but would side with the DBAs on this. I'm not aware of any research specifically in that area - though I know many proposals have been made for various kinds of learning optimizer. Thats dandy, but you'll still need an infrastructure to support what has been learned and use it to override the more general analyzer info. So a manual declarative approach seems like the first step in that direction. So, I'm interested to hear various possible declarative approaches and will assist where I can with that. /*+ we might be able to use some special functions to do this, rather than more SQL */ -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] GROUP BY on a large table -- an idea
On Thu, Oct 12, 2006 at 09:52:11AM +0200, Dawid Kuroczko wrote: > Recently I've been playing with quite a big table (over 50mln rows), > and did some SELECT ... sum(...) WHERE ... GROUP BY ... queries. > > The usual plan for these is to sort the entries according to GROUP BY > specification, then to run aggregates one by one. If the data to be > sorted is large enough, PostgreSQL has no other option than to spill > to disk, which well, Isn't the fastest... Sounds an awful lot like the HashAggregate nodetype which has existed since at least 7.4. It has a hashtable of "keys" with attached "states". Hope this helps, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Patch for Win32 blocking problem
Attached patch implements that idea. May be that way (untested): if ( isUDP && (what & FP_WRITE) ) for(;;) { r = WaitForMultipleObjects(100 ms); if ( r == WAIT_TIMEOUT ) { r == WSASend( sero packet ); /* see comments in pgwin32_select() */ [ analyze result of WSASend: * if success then return 1 * WSAEWOULDBLOCK - continue loop * SOCKET_ERROR - return 0 ] } else break; } -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ win32_1.patch.gz Description: Unix tar archive ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Hints WAS: Index Tuning Features
> Clinging to sanity, [EMAIL PROTECTED] ("Mark Woodward") mumbled into > her beard: >> What is the point of writing a proposal if there is a threat of >> "will be rejected" if one of the people who would do the rejection >> doesn't at least outline what would be acceptable? > > If your proposal is merely "let's do something like Oracle," it should > be obvious why that would be rejected. There is considerable legal > danger to slavish emulation. Further, since PostgreSQL isn't Oracle, > slavish emulation wouldn't work anyways. I don't actually like Oracle's hinting system. > > If a proposal is too fuzzy to be considered a source of a > specification, it should be obvious that that would be rejected. Well, "fuzzy" isn't a bad starting place to start gathering information for an eventual proposal. > > If you have an idea clear enough to turn into a meaningful proposal, > put it in for the usual "to and fro"; that generally leads to enormous > improvements. Absolutely. > > I'm not sure what a good hinting system ought to look like; what I > *do* know is that a fuzzy proposal won't be much good. That is sort of the stopping block. None of us "know" what it should look like, but leaving the topic as "if you want it, go do the work and submit a patch." Isn't going to get it done. First we should decide if it is, in fact, something that ought to happen, then if that happens, we should think about what it should be. Again, what would be the point of writing a proposal if there is *no* concensus on what would be acceptible? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] On status data and summaries
On Wed, Oct 11, 2006 at 06:26:50PM -0400, Bruce Momjian wrote: > > Funny, sounds like what I usually do. I welcome the assistance. Well, yes, that was my impression too. The complaint in the thread that started all this, as I understood it, was that there were big, hairy features that tended to have long discussions about them, and very few people among even the committers seemed to have a clear idea of exactly where things stood at the end of coding. But I take Jim Nasby's point, that the request for monitoring isn't going to come. How about an alternative: _you_ delegate threads/features/whatever to me to watch? Would that help? (I don't care how we do it, so long as it would be helpful and so long as it's wanted.) A -- Andrew Sullivan | [EMAIL PROTECTED] Unfortunately reformatting the Internet is a little more painful than reformatting your hard drive when it gets out of whack. --Scott Morris ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Hints (was: [HACKERS] Index Tuning Features)
On Wed, Oct 11, 2006 at 03:08:42PM -0700, Ron Mayer wrote: > Is one example is the table of addresses clustered by zip-code > and indexes on State, City, County, etc? No. > Now I'm not saying that a more advanced statistics system > couldn't one-day be written that sees these patterns in the > data -- but it doesn't seem likely in the near term. DBA-based > hints could be a useful interim work-around. Some others in the hints thread seem to be suggesting additional ways of teaching the optimiser what to do. _That_ seems to me to be a good idea (but I don't think that qualifies as what people usually think of as hints). A sufficiently general system of hints sprinkled on the SQL is a lot of work, and doesn't seem to me to be a whole lot easier than working out how to make second-order relationship discovery (of the sort you're talking about) cheaper and automatic. Certainly, there's plenty of statistics math kicking around that allows one to discover such relationships, and they have the benefit of not being by definition a way to work around the optimiser. A -- Andrew Sullivan | [EMAIL PROTECTED] Users never remark, "Wow, this software may be buggy and hard to use, but at least there is a lot of code underneath." --Damien Katz ---(end of broadcast)--- TIP 6: explain analyze is your friend
Hints (Was: [HACKERS] Index Tuning Features)
On Thu, Oct 12, 2006 at 08:34:45AM +0200, Florian Weimer wrote: > > Some statistics are very hard to gather from a sample, e.g. the number > of distinct values in a column. Then how can the DBA know it, either? The problem with this sort of argument is always that people are claiming some special knowledge is available to the DBA. If it's true that the DBA really _can_ know this stuff, then there must be some way to learn it. Which means that you can, in principle, figure out ways to communicate that to the optimizer. I like the suggestion, though, that there be ways to codify known relationships in the system in such a way that the optimizer can learn to use that information. _That_ seems to me to be a big improvement, because it can be taken into consideration along with relationships that emerge from the statistics, that the DBA may not know about. A -- Andrew Sullivan | [EMAIL PROTECTED] In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism. --Brad Holland ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] create temp table .. on commit delete rows
1) # create temp table a ( a int ) without oids on commit delete rows; # insert into a values(1); # begin; # insert into a values(2); # commit; # select * from a; a --- (0 rows) 2) # insert into a values(1); # begin; # insert into a values(2); # rollback; # select * from a; a --- (0 rows) It seems to me that 1) is good, but 2) makes some strange, unpredictable result... -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] create temp table .. on commit delete rows
Teodor Sigaev wrote: 2) # insert into a values(1); You're running in auto-commit, mode. An implicit commit happens after this statement. Which clears the table. # begin; # insert into a values(2); # rollback; # select * from a; a --- (0 rows) It seems to me that 1) is good, but 2) makes some strange, unpredictable result... Looks right to me. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] create temp table .. on commit delete rows
You're running in auto-commit, mode. An implicit commit happens after this statement. Which clears the table. > Looks right to me. Oops, I see -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Hints WAS: Index Tuning Features
Simon Riggs <[EMAIL PROTECTED]> writes: > The *right* place, IMHO, for planner information is to decorate the > tables, columns and relationships so that *every* SQL statement can pick > that up. If the world changes, you make one change and all your SQL > benefits. As the analyzers improve, you may be able to just remove those > declarations entirely but generally I imagine the DB designer will for > many years know things that cannot be determined by an analyzer. Not to say this isn't a good idea -- i think it's a great idea. But note that it doesn't solve some of the use cases of hints. Consider something like: WHERE NOT radius_authenticate(suspected_hacker) or WHERE NOT verify_pk_signature(document_we_have_no_reason_to_doubt) There's no way you can decorate the radius_authenticate or verify_pk_signature functions with any hint that would know when you're using it on a value you expect it to fail or succeed on. In some cases you pass data you expect to succeed 99.9% of the time and in others data you expect to fail. Only the author of the query knows what kind of value he's passing and how selective the resulting expression is. And while people seem to be worried about OLTP queries this is one area where I actually think of DSS queries first. OLTP queries run usually relatively simple and get optimized well. Also OLTP queries only have to be "fast enough", not optimal. So the planner usually does anm adequate job. DSS queries are often dozens of lines of plan -- this is where enable_* is insufficient to test the query and it's where the planner often goes wrong. And it's where an incremental speed difference can make a big difference with a report that takes 8 hours or 4 hours. Often these queries are ad-hoc queries that never will be run again anyways. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Hints WAS: Index Tuning Features
On Thu, Oct 12, 2006 at 08:50:04AM -0400, Greg Stark wrote: > Not to say this isn't a good idea -- i think it's a great idea. But note that > it doesn't solve some of the use cases of hints. Consider something like: > > WHERE NOT radius_authenticate(suspected_hacker) > > or > > WHERE NOT verify_pk_signature(document_we_have_no_reason_to_doubt) We currently construct histograms for data in columns, there's no particular reason why we can't do the same for functions. In a similar vein, I don't see a reason why you couldn't enable a stats-gathering mode where function calls would be instrumented to collect information about: - time of execution - distribution of outputs Which could then be used by the planner. Or more directly: CREATE HISTOGRAM FOR FUNCTION verify_pk_signature(documenent) AS ( true = 99, false = 1 ); (Perhaps DECLARE is the better phrase?). Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] On status data and summaries
On Thu, Oct 12, 2006 at 07:14:36AM -0400, Andrew Sullivan wrote: > On Wed, Oct 11, 2006 at 06:26:50PM -0400, Bruce Momjian wrote: > > > > Funny, sounds like what I usually do. I welcome the assistance. > > Well, yes, that was my impression too. The complaint in the thread > that started all this, as I understood it, was that there were big, > hairy features that tended to have long discussions about them, and > very few people among even the committers seemed to have a clear idea > of exactly where things stood at the end of coding. Something else that would be helpful is summarizing discussions that don't result in code (perhaps on the developer wiki). That way if someone wants to see the history of something they don't have to wade through the list archives just to have some idea of what's being talked about. This is probably especially important when the discussion results in some design ideas/proposals but never moves forward from there. > But I take Jim Nasby's point, that the request for monitoring isn't > going to come. How about an alternative: _you_ delegate > threads/features/whatever to me to watch? Would that help? (I don't > care how we do it, so long as it would be helpful and so long as it's > wanted.) I'd be happy to help as well. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Hints WAS: Index Tuning Features
Mark Woodward wrote: > > Exactly. IMHO, it is a frustrating environment. PostgreSQL is a great > system, and while I completely respect the individuals involved, I think > the "management" for lack of a better term, is difficult. 'course you're welcome to fork the project as well if your style and/or priorities are different than the postgresql core team's. If your approach is that much less frustrating, your project would gain that much more momentum from developers joining you. If more developers like your style and/or priorities, they'll migrate to your project. I think Bizgres, Mammoth, EnterpriseDB and RedHat DB and Gentoo's-occasional-bizzaro-patches are both proofs that it can work as well as proofs that it's difficult. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Hints WAS: Index Tuning Features
On Thu, 2006-10-12 at 15:06 +0200, Martijn van Oosterhout wrote: > On Thu, Oct 12, 2006 at 08:50:04AM -0400, Greg Stark wrote: > > Not to say this isn't a good idea -- i think it's a great idea. But note > > that > > it doesn't solve some of the use cases of hints. Consider something like: > > > > WHERE NOT radius_authenticate(suspected_hacker) > > > > or > > > > WHERE NOT verify_pk_signature(document_we_have_no_reason_to_doubt) > > We currently construct histograms for data in columns, there's no > particular reason why we can't do the same for functions. In a similar > vein, I don't see a reason why you couldn't enable a stats-gathering > mode where function calls would be instrumented to collect information > about: > > - time of execution > - distribution of outputs > > Which could then be used by the planner. Or more directly: > > CREATE HISTOGRAM FOR FUNCTION verify_pk_signature(documenent) > AS ( true = 99, false = 1 ); > > (Perhaps DECLARE is the better phrase?). The CREATE OPERATOR command already has a RESTRICT=res_proc clause which provides the ability to attach selectivity functions onto an operator. So this is already possible if you turn radius_authenticate() into an operator. The function parameters are passed to the selectivity function, so you can use that to steer the selectivity. Perhaps this should be allowed on the CREATE FUNCTION command when a procedure returns boolean. Greg is right though, there are some times when the default selectivity won't match what we know to be the case. His example of a function which might normally be expected to return 99.9% true being used to evaluate a list of suspected attempts where the return might well be 20% true is a good one. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Subject: problem with using O_DIRECT
On Thu, Oct 12, 2006 at 12:19:07AM -0400, Ye Qin wrote: > I tried to use O_DIRECT on Linux (SuSe) Kernel 2.6, but failed to make it > run. > For example, if I added the option in the "open" of BasicOpenFile(), > I got the following error after typing "psql -l", > > psql: could not connect to server: Connection refused > Is the server running locally and accepting > connections on Unix domain socket "/tmp/.s.PGSQL.5432"? That only shows that the server's not running, which doesn't tell us much. Are there errors in the server log? Did it dump core? -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Database Auditing
On 10/12/06, Marco Serantoni <[EMAIL PROTECTED]> wrote: >> I'm evaluating of use postgresql but for local law requirements is >> needed for the access of some kind of data (sensitive) a log of the >> accesses (Auditing) is a feature available in many databases but i've >> seen that lacks in PostgreSQL, there are already plans to implement it >> or patches already submitted ? >> If not both could someone give me some hints on how do it > we can probably come up with something. can you please give specific > requirements about what type of information you have to keep track of? username, date and statement executed and optionally the IP. [moving this discussion to -general] please direct responses to that list only] have you looked at postgreql.conf? you can log all of those things and much more into the postgresql log. There is also built in rotation and retention policies. most especially, log_statement = 'all' now for fancy stuff, like logging of application data or other things like that, you can do many things with triggers. It is possible to write triggers that are pretty generic which can do that type of thing. merlin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Hints WAS: Index Tuning Features
On Thu, Oct 12, 2006 at 02:25:29PM +0100, Simon Riggs wrote: > The CREATE OPERATOR command already has a RESTRICT=res_proc clause which > provides the ability to attach selectivity functions onto an operator. > > So this is already possible if you turn radius_authenticate() into an > operator. The function parameters are passed to the selectivity > function, so you can use that to steer the selectivity. > > Perhaps this should be allowed on the CREATE FUNCTION command when a > procedure returns boolean. Why limit it to booleans? For many functions you can get a reasonable estimate of the resulting data by feeding the keys of the histogram through the function. If you know how the data in "field" is distributed, you can take a good guess at the distribution of upper(field). > Greg is right though, there are some times when the default selectivity > won't match what we know to be the case. His example of a function which > might normally be expected to return 99.9% true being used to evaluate a > list of suspected attempts where the return might well be 20% true is a > good one. In the extreme case you could drop the histogram in a transaction, but I can see use-case for declaring a histogram for the current session only, or even having profile to select from. I don't think annotating the query itself is a particularly good idea. The hard part is stoing the histograms and getting the planner to use them, once that happens the really is trivial. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] On status data and summaries
Jim C. Nasby wrote: > On Thu, Oct 12, 2006 at 07:14:36AM -0400, Andrew Sullivan wrote: > > On Wed, Oct 11, 2006 at 06:26:50PM -0400, Bruce Momjian wrote: > > > > > > Funny, sounds like what I usually do. I welcome the assistance. > > > > Well, yes, that was my impression too. The complaint in the thread > > that started all this, as I understood it, was that there were big, > > hairy features that tended to have long discussions about them, and > > very few people among even the committers seemed to have a clear idea > > of exactly where things stood at the end of coding. > > Something else that would be helpful is summarizing discussions that > don't result in code (perhaps on the developer wiki). That way if > someone wants to see the history of something they don't have to wade > through the list archives just to have some idea of what's being talked > about. This is probably especially important when the discussion results > in some design ideas/proposals but never moves forward from there. What I started to do for this is to add the thread URL to the TODO item it relates to. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] On status data and summaries
Andrew Sullivan wrote: > On Wed, Oct 11, 2006 at 06:26:50PM -0400, Bruce Momjian wrote: > > > > Funny, sounds like what I usually do. I welcome the assistance. > > Well, yes, that was my impression too. The complaint in the thread > that started all this, as I understood it, was that there were big, > hairy features that tended to have long discussions about them, and > very few people among even the committers seemed to have a clear idea > of exactly where things stood at the end of coding. Yep. I think Tom and I have a clear picture, but we aren't make it visible enough, I guess. One idea I had was to either create a web page or add to the top of the TODO items that are currently being worked on. > But I take Jim Nasby's point, that the request for monitoring isn't > going to come. How about an alternative: _you_ delegate > threads/features/whatever to me to watch? Would that help? (I don't > care how we do it, so long as it would be helpful and so long as it's > wanted.) I do think we need a structure for this to be valuable. We can perhaps use a wiki to track open development items, with some status, like I did for the open items list for 8.2. I usually only do that during feature freeze, but could expand it and open it up for others to edit. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Replication documentation
FYI, I have started working on a replication section for our 8.2 documentation. I will post a draft copy as soon as I finish. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] warning: "min" redefined of qsort.
"Hiroshi Saito" <[EMAIL PROTECTED]> writes: > I have warning with MinGW > qsort.c:53:1: warning: "min" redefined I've fixed this by using Min() from c.h instead. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] warning: "min" redefined of qsort.
Hi Tom-san. From: "Tom Lane" "Hiroshi Saito" <[EMAIL PROTECTED]> writes: I have warning with MinGW qsort.c:53:1: warning: "min" redefined I've fixed this by using Min() from c.h instead. Ahh, I was consideration shortage. Thanks!! Regards, Hiroshi Saito ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Getting the type Oid in a CREATE TYPE output function ..
I'm trying to create a few new types, and based on the type in/out functions will operate a bit differently. For the input function finding the type Oid is easy - Oid our_type_oid = PG_GETARG_OID(1); For output though I'm having difficulty finding out the type Oid. I've tried using getBaseType, get_rel_type_id, and get_typ_typrelid. Maybe I'm using the options wrong? Or not looking in the right place? I'm only interested in getting it working on 8.1 or greater, so if it changes in older versions I'm not as concerned. Any help is appreciated. Weslee ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
While investigating Merlin's bug report here: http://archives.postgresql.org/pgsql-general/2006-10/msg00447.php I realized that we've completely failed to consider the interactions of $subject. In particular, functions.c still thinks that SELECT is the only type of query that can return rows. ISTM that ideally, a query with RETURNING ought to act like a SELECT for the purposes of a SQL function --- to wit, that the result rows are discarded if it's not the last query in the function, and are returned as the function result if it is. The difficulty with this is that unlike SELECT, a query with RETURNING might be queueing up AFTER triggers, which we shouldn't fire until the query is fully executed. Merlin's report shows that we've already got a problem in the back branches with mishandling of after-trigger state, because we push an AfterTrigger stack level at start of an SQL function command, and then are willing to return from the function with that stack level still active if it's a set-returning function. I think we can fix this in the back branches by the expedient of not pushing a stack level (ie, not calling AfterTriggerBegin/EndQuery) unless it's a non-SELECT command --- SELECT will never queue triggers, and we never return partway through a non-SELECT command. But this falls down for RETURNING queries. I thought about fixing this by extending the AfterTrigger state structure to let it be a tree rather than just a stack, ie, we could temporarily pop the function AfterTrigger status entry without executing any queued triggers, and then push it back on when re-entering the function. This seems horribly messy however, and I'm not sure we could still promise unsurprising order of trigger execution in complicated cases. I think the most promising answer may be to push RETURNING rows into a TupleStore and then read them out from there, which is pretty much the same approach we adopted for RETURNING queries inside portals. This'd allow the query to be executed completely, and its triggers fired, before we return from the SQL function. Comments? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Hints WAS: Index Tuning Features
Mark, That is sort of the stopping block. None of us "know" what it should look like, but leaving the topic as "if you want it, go do the work and submit a patch." Isn't going to get it done. First we should decide if it is, in fact, something that ought to happen, then if that happens, we should think about what it should be. Well, that's what the *rest* of us are doing on the two threads ... here, and "Simple Join Optimized Badly" on performance. You're the only one who seems to want others to do the specification work for him. Start making suggestions, and stop criticizing the process. And, to give you a starting point: the discussion has morphed into: "What manual ways can we come up with for the DBA to influence the planner and fix planner "bugs" which won't have the fragility of query-based hints ala Oracle?" --Josh Berkus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Getting the type Oid in a CREATE TYPE output function ..
Weslee Bilodeau <[EMAIL PROTECTED]> writes: > I'm trying to create a few new types, and based on the type in/out > functions will operate a bit differently. > For the input function finding the type Oid is easy - > Oid our_type_oid = PG_GETARG_OID(1); > For output though I'm having difficulty finding out the type Oid. You can't, and if you could, relying on it would be a security hole in your function (somebody could invoke the function manually and pass it a false OID value). You have to put everything you need to know right into the Datum. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PERFORM] Hints proposal
[ This is off-topic for -performance, please continue the thread in -hackers ] "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > These hints would outright force the planner to do things a certain way. > ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */ This proposal seems to deliberately ignore every point that has been made *against* doing things that way. It doesn't separate the hints from the queries, it doesn't focus on fixing the statistical or cost misestimates that are at the heart of the issue, and it takes no account of the problem of hints being obsoleted by system improvements. > It would also be useful to allow tweaking of planner cost estimates. > This would take the general form of > node operator value This is at least focusing on the right sort of thing, although I still find it completely misguided to be attaching hints like this to individual queries. What I would like to see is information *stored in a system catalog* that affects the planner's cost estimates. As an example, the DBA might know that a particular table is touched sufficiently often that it's likely to remain RAM-resident, in which case reducing the page fetch cost estimates for just that table would make sense. (BTW, this is something the planner could in principle know, but we're unlikely to do it anytime soon, for a number of reasons including a desire for plan stability.) The other general category of thing I think we need is a way to override selectivity estimates for particular forms of WHERE clauses. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PERFORM] Hints proposal
On 10/12/06, Tom Lane <[EMAIL PROTECTED]> wrote: [ This is off-topic for -performance, please continue the thread in -hackers ] This proposal seems to deliberately ignore every point that has been made *against* doing things that way. It doesn't separate the hints from the queries, it doesn't focus on fixing the statistical or cost misestimates that are at the heart of the issue, and it takes no account of the problem of hints being obsoleted by system improvements. what about extending the domain system so that we can put in ranges that override the statistics or (imo much more importantly) provide information when the planner would have to restort to a guess. my case for this is prepared statements with a parameterized limit clause. prepare foo(l int) as select * from bar limit $1; maybe: create domain foo_lmt as int hint 1; -- probably needs to be fleshed out prepare foo(l foolmt) as select * from bar limit $1; this says: "if you have to guess me, please use this" what I like about this over previous attempts to persuade you is the grammar changes are localized and also imo future proofed. planner can ignore the hints if they are not appropriate for the oparation. merlin ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PERFORM] Hints proposal
On Thu, Oct 12, 2006 at 11:42:32AM -0400, Tom Lane wrote: > [ This is off-topic for -performance, please continue the thread in > -hackers ] > > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > These hints would outright force the planner to do things a certain way. > > ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */ > > This proposal seems to deliberately ignore every point that has been > made *against* doing things that way. It doesn't separate the hints > from the queries, it doesn't focus on fixing the statistical or cost > misestimates that are at the heart of the issue, and it takes no account > of the problem of hints being obsoleted by system improvements. Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See also my comment below. > > It would also be useful to allow tweaking of planner cost estimates. > > This would take the general form of > > node operator value > > This is at least focusing on the right sort of thing, although I still > find it completely misguided to be attaching hints like this to > individual queries. Yes, but as I mentioned the idea here was to come up with something that is (hopefully) easy to define and implement. In other words, something that should be doable for 8.3. Because this proposal essentially amounts to limiting plans the planner will consider and tweaking it's cost estimates, I'm hoping that it should be (relatively) easy to implement. > What I would like to see is information *stored in a system catalog* > that affects the planner's cost estimates. As an example, the DBA might > know that a particular table is touched sufficiently often that it's > likely to remain RAM-resident, in which case reducing the page fetch > cost estimates for just that table would make sense. (BTW, this is > something the planner could in principle know, but we're unlikely to > do it anytime soon, for a number of reasons including a desire for plan > stability.) All this stuff is great and I would love to see it! But this is all so abstract that I'm doubtful this could make it into 8.4, let alone 8.3. Especially if we want a comprehensive system that will handle most/all cases. I don't know if we even have a list of all the cases we need to handle. > The other general category of thing I think we need is a > way to override selectivity estimates for particular forms of WHERE > clauses. I hadn't thought about that for hints, but it would be a good addition. I think the stats-tweaking model would work, but we'd probably want to allow "=" as well (which could go into the other stats tweaking hints as well). ... WHERE a = b /* SELECTIVITY {+|-|*|/|=} value */ -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PERFORM] Hints proposal
OK, I just have to comment... "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > These hints would outright force the planner to do things a certain way. > > ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */ > > This proposal seems to deliberately ignore every point that has been > made *against* doing things that way. It doesn't separate the hints > from the queries, it doesn't focus on fixing the statistical or cost > misestimates that are at the heart of the issue, and it takes no account > of the problem of hints being obsoleted by system improvements. But whatever arguments you made about planner improvements and the like, it will NEVER be possible to correctly estimate in all cases the statistics for a query, even if you perfectly know WHAT statistics you need, which is also not the case all the time. Tom, you're the one who knows best how the planner works... can you bet anything you care about on the fact that one day the planner will never ever generate a catastrophic plan without DBA tweaking ? And how far in time we'll get to that point ? Until that point is achieved, the above proposal is one of the simplest to understand for the tweaking DBA, and the fastest to deploy when faced with catastrophic plans. And I would guess it is one of the simplest to be implemented and probably not very high maintenance either, although this is just a guess. If I could hint some of my queries, I would enable anonymous prepared statements to take into account the parameter values, but I can't because that results in runaway queries every now and then, so I had to force postgres generate generic queries without knowing anything about parameter values... so the effect for me is an overall slower postgres system because I couldn't fix the particular problems I had and had to tweak general settings. And when I have a problem I can't wait until the planner is fixed, I have to solve it immediately... the current means to do that are suboptimal. The argument that planner hints would hide problems from being solved is a fallacy. To put a hint in place almost the same amount of analysis is needed from the DBA as solving the problem now, so users who ask now for help will further do it even in the presence of hints. The ones who wouldn't are not coming for help now either, they know their way out of the problems... and the ones who still report a shortcoming of the planner will do it with hints too. I would even say it would be an added benefit, cause then you could really see how well a specific plan will do without having the planner capable to generate alone that plan... so knowledgeable users could come to you further down the road when they know where the planner is wrong, saving you time. I must say it again, this kind of query-level hinting would be the easiest to understand for the developers... there are many trial-end-error type of programmers out there, if you got a hint wrong, you fix it and move on, doesn't need to be perfect, it just have to be good enough. I heavily doubt that postgres will get bad publicity because user Joe sot himself in the foot by using bad hints... the probability for that is low, you must actively put those hints there, and if you take the time to do that then you're not the average Joe, and probably not so lazy either, and if you're putting random hints, then you would probably mess it up some other way anyway. And the thing about missing new features is also not very founded. If I would want to exclude a full table scan on a specific table for a specific query, than that's about for sure that I want to do that regardless what new features postgres will offer in the future. Picking one specific access method is more prone to missing new access methods, but even then, when I upgrade the DB server to a new version, I usually have enough other compatibility problems (till now I always had some on every upgrade I had) that making a round of upgrading hints is not an outstanding problem. And if the application works good enough with suboptimal plans, why would I even take that extra effort ? I guess the angle is: I, as a practicing DBA would like to be able to experiment and get most out of the imperfect tool I have, and you, the developers, want to make the tool perfect... I don't care about perfect tools, it just have to do the job... hints or anything else, if I can make it work GOOD ENOUGH, it's all fine. And hints is something I would understand and be able to use. Thanks for your patience if you're still reading this... Cheers, Csaba. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New version of money type
On Thu, 28 Sep 2006 23:23:30 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: > > The existing type is depricated and has been since at least 8.1; so yes, > > it's slated for removal. > > Well, my perception of that has always been "it needs to be upgraded or > removed". So if D'Arcy wants to work on the improvement angle, I have > no problem with him doing so. The thing we need to negotiate is "how > much improvement is needed to keep it in core". Well, the patch I submitted is definitely an improvement over the existing version. Are you saying that I have to make further improvements before these ones can be imported? ISTM that going to 64 bit without any other change is big enough to warrant the change as is. Once that is done I would be happy to work on other improvements but my experience tells me not to make more than one major change at a time. The one issue I have with my existing patch though is the removal of the currency symbol from the output. There have been many suggestions that that just gets in the way but, following up on my own statement above, this is two changes, not one, and perhaps should be left out of the patch for that reason. -- D'Arcy J.M. Cain | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PERFORM] Hints proposal
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > Yes, but as I mentioned the idea here was to come up with something that > is (hopefully) easy to define and implement. In other words, something > that should be doable for 8.3. Sorry, but that is not anywhere on my list of criteria for an important feature. Having to live with a quick-and-dirty design for the foreseeable future is an ugly prospect --- and anything that puts hints into application code is going to lock us down to supporting it forever. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PERFORM] Hints proposal
Jim, These hints would outright force the planner to do things a certain way. ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */ This proposal seems to deliberately ignore every point that has been made *against* doing things that way. It doesn't separate the hints from the queries, it doesn't focus on fixing the statistical or cost misestimates that are at the heart of the issue, and it takes no account of the problem of hints being obsoleted by system improvements. Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See also my comment below. I don't see how adding extra tags to queries is easier to implement than an ability to modify the system catalogs. Quite the opposite, really. And, as I said, if you're going to push for a feature that will be obsolesced in one version, then you're going to have a really rocky row to hoe. Yes, but as I mentioned the idea here was to come up with something that is (hopefully) easy to define and implement. In other words, something that should be doable for 8.3. Because this proposal essentially amounts to limiting plans the planner will consider and tweaking it's cost estimates, I'm hoping that it should be (relatively) easy to implement. Even I, the chief marketing geek, am more concerned with getting a feature that we will still be proud of in 5 years than getting one in the next nine months. Keep your pants on! I actually think the way to attack this issue is to discuss the kinds of errors the planner makes, and what tweaks we could do to correct them. Here's the ones I'm aware of: -- Incorrect selectivity of WHERE clause -- Incorrect selectivity of JOIN -- Wrong estimate of rows returned from SRF -- Incorrect cost estimate for index use Can you think of any others? I also feel that a tenet of the design of the "planner tweaks" system ought to be that the tweaks are collectible and analyzable in some form. This would allow DBAs to mail in their tweaks to -performance or -hackers, and then allow us to continue improving the planner. --Josh Berkus ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Getting the type Oid in a CREATE TYPE output function
Tom Lane wrote: > Weslee Bilodeau <[EMAIL PROTECTED]> writes: >> I'm trying to create a few new types, and based on the type in/out >> functions will operate a bit differently. >> For the input function finding the type Oid is easy - >> Oid our_type_oid = PG_GETARG_OID(1); >> For output though I'm having difficulty finding out the type Oid. > > You can't, and if you could, relying on it would be a security hole > in your function (somebody could invoke the function manually and pass > it a false OID value). You have to put everything you need to know > right into the Datum. I'm not as worried about them running it manually, since I want it to operate something like - select output_function( 'test'::text ); It would have the type OID for text. select output_function( 'test'::varchar ); It would have the type OID for varchar. I don't want them to tell me the OID they want, I just want to know what type the function was called with. Was it called as a varchar, text, my own type, bytea, etc ? Is this possible? A bit of what I'm trying to do - I'm creating an encrypted data type wrapped around pgcrypto. create table test ( test enctype ); insert into test values ( 'encrypt_me' ); The value in input is encrypted, then stored using byteain. The key used to encrypt it is based of the type. So I can use the same functions for 10 different CREATE TYPE statements. The output function descrypts the value, then hands it off to byteaout. It works perfectly so long as I used the same key for all my custom types. When I want a different key for each type though (so for example, encrypt credit cards with one key, addresses with another, etc) I need a way to tell them apart. The long way around is just create a new function for each type, but that seems messy since at least input can tell what type the input Datum is. Was hoping output can figure out the Datum type so I can decrypt it. Basically - create table test ( card enctype_card, addrress enctype_address ); Both types have different encryption keys. I know its best to encrypt in the application, and they can log the SQL on the server, or if your not using SSL it can be read, etc. Can't change the application to encrypt or use pgcrypto directly. Weslee ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] Hints proposal
Csaba, I guess the angle is: I, as a practicing DBA would like to be able to experiment and get most out of the imperfect tool I have, and you, the developers, want to make the tool perfect... I don't care about perfect tools, it just have to do the job... hints or anything else, if I can make it work GOOD ENOUGH, it's all fine. And hints is something I would understand and be able to use. Hmmm, if you already understand Visual Basic syntax, should we support that too? Or maybe we should support MySQL's use of '-00-00' as the "zero" date because people "understand" that? We're just not going to adopt a bad design because Oracle DBAs are used to it. If we wanted to do that, we could shut down the project and join a proprietary DB staff. The current discussion is: a) Planner tweaking is sometimes necessary; b) Oracle HINTS are a bad design for planner tweaking; c) Can we come up with a good design for planner tweaking? So, how about suggestions for a good design? --Josh Berkus ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Getting the type Oid in a CREATE TYPE output function
Weslee Bilodeau <[EMAIL PROTECTED]> writes: > It works perfectly so long as I used the same key for all my custom > types. When I want a different key for each type though (so for example, > encrypt credit cards with one key, addresses with another, etc) I need a > way to tell them apart. [ shrug... ] Seems like you should be putting the key ID into the stored encrypted datums, then. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] New version of money type
"D'Arcy J.M. Cain" writes: > Tom Lane <[EMAIL PROTECTED]> wrote: >> Well, my perception of that has always been "it needs to be upgraded or >> removed". So if D'Arcy wants to work on the improvement angle, I have >> no problem with him doing so. The thing we need to negotiate is "how >> much improvement is needed to keep it in core". > Well, the patch I submitted is definitely an improvement over the > existing version. Are you saying that I have to make further > improvements before these ones can be imported? I didn't say that. I was responding to someone whose position seemed to be "money is going to be removed, therefore you shouldn't work on it". I wanted to know exactly what would need to be fixed before they'd not want it removed. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PERFORM] Hints proposal
On Thu, Oct 12, 2006 at 11:25:25AM -0500, Jim C. Nasby wrote: > Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See > also my comment below. If I may argue in the other direction, speaking as one whose career (if we may be generous enough to call it that) has been pretty much exclusively on the operations end of things, I think that's an awful idea. There are two ways that quick-fix solve-the-problem-now hints are going to be used. One is in the sort of one-off query that a DBA has to run from time to time, that takes a long time, but that isn't really a part of regular application load. The thing is, if you already know your data well enough to provide a useful hint, you also know your data well enough to work around the problem in the short run (with some temp table tricks and the like). The _other_ way it's going to be used is as a stealthy alteration to regular behaviour, to solve a particular nasty performance problem that happens to result on a given day. And every single time I've seen anything like that done, the long term effect is always monstrous. Two releases later, all your testing and careful inspection and planning goes to naught one Saturday night at 3 am (because we all know computers know what time it is _where you are_) when the one-off trick that you pulled last quarter to solve the manager's promise (which was made while out golfing, so nobody wrote anything down) turns out to have a nasty effect now that the data distribution is different. Or you think so. But now you're not sure, because the code was tweaked a little to take some advantage of something you now have because of the query plans that you ended up getting because of the hint that was there because of the golf game, so now if you start fiddling with the hints, maybe you break something else. And you're tired, but the client is on the phone from Hong King _right now_. The second case is, from my experience, exactly the sort of thing you want really a lot when the golf game is just over, and the sort of thing you end up kicking yourself for in run-on sentences in the middle of the night six months after the golf game is long since forgotten. The idea for knobs on the planner that allows the DBA to give directed feedback, from which new planner enhancements can also come, seems to me a really good idea. But any sort of quick and dirty hint for right now gives me the willies. A -- Andrew Sullivan | [EMAIL PROTECTED] "The year's penultimate month" is not in truth a good way of saying November. --H.W. Fowler ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] On status data and summaries
On Thu, Oct 12, 2006 at 10:20:43AM -0400, Bruce Momjian wrote: > use a wiki to track open development items, with some status, like I did > for the open items list for 8.2. I usually only do that during feature > freeze, but could expand it and open it up for others to edit. So do I understand this as a suggestion to pick some threads, keep track of them, but otherwise shut up until feature freeze? That's ok with me, if that's what helps; but I was under the impression from the meta-discussion last time that people didn't think that was working. Anyone? A -- Andrew Sullivan | [EMAIL PROTECTED] If they don't do anything, we don't need their acronym. --Josh Hamilton, on the US FEMA ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] New version of money type
On Thu, 12 Oct 2006 13:21:37 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: > > Well, the patch I submitted is definitely an improvement over the > > existing version. Are you saying that I have to make further > > improvements before these ones can be imported? > > I didn't say that. I was responding to someone whose position seemed to > be "money is going to be removed, therefore you shouldn't work on it". > I wanted to know exactly what would need to be fixed before they'd not > want it removed. Cool. So what do I do with the patch? Should I add the currency symbol back in and commit or should I resubmit the patch to hackers for further review? -- D'Arcy J.M. Cain | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New version of money type
"D'Arcy J.M. Cain" writes: > Cool. So what do I do with the patch? Should I add the currency > symbol back in and commit or should I resubmit the patch to hackers for > further review? Well, one thing you definitely *don't* do is commit right now, because we're in feature freeze, not to mention trying to avoid forced initdbs now that beta has started. Sit on it till 8.3 is branched, and meanwhile think about what you want to do with the currency-symbol issue... regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PERFORM] Hints proposal
On 10/12/06, Andrew Sullivan <[EMAIL PROTECTED]> wrote: On Thu, Oct 12, 2006 at 11:25:25AM -0500, Jim C. Nasby wrote: > Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See > also my comment below. If I may argue in the other direction, speaking as one whose career (if we may be generous enough to call it that) has been pretty much exclusively on the operations end of things, I think that's an awful idea. There are two ways that quick-fix solve-the-problem-now hints are going to be used. One is in the sort of one-off query that a DBA has third way: to solve the problem of data (especially constants) not being available to the planner at the time the plan was generated. this happens most often with prepared statements and sql udfs. note that changes to the plan generation mechanism (i think proposed by peter e a few weeks back) might also solve this. In a previous large project I had to keep bitmap scan and seqscan off all the time because of this problem (the project used a lot of prepared statements). or am i way off base here? merlin ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] New version of money type
On Thu, 12 Oct 2006 14:17:33 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: > "D'Arcy J.M. Cain" writes: > > Cool. So what do I do with the patch? Should I add the currency > > symbol back in and commit or should I resubmit the patch to hackers for > > further review? > > Well, one thing you definitely *don't* do is commit right now, because > we're in feature freeze, not to mention trying to avoid forced initdbs > now that beta has started. Sit on it till 8.3 is branched, and OK. I hadn't thought of it as a new feature per se but I understand the initdb issue. Holding at 30,000 feet, ground control. > meanwhile think about what you want to do with the currency-symbol > issue... Personally I don't see a need for it but I am currently in favour of adding it back in before committing just so that we can deal with the issue separately. The same as the other changes being discussed. -- D'Arcy J.M. Cain | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
On Thu, Oct 12, 2006 at 12:19:24PM -0400, Tom Lane wrote: > I think the most promising answer may be to push RETURNING rows into a > TupleStore and then read them out from there, which is pretty much the > same approach we adopted for RETURNING queries inside portals. This'd > allow the query to be executed completely, and its triggers fired, > before we return from the SQL function. Would this only affect RETURNING queries that are returning data via a SRF? ISTM that pushing to a tuplestore is a lot of extra work for simpler cases like INSERT INTO table DELETE FROM queue_table WHERE ... RETURNING *; Even for the case of just going back to the client, it seems like a fair amount of overhead. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
On 10/12/06, Tom Lane <[EMAIL PROTECTED]> wrote: I think the most promising answer may be to push RETURNING rows into a TupleStore and then read them out from there, which is pretty much the same approach we adopted for RETURNING queries inside portals. It certainly sounds like the safest implementation and I can't think of any simple way around using a tuplestore in this type of case. -- Jonah H. Harris, Software Architect | phone: 732.331.1300 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
On Thu, Oct 12, 2006 at 01:28:18PM -0500, Jim C. Nasby wrote: > On Thu, Oct 12, 2006 at 12:19:24PM -0400, Tom Lane wrote: > > I think the most promising answer may be to push RETURNING rows > > into a TupleStore and then read them out from there, which is > > pretty much the same approach we adopted for RETURNING queries > > inside portals. This'd allow the query to be executed completely, > > and its triggers fired, before we return from the SQL function. > > Would this only affect RETURNING queries that are returning data via > a SRF? ISTM that pushing to a tuplestore is a lot of extra work for > simpler cases like INSERT INTO table DELETE FROM queue_table WHERE > ... RETURNING *; Even for the case of just going back to the > client, it seems like a fair amount of overhead. More generally, would this change impede promoting RETURNING to work just as VALUES does in 8.2 (i.e. being able to join RETURNING results, etc.)? Cheers, D -- David Fetter <[EMAIL PROTECTED]> http://fetter.org/ phone: +1 415 235 3778AIM: dfetter666 Skype: davidfetter Remember to vote! ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > On Thu, Oct 12, 2006 at 12:19:24PM -0400, Tom Lane wrote: >> I think the most promising answer may be to push RETURNING rows into a >> TupleStore and then read them out from there, which is pretty much the >> same approach we adopted for RETURNING queries inside portals. > Would this only affect RETURNING queries that are returning data via a > SRF? Right, and specifically an SQL-language function. > ISTM that pushing to a tuplestore is a lot of extra work for I'm not entirely convinced of that --- the overhead of getting down through functions.c and ExecutorRun into the per-tuple loop isn't trivial either. It wouldn't work at all without significant restructuring, in fact, because as execMain stands we'd be firing "per statement" triggers once per row if we tried to handle RETURNING queries the same way that SQL functions currently handle SELECTs. When you look at the big picture there's a whole lot of call overhead that would go away if SQL functions returned a tuplestore instead of a row at a time. I was toying with the idea that we should make SELECTs return via a tuplestore too, which would allow eliminating the special case of having ExecutorRun return an individual tuple at all. That's not a bugfix though so I'll wait for 8.3 before thinking more about it ... regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] On status data and summaries
On Thu, Oct 12, 2006 at 01:53:27PM -0400, Andrew Sullivan wrote: > On Thu, Oct 12, 2006 at 10:20:43AM -0400, Bruce Momjian wrote: > > use a wiki to track open development items, with some status, like I did > > for the open items list for 8.2. I usually only do that during feature > > freeze, but could expand it and open it up for others to edit. > > So do I understand this as a suggestion to pick some threads, keep > track of them, but otherwise shut up until feature freeze? That's > ok with me, if that's what helps; but I was under the impression from > the meta-discussion last time that people didn't think that was > working. Anyone? If the ball gets dropped on something we want to know well before feature-freeze. Something that might be useful would be to send out a monthly status report of all active development. That'd be pretty easy to do if there was a wiki with all the info available.. the trick would just be to *ahem* nudge people to update the status of what they're working on once a month. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
On Thu, Oct 12, 2006 at 02:50:34PM -0400, Tom Lane wrote: > > ISTM that pushing to a tuplestore is a lot of extra work for > > I'm not entirely convinced of that --- the overhead of getting down > through functions.c and ExecutorRun into the per-tuple loop isn't > trivial either. It wouldn't work at all without significant > restructuring, in fact, because as execMain stands we'd be firing "per > statement" triggers once per row if we tried to handle RETURNING queries > the same way that SQL functions currently handle SELECTs. When you look > at the big picture there's a whole lot of call overhead that would go > away if SQL functions returned a tuplestore instead of a row at a time. > I was toying with the idea that we should make SELECTs return via a > tuplestore too, which would allow eliminating the special case of having > ExecutorRun return an individual tuple at all. That's not a bugfix > though so I'll wait for 8.3 before thinking more about it ... The specific concern I have is large result sets, like 10s or 100s of MB (or more). We just added support for not buffering those in psql, so it seems like a step backwards to have the backend now buffering it (unless I'm confused on how a tuplestore works...) -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
David Fetter <[EMAIL PROTECTED]> writes: > More generally, would this change impede promoting RETURNING to work > just as VALUES does in 8.2 (i.e. being able to join RETURNING results, > etc.)? Making that happen would imply a whole lot of other changes; this issue isn't the principal gating factor. One of the main things I'd point to right now, in view of this having all arisen from the question of when triggers should fire, is where and when we'd fire BEFORE/AFTER STATEMENT triggers for a RETURNING command embedded in a larger query. For that matter, the system has several not-easily-removed assumptions that a SELECT command won't fire any triggers at all --- which would break down if we allowed constructs like SELECT ... FROM (INSERT ... RETURNING ...) ... We do currently have the ability to make plpgsql functions send RETURNING results back to a calling query, and with this change we could say the same of plain SQL functions --- and in both cases we'll be depending on a tuplestore buffer to keep things sane in terms of when triggers fire. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] create temp table .. on commit delete rows
On Thu, Oct 12, 2006 at 03:51:39PM +0400, Teodor Sigaev wrote: > >You're running in auto-commit, mode. An implicit commit happens > >after this statement. Which clears the table. Looks right to me. > > Oops, I see Should something notice and raise a warning when people create a TEMP table and have AUTOCOMMIT on? Cheers, D -- David Fetter <[EMAIL PROTECTED]> http://fetter.org/ phone: +1 415 235 3778AIM: dfetter666 Skype: davidfetter Remember to vote! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > The specific concern I have is large result sets, like 10s or 100s of MB > (or more). We just added support for not buffering those in psql, so it > seems like a step backwards to have the backend now buffering it (unless > I'm confused on how a tuplestore works...) Well, a tuplestore can dump to disk, so at least you don't need to worry about out-of-memory considerations. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
On Thu, Oct 12, 2006 at 03:03:43PM -0400, Tom Lane wrote: > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > The specific concern I have is large result sets, like 10s or 100s of MB > > (or more). We just added support for not buffering those in psql, so it > > seems like a step backwards to have the backend now buffering it (unless > > I'm confused on how a tuplestore works...) > > Well, a tuplestore can dump to disk, so at least you don't need to worry > about out-of-memory considerations. Sure, it's just a lot of data to be shuffling around if we can avoid it. Perhaps we could only do this if there's triggers on the table involved? -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] create temp table .. on commit delete rows
On Thu, Oct 12, 2006 at 12:01:12PM -0700, David Fetter wrote: > On Thu, Oct 12, 2006 at 03:51:39PM +0400, Teodor Sigaev wrote: > > >You're running in auto-commit, mode. An implicit commit happens > > >after this statement. Which clears the table. Looks right to me. > > > > Oops, I see > > Should something notice and raise a warning when people create a TEMP > table and have AUTOCOMMIT on? Maybe if ON COMMIT is set to DELETE ROWS or DROP... -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] On status data and summaries
On Thu, Oct 12, 2006 at 01:56:37PM -0500, Jim C. Nasby wrote: > *ahem* nudge people to update the status of what they're working on once > a month. Well, though, remember that the point of this was supposed to be to make things easier for the developers, who are already spending (it would seem) too many cycles keeping on top of this. Or maybe I just misunderstood what the problem was people were having. A -- Andrew Sullivan | [EMAIL PROTECTED] Users never remark, "Wow, this software may be buggy and hard to use, but at least there is a lot of code underneath." --Damien Katz ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] Documentation fix for --with-ldap
Neil Conway wrote: > On Mon, 2006-09-04 at 10:23 +0200, Albe Laurenz wrote: > > This is just a 'one line' change in the documentation of > > the --with-ldap flag of ./configure > > Applied, thanks for the patch. > > (BTW, when trivial patches like this fall through the cracks, I'd > encourage patch submitters to resend them if you'd like to see them > applied more promptly.) Actually the patch was previously rejected. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > Sure, it's just a lot of data to be shuffling around if we can avoid it. > Perhaps we could only do this if there's triggers on the table involved? Maybe, but it's awfully late in the 8.2 cycle to be worrying about performance improvements for something that currently doesn't work at all. I'm inclined to keep it simple for now; we can revisit the issue later if anyone has problems in practice. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Documentation fix for --with-ldap
On Thu, 2006-10-12 at 21:11 +0200, Peter Eisentraut wrote: > Actually the patch was previously rejected. Oh? Sorry, I must have missed that. On what grounds was it rejected? -Neil ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] create temp table .. on commit delete rows
On Thu, Oct 12, 2006 at 02:07:28PM -0500, Jim C. Nasby wrote: > On Thu, Oct 12, 2006 at 12:01:12PM -0700, David Fetter wrote: > > On Thu, Oct 12, 2006 at 03:51:39PM +0400, Teodor Sigaev wrote: > > > >You're running in auto-commit, mode. An implicit commit > > > >happens after this statement. Which clears the table. Looks > > > >right to me. > > > > > > Oops, I see > > > > Should something notice and raise a warning when people create a > > TEMP table and have AUTOCOMMIT on? > > Maybe if ON COMMIT is set to DELETE ROWS or DROP... Sounds good :) Cheers, D -- David Fetter <[EMAIL PROTECTED]> http://fetter.org/ phone: +1 415 235 3778AIM: dfetter666 Skype: davidfetter Remember to vote! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] array_accum aggregate
Stephen Frost <[EMAIL PROTECTED]> writes: > * Neil Conway ([EMAIL PROTECTED]) wrote: >> There is no guarantee why SQL NULL and PG_RETURN_XYZ(NULL) refer to the >> same thing -- use PG_RETURN_NULL() to return a SQL NULL value, or just >> make the function strict. > Huh, alright. I'll probably just change it to PG_RETURN_NULL(). Unless the function actually *needs* to be non-strict, you should mark it strict and omit the runtime test for null input altogether. This is the general way that it's done in existing backend C functions. Doing it the other way is needlessly inconsistent (thus distracting readers) and clutters the code. (However, now that we support nulls in arrays, meseems a more consistent definition would be that it allows null inputs and just includes them in the output. So probably you do need it non-strict.) Personally though I'm much more concerned about the state datatype. As-is I think it's not only ugly but probably a security hole. If you are declaring the state type as something other than what it really is then you have to defend against two sorts of problems: someone being able to crash the database by calling your function and passing it something it didn't expect, or crashing the database by using your function to pass some other function an input it didn't expect. For example, since you've got aaccum_sfunc declared to return anyarray when it returns no such thing, something like array_out(aaccum_sfunc(...)) would trivially crash the backend. It's possible that the error check to insist on being called with an AggState context is a sufficient defense against that, but I feel nervous about it, and would much rather have a solution that isn't playing fast and loose with the type system. Particularly if it's going to go into core rather than contrib. I'm inclined to think that this example demonstrates a deficiency in the aggregate-function design: there should be a way to declare what we're really doing. But I don't know exactly what that should look like. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PERFORM] Hints proposal
> Hmmm, if you already understand Visual Basic syntax, should we support > that too? Or maybe we should support MySQL's use of '-00-00' as the > "zero" date because people "understand" that? You completely misunderstood me... I have no idea about oracle hints, never used Oracle in fact. My company uses oracle, but I have only very very limited contact with oracle issues, and never touched a hint. I'm only talking about ease of use, learning curves, and complexity in general. While I do like the idea of an all automatic system optimizer which takes your query portofolio and analyzes the data based on those queries and creates you all the indexes you need and all that, that's not gonna happen soon, because it's a very complex thing to implement. The alternative is that you take your query portofolio, analyze it yourself, figure out what statistics you need, create indexes, tweak queries, hint the planner for correlations and stuff... which is a complex task, and if you have to tell the server about some correlations with the phase of the moon, you're screwed cause there will never be any DB engine which will understand that. But you always can put the corresponding hint in the query when you know the correlation is there... The problem is that the application sometimes really knows better than the server, when the correlations are not standard. > We're just not going to adopt a bad design because Oracle DBAs are used > to it. If we wanted to do that, we could shut down the project and > join a proprietary DB staff. I have really nothing to do with Oracle. I think you guys are simply too blinded by Oracle hate... I don't care about Oracle. > The current discussion is: > > a) Planner tweaking is sometimes necessary; > b) Oracle HINTS are a bad design for planner tweaking; While there are plenty of arguments you made against query level hints (can we not call them Oracle-hints ?), there are plenty of users of postgres who expressed they would like them. I guess they were tweaking postgres installations when they needed it, and not Oracle installations. I expressed it clearly that for me query level hinting would give more control and better understanding of what I have to do for the desired result. Perfect planning -> forget it, I only care about good enough with reasonable tuning effort. If I have to tweak statistics I will NEVER be sure postgres will not backfire on me again. On the other hand if I say never do a seq scan on this table for this query, I could be sure it won't... > c) Can we come up with a good design for planner tweaking? Angles again: good enough now is better for end users, but programmers always go for perfect tomorrow... pity. Cheers, Csaba. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PERFORM] Hints proposal
Csaba Nagy <[EMAIL PROTECTED]> writes: > Until that point is achieved, the above proposal is one of the simplest > to understand for the tweaking DBA, and the fastest to deploy when faced > with catastrophic plans. And I would guess it is one of the simplest to > be implemented and probably not very high maintenance either, although > this is just a guess. That guess is wrong ... but more to the point, if you think that "simple and easy to implement" should be the overriding concern for designing a new feature, see mysql. They've used that design approach for years and look what a mess they've got. This project has traditionally done things differently and I feel no need to change that mindset now. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PERFORM] Hints proposal
On Thu, Oct 12, 2006 at 02:21:55PM -0400, Merlin Moncure wrote: > third way: to solve the problem of data (especially constants) not > being available to the planner at the time the plan was generated. > this happens most often with prepared statements and sql udfs. note > that changes to the plan generation mechanism (i think proposed by > peter e a few weeks back) might also solve this. You're right about this, but you also deliver the reason why we don't need hints for that: the plan generation mechanism is a better solution to that problem. It's this latter thing that I keep coming back to. As a user of PostgreSQL, the thing that I really like about it is its pragmatic emphasis on correctness. In my experience, it's a system that feels very UNIX-y: there's a willingness to accept "80/20" answers to a problem in the event you at least have a way to get the last 20, but the developers are opposed to anything that seems really kludgey. In the case you're talking about, it seems to me that addressing the problems where they come from is a better solution that trying to find some way to work around them. And most of the use-cases I hear for a statement-level hints system fall into this latter category. A -- Andrew Sullivan | [EMAIL PROTECTED] Unfortunately reformatting the Internet is a little more painful than reformatting your hard drive when it gets out of whack. --Scott Morris ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PERFORM] Hints proposal
On Thu, Oct 12, 2006 at 09:40:30AM -0700, Josh Berkus wrote: > Jim, > > >>>These hints would outright force the planner to do things a certain way. > >>>... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */ > >>This proposal seems to deliberately ignore every point that has been > >>made *against* doing things that way. It doesn't separate the hints > >>from the queries, it doesn't focus on fixing the statistical or cost > >>misestimates that are at the heart of the issue, and it takes no account > >>of the problem of hints being obsoleted by system improvements. > > > >Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See > >also my comment below. > > I don't see how adding extra tags to queries is easier to implement than > an ability to modify the system catalogs. Quite the opposite, really. > > And, as I said, if you're going to push for a feature that will be > obsolesced in one version, then you're going to have a really rocky row > to hoe. Unless you've got a time machine or a team of coders in your back pocket, I don't see how the planner will suddenly become perfect in 8.4... > >Yes, but as I mentioned the idea here was to come up with something that > >is (hopefully) easy to define and implement. In other words, something > >that should be doable for 8.3. Because this proposal essentially amounts > >to limiting plans the planner will consider and tweaking it's cost > >estimates, I'm hoping that it should be (relatively) easy to implement. > > Even I, the chief marketing geek, am more concerned with getting a > feature that we will still be proud of in 5 years than getting one in > the next nine months. Keep your pants on! Hey, I wrote that email while dressed! :P We've been seeing the same kinds of problems that are very difficult (or impossible) to fix cropping up for literally years... it'd be really good to at least be able to force the planner to do the sane thing even if we don't have the manpower to fix it right now... > I actually think the way to attack this issue is to discuss the kinds of > errors the planner makes, and what tweaks we could do to correct them. > Here's the ones I'm aware of: > > -- Incorrect selectivity of WHERE clause > -- Incorrect selectivity of JOIN > -- Wrong estimate of rows returned from SRF > -- Incorrect cost estimate for index use > > Can you think of any others? There's a range of correlations where the planner will incorrectly choose a seqscan over an indexscan. Function problems aren't limited to SRFs... we have 0 statistics ability for functions. There's the whole issue of multi-column statistics. > I also feel that a tenet of the design of the "planner tweaks" system > ought to be that the tweaks are collectible and analyzable in some form. > This would allow DBAs to mail in their tweaks to -performance or > -hackers, and then allow us to continue improving the planner. Well, one nice thing about the per-query method is you can post before and after EXPLAIN ANALYZE along with the hints. But yes, as we move towards a per-table/index/function solution, there should be an easy way to see how those hints are affecting the system and to report that data back to the community. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] array_accum aggregate
* Tom Lane ([EMAIL PROTECTED]) wrote: > (However, now that we support nulls in arrays, meseems a more consistent > definition would be that it allows null inputs and just includes them in > the output. So probably you do need it non-strict.) This was my intention. > I'm inclined to think that this example demonstrates a deficiency in the > aggregate-function design: there should be a way to declare what we're > really doing. But I don't know exactly what that should look like. I agree and would much rather have a clean solution which works with the design than one which has to work outside it. When I first was trying to decide on the state-type I was looking through the PG catalogs for essentially a "complex C type" which translated to a void*. Perhaps such a type could be added. Unless that's considered along the lines of an 'any' type it'd cause problems for the polymorphism aspect. Another alternative would be to provide a seperate area for each aggregate to put any other information it needs. This would almost certainly only be available to C functions but would essentially be a void* which is provided through the AggState structure but tracked by the aggregator routines and reset for each aggregate function being run. If that's acceptable, I don't think it'd be all that difficult to implement. With that, aaccum_sfunc and aaccum_ffunc would ignore the state variable passed to them in favor of their custom structure available through fcinfo->AggState (I expect they'd just keep the state variable NULL and be marked non-strict, or set it to some constant if necessary). The pointer would have to be tracked somewhere and then copied in/out on each call, but that doesn't seem too difficult to me. After all, the state variable is already being tracked somewhere, this would just sit next to it, in my head anyway. I've got some time this weekend and would be happy to take a shot at the second proposal if that's generally acceptable. Thanks, Stephen signature.asc Description: Digital signature
[HACKERS] ./configure argument checking
Wasn't configure changed to complain if it's fed a bogus argument? I just did ./configure --with-deps on a fresh checkout and it didn't complain... -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] ./configure argument checking
Jim C. Nasby wrote: Wasn't configure changed to complain if it's fed a bogus argument? I just did ./configure --with-deps on a fresh checkout and it didn't complain... My recollection was Peter said this was an autoconf "feature". cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] ./configure argument checking
On Thu, Oct 12, 2006 at 04:41:14PM -0400, Andrew Dunstan wrote: > Jim C. Nasby wrote: > >Wasn't configure changed to complain if it's fed a bogus argument? I > >just did ./configure --with-deps on a fresh checkout and it didn't > >complain... > > > > My recollection was Peter said this was an autoconf "feature". IIRC it was made a non-fatal warning somewhere near the end of the output, but I'm not sure... Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] [PERFORM] Hints proposal
Jim, > > I don't see how adding extra tags to queries is easier to implement > > than an ability to modify the system catalogs. Quite the opposite, > > really. > > > > And, as I said, if you're going to push for a feature that will be > > obsolesced in one version, then you're going to have a really rocky > > row to hoe. > > Unless you've got a time machine or a team of coders in your back > pocket, I don't see how the planner will suddenly become perfect in > 8.4... Since you're not a core code contributor, I really don't see why you continue to claim that query hints are going to be easier to implement than relation-level statistics modification. You think it's easier, but the people who actually work on the planner don't believe that it is. > We've been seeing the same kinds of problems that are very difficult (or > impossible) to fix cropping up for literally years... it'd be really > good to at least be able to force the planner to do the sane thing even > if we don't have the manpower to fix it right now... As I've said to other people on this thread, you keep making the incorrect assumption that Oracle-style query hints are the only possible way of manual nuts-and-bolts query tuning. They are not. > > I actually think the way to attack this issue is to discuss the kinds > > of errors the planner makes, and what tweaks we could do to correct > > them. Here's the ones I'm aware of: > > > > -- Incorrect selectivity of WHERE clause > > -- Incorrect selectivity of JOIN > > -- Wrong estimate of rows returned from SRF > > -- Incorrect cost estimate for index use > > > > Can you think of any others? > > There's a range of correlations where the planner will incorrectly > choose a seqscan over an indexscan. Please list some if you have ones which don't fall into one of the four problems above. > Function problems aren't limited to SRFs... we have 0 statistics ability > for functions. > > There's the whole issue of multi-column statistics. Sure, but again that falls into the category of "incorrect selectivity for WHERE/JOIN". Don't make things more complicated than they need to be. > Well, one nice thing about the per-query method is you can post before > and after EXPLAIN ANALYZE along with the hints. One bad thing is that application designers will tend to use the hint, fix the immediate issue, and never report a problem at all. And query hints would not be collectable in any organized way except the query log, which would then require very sophisticated text parsing to get any useful information at all. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PERFORM] Hints proposal
> > Well, one nice thing about the per-query method is you can post before > > and after EXPLAIN ANALYZE along with the hints. > > One bad thing is that application designers will tend to use the hint, fix > the immediate issue, and never report a problem at all. And query hints > would not be collectable in any organized way except the query log, which > would then require very sophisticated text parsing to get any useful > information at all. > Or they'll report it when the next version of Postgres "breaks" their app because the hints changed, or because the planner does something else which makes those hints obsolete. My main concern with hints (aside from the fact I'd rather see more intelligence in the planner/stats) is managing them appropriately. I have two general types of SQL where I'd want to use hints- big OLAP stuff (where I have a lot of big queries, so it's not just one or two where I'd need them) or large dynamically generated queries (Users building custom queries). Either way, I don't want to put them on a query itself. What about using regular expressions, plus, if you have a function (views, or any other statement that is stored), you can assign a rule to that particular function. So you get matching, plus explicit selection. This way it's easy to find all your hints, turn them off, manage them, etc. (Not to mention dynamically generated SQL is ugly enough without having to put hints in there). - Bucky ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Hints WAS: Index Tuning Features
Martijn van Oosterhout writes: > Which could then be used by the planner. Or more directly: > > CREATE HISTOGRAM FOR FUNCTION verify_pk_signature(documenent) > AS ( true = 99, false = 1 ); > > (Perhaps DECLARE is the better phrase?). Except that the distribution is a property of the values you're passing it, not the function itself. In theory verify_pk_signature() returns false for 99.999...% of its inputs. But of course unless you have a black hat or hardware problems you're not going to ever pass it any input that makes it return false. The query may be a routine session cookie check where it will virtually always return true, or it may be a DBA running an ad-hoc query to check suspicious records for invalid data. It may even be the same query from the same object method being called from different call sites in the application. I'm not saying the above isn't a good idea though. I rather like it actually. But the point of my example originally was specifically to show how at least sometimes the *only* place the knowledge of the data distribution lies is in the programmer's head. -- greg ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] On status data and summaries
Andrew Sullivan wrote: > On Thu, Oct 12, 2006 at 01:56:37PM -0500, Jim C. Nasby wrote: > > *ahem* nudge people to update the status of what they're working on once > > a month. > > Well, though, remember that the point of this was supposed to be to > make things easier for the developers, who are already spending (it > would seem) too many cycles keeping on top of this. Or maybe I just > misunderstood what the problem was people were having. No, my point was that right now I only do it during feature freeze, so we know what has to happen to get to beta (and that seems to work well). What I think people wanted was something like that, but maintained during the development cycle, so they would know what features our being worked on, and by whom. One great thing about the list I maintain is that it is a flat text file, so I can update it in seconds. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PERFORM] Hints proposal
[ trying once again to push this thread over to -hackers where it belongs ] Arjen van der Meijden <[EMAIL PROTECTED]> writes: > On 12-10-2006 21:07 Jeff Davis wrote: >> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote: >> To formalize the proposal a litte, you could have syntax like: >> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint; >> >> Where "some_hint" would be a hinting language perhaps like Jim's, except >> not guaranteed to be compatible between versions of PostgreSQL. The >> developers could change the hinting language at every release and people >> can just re-write the hints without changing their application. Do you have any idea how much push-back there would be to that? In practice we'd be bound by backwards-compatibility concerns for the hints too. > There are some disadvantages of not writing the hints in a query. But of > course there are disadvantages to do as well ;) > One I can think of is that it can be very hard to define which hint > should apply where. Especially in complex queries, defining at which > point exaclty you'd like your hint to work is not a simple matter, > unless you can just place a comment right at that position. The problems that you are seeing all come from the insistence that a hint should be textually associated with a query. Using a regex is a little better than putting it right into the query, but the only thing that really fixes is not having the hints directly embedded into client-side code. It's still wrong at the conceptual level. The right way to think about it is to ask why is the planner not picking the right plan to start with --- is it missing a statistical correlation, or are its cost parameters wrong for a specific case, or is it perhaps unable to generate the desired plan at all? (If the latter, no amount of hinting is going to help.) If it's a statistics or costing problem, I think the right thing is to try to fix it with hints at that level. You're much more likely to fix the behavior across a class of queries than you will be with a hint textually matched to a specific query. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] create temp table .. on commit delete rows
David Fetter wrote: > On Thu, Oct 12, 2006 at 02:07:28PM -0500, Jim C. Nasby wrote: > > On Thu, Oct 12, 2006 at 12:01:12PM -0700, David Fetter wrote: > > > On Thu, Oct 12, 2006 at 03:51:39PM +0400, Teodor Sigaev wrote: > > > > >You're running in auto-commit, mode. An implicit commit > > > > >happens after this statement. Which clears the table. Looks > > > > >right to me. > > > > > > > > Oops, I see > > > > > > Should something notice and raise a warning when people create a > > > TEMP table and have AUTOCOMMIT on? > > > > Maybe if ON COMMIT is set to DELETE ROWS or DROP... > > Sounds good :) Added to TODO: o Issue a notice if CREATE TABLE ... ON COMMIT { DELETE ROWS | DROP } is issued outside a multi-statement transaction -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PERFORM] Hints proposal
Bucky Jordan wrote: > What about using regular expressions, plus, if you have a function > (views, or any other statement that is stored), you can assign a rule to > that particular function. So you get matching, plus explicit selection. > This way it's easy to find all your hints, turn them off, manage them, > etc. (Not to mention dynamically generated SQL is ugly enough without > having to put hints in there). The regular expression idea that's being floated around makes my brain feel like somebody is screeching a blackboard nearby. I don't think it's a sane idea. I think you could achieve something similar by using stored plan representations, like we do for rewrite rules. So you'd look for, say, a matching join combination in a catalog, and get a selectivity from a function that would get the selectivities of the conditions on the base tables. Or something like that anyway. That gets ugly pretty fast when you have to extract selectivities for all the possible join paths in any given query. But please don't talk about regular expressions. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] New version of money type
This thread has been saved for the 8.3 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- D'Arcy J.M. Cain wrote: > On Thu, 12 Oct 2006 14:17:33 -0400 > Tom Lane <[EMAIL PROTECTED]> wrote: > > "D'Arcy J.M. Cain" writes: > > > Cool. So what do I do with the patch? Should I add the currency > > > symbol back in and commit or should I resubmit the patch to hackers for > > > further review? > > > > Well, one thing you definitely *don't* do is commit right now, because > > we're in feature freeze, not to mention trying to avoid forced initdbs > > now that beta has started. Sit on it till 8.3 is branched, and > > OK. I hadn't thought of it as a new feature per se but I understand > the initdb issue. Holding at 30,000 feet, ground control. > > > meanwhile think about what you want to do with the currency-symbol > > issue... > > Personally I don't see a need for it but I am currently in favour of > adding it back in before committing just so that we can deal with the > issue separately. The same as the other changes being discussed. > > -- > D'Arcy J.M. Cain | Democracy is three wolves > http://www.druid.net/darcy/| and a sheep voting on > +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers
I wrote: > ISTM that ideally, a query with RETURNING ought to act like a SELECT > for the purposes of a SQL function --- to wit, that the result rows are > discarded if it's not the last query in the function, and are returned > as the function result if it is. The current state of affairs is that the first part of that works as expected, and the second part fails like so: regression=# create function foo8(bigint,bigint) returns setof int8_tbl as $$ insert into int8_tbl values($1,$2) returning * $$ language sql; ERROR: return type mismatch in function declared to return int8_tbl DETAIL: Function's final statement must be a SELECT. CONTEXT: SQL function "foo8" regression=# While this is certainly undesirable, it looks more like a missing feature than a bug, especially since the documentation says exactly that: ... the final command must be a SELECT that returns whatever is specified as the function's return type. I spent some time looking at what it would take to fix it, and I find that the changes are a bit bigger than I want to be making in mid-beta. So my recommendation is that for now we just add a TODO item: * Allow SQL-language functions to return results from RETURNING queries regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PERFORM] Hints proposal
On Thu, 2006-10-12 at 17:28 -0400, Tom Lane wrote: > [ trying once again to push this thread over to -hackers where it belongs ] > > Arjen van der Meijden <[EMAIL PROTECTED]> writes: > > On 12-10-2006 21:07 Jeff Davis wrote: > >> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote: > >> To formalize the proposal a litte, you could have syntax like: > >> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint; > >> > >> Where "some_hint" would be a hinting language perhaps like Jim's, except > >> not guaranteed to be compatible between versions of PostgreSQL. The > >> developers could change the hinting language at every release and people > >> can just re-write the hints without changing their application. > > Do you have any idea how much push-back there would be to that? In > practice we'd be bound by backwards-compatibility concerns for the hints > too. > No, I don't have any idea, except that it would be less push-back than changing a language that's embedded in client code. Also, I see no reason to think that a hint would not be obsolete upon a new release anyway. > The problems that you are seeing all come from the insistence that a > hint should be textually associated with a query. Using a regex is a > little better than putting it right into the query, but the only thing "Little better" is all I was going for. I was just making the observation that we can separate two concepts: (1) Embedding code in the client's queries, which I see as very undesirable and unnecessary (2) Providing very specific hints which at least gives us a place to talk about the debate more reasonably. > that really fixes is not having the hints directly embedded into > client-side code. It's still wrong at the conceptual level. > I won't disagree with that. I will just say it's no more wrong than applying the same concept in addition to embedding the hints in client queries. > The right way to think about it is to ask why is the planner not picking > the right plan to start with --- is it missing a statistical > correlation, or are its cost parameters wrong for a specific case, or > is it perhaps unable to generate the desired plan at all? (If the > latter, no amount of hinting is going to help.) If it's a statistics or > costing problem, I think the right thing is to try to fix it with hints > at that level. You're much more likely to fix the behavior across a > class of queries than you will be with a hint textually matched to a > specific query. > Agreed. Regards, Jeff Davis ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] create temp table .. on commit delete rows
Bruce Momjian <[EMAIL PROTECTED]> writes: > David Fetter wrote: >>> Should something notice and raise a warning when people create a >>> TEMP table and have AUTOCOMMIT on? > Added to TODO: > o Issue a notice if CREATE TABLE ... ON COMMIT { DELETE ROWS | > DROP } is issued outside a multi-statement transaction That is *not* what was suggested, and it doesn't seem very useful. The problem really comes when one uses a temp table in autocommit mode, not at creation time. The problem with the original suggestion is that the backend can't do it because AUTOCOMMIT is a notion that exists only in the client-side code. And the client can't do it very well because it'd have to parse SQL commands, and even with that it wouldn't see CREATE TEMP TABLE commands issued inside functions. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] create temp table .. on commit delete rows
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > David Fetter wrote: > >>> Should something notice and raise a warning when people create a > >>> TEMP table and have AUTOCOMMIT on? > > > Added to TODO: > > o Issue a notice if CREATE TABLE ... ON COMMIT { DELETE ROWS | > > DROP } is issued outside a multi-statement transaction > > That is *not* what was suggested, and it doesn't seem very useful. The > problem really comes when one uses a temp table in autocommit mode, not > at creation time. > > The problem with the original suggestion is that the backend can't do it > because AUTOCOMMIT is a notion that exists only in the client-side code. > And the client can't do it very well because it'd have to parse SQL > commands, and even with that it wouldn't see CREATE TEMP TABLE commands > issued inside functions. Ewe. Yea, I will just remove it. We can't issue a warning easily. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [COMMITTERS] pgsql: Stamp 7.3.16.
Tom Lane wrote: > Log Message: > --- > Stamp 7.3.16. > > Tags: > > REL7_3_STABLE > > Modified Files: > -- > pgsql: > configure.in (r1.217.2.24 -> r1.217.2.25) > > (http://developer.postgresql.org/cvsweb.cgi/pgsql/configure.in.diff?r1=1.217.2.24&r2=1.217.2.25) > configure (r1.226.2.26 -> r1.226.2.27) > > (http://developer.postgresql.org/cvsweb.cgi/pgsql/configure.diff?r1=1.226.2.26&r2=1.226.2.27) Uh, I thought only Marc was supposed to do that before packaging? I was skipping it. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] ./configure argument checking
Martijn van Oosterhout wrote: -- Start of PGP signed section. > On Thu, Oct 12, 2006 at 04:41:14PM -0400, Andrew Dunstan wrote: > > Jim C. Nasby wrote: > > >Wasn't configure changed to complain if it's fed a bogus argument? I > > >just did ./configure --with-deps on a fresh checkout and it didn't > > >complain... > > > > > > > My recollection was Peter said this was an autoconf "feature". > > IIRC it was made a non-fatal warning somewhere near the end of the > output, but I'm not sure... It spits out this line just before it creates its output files: *** Option ignored: --with-lkjasdf -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [COMMITTERS] pgsql: Stamp 7.3.16.
Bruce Momjian <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Stamp 7.3.16. > Uh, I thought only Marc was supposed to do that before packaging? I was > skipping it. I've done it the last few times for back-branch releases --- I was under the impression that Marc didn't have an installed copy anymore of the old autoconf required to do it correctly for pre-8.1 branches. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [COMMITTERS] pgsql: Stamp 7.3.16.
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Tom Lane wrote: > >> Stamp 7.3.16. > > > Uh, I thought only Marc was supposed to do that before packaging? I was > > skipping it. > > I've done it the last few times for back-branch releases --- I was under > the impression that Marc didn't have an installed copy anymore of the > old autoconf required to do it correctly for pre-8.1 branches. Perhaps. I only remember the discussion that I wasn't supposed to do it as part of my general stamping. Perhaps the idea was that it should be done on the day it is packaged, while I usually stamp several days before that. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] ./configure argument checking
Bruce Momjian <[EMAIL PROTECTED]> writes: > Martijn van Oosterhout wrote: >> IIRC it was made a non-fatal warning somewhere near the end of the >> output, but I'm not sure... > It spits out this line just before it creates its output files: > *** Option ignored: --with-lkjasdf Of course, since it spits out pages and pages of normally-useless trivia, we've all become conditioned to ignore configure's output as long as it doesn't actually fail :-( Not sure what to do about that --- I doubt that raising this warning to error would be a good idea, seeing how firmly the upstream developers believe it shouldn't even be a warning. Is there any sort of "quiet mode" possible that would report only warnings? Would it be a good idea if it were possible? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Modification to the postgres catalog
Tom, Hi. Sorry that i can't response your message soon... i lost my internet connection...But you were right... I forgot to modify the relnatts of the pg_class table in DATA line for it... that was crashing the "initdb" command Thanks... if you don't remind me of that, i would never see it...Bye. Thanks, again...On 10/11/06, Tom Lane < [EMAIL PROTECTED]> wrote:"Carlos Chacon" < [EMAIL PROTECTED]> writes:> But i modify too Natts_pg_class and the Anum macro...Only I forgot> mentionated it in the last mail. i put:OK ... did you add a suitable initial value to each of the DATA lines in pg_class.h? Did you remember to adjust pg_class's own relnatts fieldappearing in the DATA line for it?You could try looking at one of the past commits that has added a columnto pg_class, and make sure you touched all the places it did. regards, tom lane
Re: [HACKERS] ./configure argument checking
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Martijn van Oosterhout wrote: > >> IIRC it was made a non-fatal warning somewhere near the end of the > >> output, but I'm not sure... > > > It spits out this line just before it creates its output files: > > *** Option ignored: --with-lkjasdf > > Of course, since it spits out pages and pages of normally-useless trivia, > we've all become conditioned to ignore configure's output as long as it > doesn't actually fail :-( > > Not sure what to do about that --- I doubt that raising this warning to > error would be a good idea, seeing how firmly the upstream developers > believe it shouldn't even be a warning. Is there any sort of "quiet > mode" possible that would report only warnings? Would it be a good idea > if it were possible? I think one idea is a "pedantic" mode that fails if an unrecognized option is supplied. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: Hints (Was: [HACKERS] Index Tuning Features)
On Oct 12, 2006, at 4:26 AM, Andrew Sullivan wrote: On Thu, Oct 12, 2006 at 08:34:45AM +0200, Florian Weimer wrote: Some statistics are very hard to gather from a sample, e.g. the number of distinct values in a column. Then how can the DBA know it, either? The problem with this sort of argument is always that people are claiming some special knowledge is available to the DBA. If it's true that the DBA really _can_ know this stuff, then there must be some way to learn it. Which means that you can, in principle, figure out ways to communicate that to the optimizer. Yes, but it may be much more efficient for the human to tell the computer than for the computer to introspect things. Take, for example, ndisinct as data grows large. I, the database designer, may know (or simply see) that a certain foreign key column will have roughly a certain cardinality regardless of how big the table gets. It's a lot more efficient for me to tell the system that up front then have it need to do a full table scan or tens of millions of rows periodically to figure it out, or worse--as it is currently--to come up with an estimate that is multiple orders of magnitude off, even with the stats target turned all the way up. I realize that this is a case that is possible to do manually now, sort of. I can tweak the stats table myself. But it would be nice if you could do it in such a way that it would override what analyze comes up with on a case-by-case basis. We could have a perfect query planner, but feed it bad stats and it will still make poor decisions. I'm of the strong opinion that hinting the data is much better than hinting the queries. There tends to be many fewer places you need to do that, and new queries can automatically take advantage. I like the suggestion, though, that there be ways to codify known relationships in the system in such a way that the optimizer can learn to use that information. _That_ seems to me to be a big improvement, because it can be taken into consideration along with relationships that emerge from the statistics, that the DBA may not know about. I'm all for things the computer can do for me automagically. It's just good to have the ability to tell the computer about things you know about the data that it either can't efficiently figure out or can't figure out at all. -Casey ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] ./configure argument checking
Bruce Momjian wrote: > Tom Lane wrote: >> Bruce Momjian <[EMAIL PROTECTED]> writes: >> > Martijn van Oosterhout wrote: >> >> IIRC it was made a non-fatal warning somewhere near the end of the >> >> output, but I'm not sure... >> >> > It spits out this line just before it creates its output files: >> >*** Option ignored: --with-lkjasdf >> >> Of course, since it spits out pages and pages of normally-useless >> trivia, >> we've all become conditioned to ignore configure's output as long as it >> doesn't actually fail :-( >> >> Not sure what to do about that --- I doubt that raising this warning to >> error would be a good idea, seeing how firmly the upstream developers >> believe it shouldn't even be a warning. Is there any sort of "quiet >> mode" possible that would report only warnings? Would it be a good idea >> if it were possible? > > I think one idea is a "pedantic" mode that fails if an unrecognized > option is supplied. > I do not see any point at all in a special mode. If you know enough to want to use it you should be able to protect yourself more directly from needing it, simply by taking care to use correct switches. Frankly, I'd let sleeping dogs lie, in this case. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: Hints (Was: [HACKERS] Index Tuning Features)
Casey Duncan <[EMAIL PROTECTED]> writes: > Yes, but it may be much more efficient for the human to tell the > computer than for the computer to introspect things. Take, for > example, ndisinct as data grows large. Yeah, an override estimate for a column's ndistinct seems a perfect example of the sort of statistical hint that I'd be in favor of having. We have also talked about solving the multi-column statistics problem (which, at its core, is "which combinations of columns are worth accumulating stats for?" --- you can't possibly store stats for every combination!) by having what would amount to hints from the DBA saying "keep stats for these combinations". regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] array_accum aggregate
Stephen Frost <[EMAIL PROTECTED]> writes: > Another alternative would be to provide a seperate area for each > aggregate to put any other information it needs. I'm not convinced that that's necessary --- the cases we have at hand suggest that the transition function is perfectly capable of doing the storage management it wants. The problem is how to declare to CREATE AGGREGATE that we're using a transition function of this kind rather than the "stupid" functions it expects. When the function is doing its own storage management, we'd really rather that nodeAgg.c stayed out of the way and didn't try to do any datum copying at all; having it copy a placeholder bytea or anyarray or whatever is really a waste of cycles, not to mention obscuring what is going on. If nodeAgg just provided a pass-by-value Datum, which the transition function could use to store a pointer to storage it's handling, things would be a lot cleaner. After a little bit of thought I'm tempted to propose that we handle this by inventing a new pseudotype called something like "aggregate_state", which'd be declared in the catalogs as pass-by-value, thereby suppressing useless copying activity in nodeAgg.c. You'd declare the aggregate as having stype = aggregate_state, and the transition function would have signature sfunc(aggregate_state, ... aggregate-input-type(s) ...) returns aggregate_state and the final function of course ffunc(aggregate_state) returns aggregate-result-type aggregate_state would have no other uses in the system, and its input and output functions would raise an error, so type safety is assured --- there would be no way to call either the sfunc or ffunc "manually", except by passing a NULL value, which should be safe because that's what they'd expect as the aggregate initial condition. One advantage of doing it this way is that the planner could be taught to recognize aggregates with stype = aggregate_state specially, and make allowance for the fact that they'll use more workspace than meets the eye. If we don't have something like this then the planner is likely to try to use hash aggregation in scenarios where it'd be absolutely fatal to do so. I'm not sure whether we'd want to completely forbid hash aggregation when any stype = aggregate_state is present, but for sure we want to assume that there's some pretty large amount of per-aggregate state we don't know about. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] array_accum aggregate
I wrote: > aggregate_state would have no other uses in the system, and its input > and output functions would raise an error, so type safety is assured > --- there would be no way to call either the sfunc or ffunc "manually", > except by passing a NULL value, which should be safe because that's what > they'd expect as the aggregate initial condition. Um, no, I take that back, unless you want to invent a separate pseudotype for each such aggregate. Otherwise you can crash it with my_ffunc(your_sfunc(null, whatever)) because my_ffunc will be expecting a datastructure different from what it gets. Maybe having a check for AggState call context is enough of a defense for that, but I'm not really satisfied. Back to the drawing board ... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: Hints (Was: [HACKERS] Index Tuning Features)
Tom Lane wrote: > We have also talked about solving the multi-column statistics problem > (which, at its core, is "which combinations of columns are worth > accumulating stats for?" --- you can't possibly store stats for every > combination!) by having what would amount to hints from the DBA saying > "keep stats for these combinations". > This strikes me intuitively as the most likely candidate so far for improvement. I'm much more interested in schemes that will improve the stats system, rather than providing a way around it. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PERFORM] Hints proposal
Quoth [EMAIL PROTECTED] (Jeff Davis): > On Thu, 2006-10-12 at 17:28 -0400, Tom Lane wrote: >> [ trying once again to push this thread over to -hackers where it belongs ] >> >> Arjen van der Meijden <[EMAIL PROTECTED]> writes: >> > On 12-10-2006 21:07 Jeff Davis wrote: >> >> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote: >> >> To formalize the proposal a litte, you could have syntax like: >> >> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint; >> >> >> >> Where "some_hint" would be a hinting language perhaps like >> >> Jim's, except not guaranteed to be compatible between versions >> >> of PostgreSQL. The developers could change the hinting language >> >> at every release and people can just re-write the hints without >> >> changing their application. >> >> Do you have any idea how much push-back there would be to that? In >> practice we'd be bound by backwards-compatibility concerns for the >> hints too. > > No, I don't have any idea, except that it would be less push-back > than changing a language that's embedded in client code. Also, I see > no reason to think that a hint would not be obsolete upon a new > release anyway. I see *plenty* of reason. 1. Suppose the scenario where Hint h was useful hasn't been affected by *any* changes in how the query planner works in the new version, it *obviously* continues to be necessary. 2. If Version n+0.1 hasn't resolved all/most cases where Hint h was useful in Version n, then people will entirely reasonably expect for Hint h to continue to be in effect in version n+0.1 3. Suppose support for Hint h is introduced in PostgreSQL version n, and an optimization that makes it obsolete does not arrive until version n+0.3, which is quite possible. That hint has been carried forward for 2 versions already, long enough for client code that contains it to start to ossify. (After all, if developers get promoted to new projects every couple of years, two versions is plenty of time for the original programmer to be gone...) That's not just one good reason, but three. >> The problems that you are seeing all come from the insistence that a >> hint should be textually associated with a query. Using a regex is a >> little better than putting it right into the query, but the only thing > > "Little better" is all I was going for. I was just making the > observation that we can separate two concepts: > (1) Embedding code in the client's queries, which I see as very > undesirable and unnecessary > (2) Providing very specific hints > > which at least gives us a place to talk about the debate more > reasonably. It seems to me that there is a *LOT* of merit in trying to find alternatives to embedding code into client queries, to be sure. >> that really fixes is not having the hints directly embedded into >> client-side code. It's still wrong at the conceptual level. > > I won't disagree with that. I will just say it's no more wrong than > applying the same concept in addition to embedding the hints in client > queries. > >> The right way to think about it is to ask why is the planner not >> picking the right plan to start with --- is it missing a >> statistical correlation, or are its cost parameters wrong for a >> specific case, or is it perhaps unable to generate the desired plan >> at all? (If the latter, no amount of hinting is going to help.) >> If it's a statistics or costing problem, I think the right thing is >> to try to fix it with hints at that level. You're much more likely >> to fix the behavior across a class of queries than you will be with >> a hint textually matched to a specific query. > > Agreed. That's definitely a useful way to look at the issue, which seems to be lacking in many of the cries for hints. Perhaps I'm being unfair, but it often seems that people demanding hinting systems are uninterested in why the planner is getting things wrong. Yes, they have an immediate problem (namely the wrong plan that is getting generated) that they want to resolve. But I'm not sure that you can get anything out of hinting without coming close to answering "why the planner got it wrong." -- "cbbrowne","@","gmail.com" http://linuxfinances.info/info/lsf.html "Optimization hinders evolution." -- Alan Perlis ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Fwd: pg_dump VS alter database ... set search_path ...
Tom, Can you please suggest a good practice how to propagate such DB settings into dumps? I also suffer from this: my DB currently have 5 schemas and application strongly depends on the search_path. I cannot dump whole cluster, I need only 1 specific database. At this moment I use ugly solution and store search_path setting as per-user settings in my secondary databases. Solution of Nikolay, being improved for backward compatibility (additional switch for pg_dump to include alter database statements with these settings into sql dump generated) would fit me perfectly. But unfortunately you're not constructive in your critics here and do not propose a way to solve the problem, only saying that this (very useful and awaited option!) is ugly. With approach like this the community will wait for the solution for ages. :-( On 10/9/06, Tom Lane <[EMAIL PROTECTED]> wrote: "Nikolay Samokhvalov" <[EMAIL PROTECTED]> writes: > What is the reason to not include database settings (like search_path) > to database dump created with "pg_dump -C"? Duplication of code and functionality with pg_dumpall. I'd want to see some thought about how to resolve that, not just a quick copy-some-code- from-pg_dumpall-into-pg_dump. You also need to explain why this issue should be treated differently from users and groups ... a dump won't restore correctly without that supporting context either. I have no objection to rethinking the division of labor between the two programs, but let's end up with something that's cleaner not uglier. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] Documentation fix for --with-ldap
>> Actually the patch was previously rejected. > > Oh? Sorry, I must have missed that. On what grounds was it rejected? Because it was decided that hyperlinks are undesirable in this place. Instead, a simpler version of the patch was applied. See http://archives.postgresql.org/pgsql-hackers/2006-09/msg00383.php Yours, and thanks, Laurenz Albe ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly