Re: [HACKERS] An Idea for planner hints

2006-08-26 Thread Hayes
On Aug 17, 2006, at 1:41 PM, Peter Eisentraut wrote: But we need to work this from the other end anyway. We need to determine first, what sort of statistics the planner could make use of. Then we can figure out the difficulties in collecting them. There are still some things that the

Re: [HACKERS] An Idea for planner hints

2006-08-24 Thread Mark Dilger
Is there actually evidence that there's a lot of problems with bad join orders? ISTM that's one of the areas where the planner actually does a pretty good job. I put together a quick demonstration using AxBxC where AxB is empty but AxC is not. Sure enough, postgres chooses AxC first, then xB,

Re: [HACKERS] An Idea for planner hints

2006-08-23 Thread Jim C. Nasby
On Tue, Aug 22, 2006 at 11:56:17AM -0700, Mark Dilger wrote: I proposed something like this quite a bit up-thread. I was hoping we could have a mode in which the system would run the second, third, fourth, ... best plans rather than just the best looking one, and then determine from actual

Re: [HACKERS] An Idea for planner hints

2006-08-23 Thread Mark Dilger
Jim C. Nasby wrote: On Tue, Aug 22, 2006 at 11:56:17AM -0700, Mark Dilger wrote: I proposed something like this quite a bit up-thread. I was hoping we could have a mode in which the system would run the second, third, fourth, ... best plans rather than just the best looking one, and then

Re: [HACKERS] An Idea for planner hints

2006-08-22 Thread Mark Dilger
Peter Eisentraut wrote: Jim C. Nasby wrote: Meet EXPLAIN ANALYZE. Which does no good for apps that you don't control the code on. Even if you do control the code, you have to find a way to stick EXPLAIN ANALYZE in front of every query, and figure out how to deal with what's comming back.

Re: [HACKERS] An Idea for planner hints

2006-08-17 Thread Peter Eisentraut
Gregory Stark wrote: I'm not sure it's worth throwing out the more user-friendly interface we have now but I think it's clear that a table is the obvious machine-readable format if you're already sitting in an SQL database... :) Then again, a table might not be the optimal format for an

Re: [HACKERS] An Idea for planner hints

2006-08-17 Thread Greg Stark
Peter Eisentraut [EMAIL PROTECTED] writes: Gregory Stark wrote: I'm not sure it's worth throwing out the more user-friendly interface we have now but I think it's clear that a table is the obvious machine-readable format if you're already sitting in an SQL database... :) Then again,

Re: [HACKERS] An Idea for planner hints

2006-08-17 Thread Arturo Pérez
On Aug 15, 2006, at 10:40 AM, Jim C. Nasby wrote: On Mon, Aug 14, 2006 at 11:41:29PM +0300, Hannu Krosing wrote: ??hel kenal p??eval, E, 2006-08-14 kell 18:21, kirjutas Peter Eisentraut: Perez wrote: I thought, from watching the list for a while, that the planner statistics needed were

Re: [HACKERS] An Idea for planner hints

2006-08-17 Thread Peter Eisentraut
Arturo Pérez wrote: The DBA therefore pokes the right information into the planner's statistical tables (or, perhaps, a more human- manageable one that gets compiled into the planner's stats). I think we're perfectly capable of producing a system that can collect the statistics. We just

Re: [HACKERS] An Idea for planner hints

2006-08-17 Thread Florian G. Pflug
Peter Eisentraut wrote: Arturo Pérez wrote: The DBA therefore pokes the right information into the planner's statistical tables (or, perhaps, a more human- manageable one that gets compiled into the planner's stats). I think we're perfectly capable of producing a system that can collect the

Re: [HACKERS] An Idea for planner hints

2006-08-16 Thread Gregory Stark
Jim C. Nasby [EMAIL PROTECTED] writes: On Tue, Aug 15, 2006 at 07:55:28PM +0200, Peter Eisentraut wrote: It would not be hard to create an auto explain analyze mode that implicitly runs EXPLAIN ANALYZE along with every query and logs the result. On its face, it sounds like an obviously

Re: [HACKERS] An Idea for planner hints

2006-08-16 Thread Jim C. Nasby
On Wed, Aug 16, 2006 at 06:48:09PM -0400, Gregory Stark wrote: Jim C. Nasby [EMAIL PROTECTED] writes: On Tue, Aug 15, 2006 at 07:55:28PM +0200, Peter Eisentraut wrote: It would not be hard to create an auto explain analyze mode that implicitly runs EXPLAIN ANALYZE along with every

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread Jim C. Nasby
On Mon, Aug 14, 2006 at 11:41:29PM +0300, Hannu Krosing wrote: ??hel kenal p??eval, E, 2006-08-14 kell 18:21, kirjutas Peter Eisentraut: Perez wrote: I thought, from watching the list for a while, that the planner statistics needed were known but that how to gather the statistics was

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread AgentM
On Aug 15, 2006, at 10:40 , Jim C. Nasby wrote: Yeah, unless someone comes up with some kind of 'magic', I think trying to handle every cross-column possibility is a non-starter. IIRC, that argument is what's stalled cross-column stats every time in the past. It makes a lot more sense to

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread Peter Eisentraut
AgentM wrote: I've always found it odd that database didn't determine which statistics are the most interesting from the queries themselves. The overhead of doing that on the fly is probably prohibitive. More explicit profiling support could be helpful, but that would seem a lot more

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread AgentM
On Aug 15, 2006, at 12:26 , Peter Eisentraut wrote: AgentM wrote: I've always found it odd that database didn't determine which statistics are the most interesting from the queries themselves. The overhead of doing that on the fly is probably prohibitive. More explicit profiling support

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread Peter Eisentraut
AgentM wrote: Couldn't the session be explicitly transferred into a special analysis mode? Explain analyze could run on every query implicitly and point out time and row count discrepancies as HINTs. Multi-column joins, for example, could be pointed out and display whether or not there are

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread Peter Eisentraut
Jim C. Nasby wrote: Meet EXPLAIN ANALYZE. Which does no good for apps that you don't control the code on. Even if you do control the code, you have to find a way to stick EXPLAIN ANALYZE in front of every query, and figure out how to deal with what's comming back. It would not be hard to

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread AgentM
On Aug 15, 2006, at 13:55 , Peter Eisentraut wrote: Jim C. Nasby wrote: Meet EXPLAIN ANALYZE. Which does no good for apps that you don't control the code on. Even if you do control the code, you have to find a way to stick EXPLAIN ANALYZE in front of every query, and figure out how to deal

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread Jim C. Nasby
On Tue, Aug 15, 2006 at 07:55:28PM +0200, Peter Eisentraut wrote: Jim C. Nasby wrote: Meet EXPLAIN ANALYZE. Which does no good for apps that you don't control the code on. Even if you do control the code, you have to find a way to stick EXPLAIN ANALYZE in front of every query, and

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread Christopher Kings-Lynne
see. Collecting the statistics thereafter isn't that hard, but there needs to be a way to not collect an exponential volume of statistics on all column combinations. You could collect them on all FK relationships - is that enough? Chris ---(end of

Re: [HACKERS] An Idea for planner hints

2006-08-15 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes: see. Collecting the statistics thereafter isn't that hard, but there needs to be a way to not collect an exponential volume of statistics on all column combinations. You could collect them on all FK relationships - is that enough? As

Re: [HACKERS] An Idea for planner hints

2006-08-14 Thread Perez
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Jim C. Nasby) wrote: On Wed, Aug 09, 2006 at 08:31:42AM -0400, Perez wrote: Every once in a while people talk about collecting better statistics, correlating multi-column correlations etc. But there never seems to be a way to collect

Re: [HACKERS] An Idea for planner hints

2006-08-14 Thread Peter Eisentraut
Perez wrote: I thought, from watching the list for a while, that the planner statistics needed were known but that how to gather the statistics was not? I think over the course of the discussion we have figured out that we would like to have cross-column correlation statistics. The precise

Re: [HACKERS] An Idea for planner hints

2006-08-14 Thread Hannu Krosing
Ühel kenal päeval, E, 2006-08-14 kell 18:21, kirjutas Peter Eisentraut: Perez wrote: I thought, from watching the list for a while, that the planner statistics needed were known but that how to gather the statistics was not? I think over the course of the discussion we have figured out

Re: [HACKERS] An Idea for planner hints

2006-08-13 Thread Perez
In article [EMAIL PROTECTED], Perez [EMAIL PROTECTED] wrote: In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Tom Lane) wrote: Martijn van Oosterhout kleptog@svana.org writes: My main problem is that selectivity is the wrong measurement. What users really want to be able to

Re: [HACKERS] An Idea for planner hints

2006-08-13 Thread Jim C. Nasby
On Wed, Aug 09, 2006 at 08:31:42AM -0400, Perez wrote: Every once in a while people talk about collecting better statistics, correlating multi-column correlations etc. But there never seems to be a way to collect that data/statistics. Would it be possible to determine the additional

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Csaba Nagy
On Tue, 2006-08-08 at 22:14, Tom Lane wrote: So some kind of override for statistical guesses doesn't seem completely silly to me. But it needs to be declarative information that's stored somewhere out of view of the actual SQL queries. IMHO anyway. The real problem is that sometimes there's

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Florian G. Pflug
Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: ISTM theat the easiest way would be to introduce a sort of predicate like so: SELECT * FROM foo, bar WHERE pg_selectivity(foo.a = bar.a, 0.1); The one saving grace of Florian's proposal was that you could go hack the

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Martijn van Oosterhout
On Wed, Aug 09, 2006 at 12:57:39PM +0200, Florian G. Pflug wrote: Fixing the generic problem is surely the best _if_ there is a fix for the generic problem at all. But if your where-conditions involves fields from 10 different tables, then IMHO there is no way to _ever_ guarantee that postgres

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Tom Lane
Florian G. Pflug [EMAIL PROTECTED] writes: Image a complex, autogenerated query with looks something like this select from t1 join t2 on ... join t3 on ... join t4 on ... ... ... where big, complicated expression derived from some user input. This big, complicated expression

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Kaare Rasmussen
SELECT * FROM foo, bar WHERE pg_selectivity(foo.a = bar.a, 0.1); ISTM that you introduced the Oracle silliness again, putting the hint into the query. My suggestion would be to tell about it separately. Something like CREATE HINT FOR JOIN foo, bar ON foo.a=bar.b AS some hint; This way hints

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Florian G. Pflug
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: Image a complex, autogenerated query with looks something like this select from t1 join t2 on ... join t3 on ... join t4 on ... ... ... where big, complicated expression derived from some user input. This big, complicated

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Jim C. Nasby
On Wed, Aug 09, 2006 at 02:02:10PM +0200, Martijn van Oosterhout wrote: On Wed, Aug 09, 2006 at 12:57:39PM +0200, Florian G. Pflug wrote: Fixing the generic problem is surely the best _if_ there is a fix for the generic problem at all. But if your where-conditions involves fields from 10

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: On Wed, Aug 09, 2006 at 02:02:10PM +0200, Martijn van Oosterhout wrote: Perhaps the way to go would be to allow users to declare columns often used together and have ANALYSE collect information on correlation which can be used later... One thing that

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Mark Dilger
Jim C. Nasby wrote: Been suggested before... the problem is actually doing something useful with all that data that's collected, as well as how to collect it without greatly impacting the system. Identifying the best plan by means of actually running multiple plans and timing them is useful.

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Joshua Reich
(Not sure how we'd implement that, seeing that ANALYZE currently works on one table at a time, but it's probably doable --- and it'd fix the fundamental problem for correlation statistics, which is how not to try to collect stats about an exponential number of combinations ...) An exponential

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Martijn van Oosterhout
On Wed, Aug 09, 2006 at 03:33:21PM -0400, Joshua Reich wrote: (Not sure how we'd implement that, seeing that ANALYZE currently works on one table at a time, but it's probably doable --- and it'd fix the fundamental problem for correlation statistics, which is how not to try to collect stats

Re: [HACKERS] An Idea for planner hints

2006-08-09 Thread Perez
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Tom Lane) wrote: Martijn van Oosterhout kleptog@svana.org writes: My main problem is that selectivity is the wrong measurement. What users really want to be able to communicate is: 1. If you join tables a and b on x, the number of

Re: [HACKERS] An Idea for planner hints

2006-08-08 Thread Mark Dilger
If this feature I'm proposing already exists, sorry for the waste of bandwidth, and could someone please point me to it? :) What if there were a mode that told postgres to do an exhaustive search (or if not exhaustive, then much more extensive search) of all plans (or many plans), trying each

Re: [HACKERS] An Idea for planner hints

2006-08-08 Thread Martijn van Oosterhout
On Tue, Aug 08, 2006 at 01:55:35PM +0200, Florian G. Pflug wrote: Hi Since the discussion about how to force a specific plan has come up, I though I'd post an idea I had for this a while ago. It's not reall well though out yet, but anyway. snip Image a query like select ... from t1 join

Re: [HACKERS] An Idea for planner hints

2006-08-08 Thread Richard Huxton
Martijn van Oosterhout wrote: On Tue, Aug 08, 2006 at 01:55:35PM +0200, Florian G. Pflug wrote: Hi Since the discussion about how to force a specific plan has come up, I though I'd post an idea I had for this a while ago. It's not reall well though out yet, but anyway. snip Image a query

Re: [HACKERS] An Idea for planner hints

2006-08-08 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: ISTM theat the easiest way would be to introduce a sort of predicate like so: SELECT * FROM foo, bar WHERE pg_selectivity(foo.a = bar.a, 0.1); The one saving grace of Florian's proposal was that you could go hack the statistics *without*

Re: [HACKERS] An Idea for planner hints

2006-08-08 Thread Martijn van Oosterhout
On Tue, Aug 08, 2006 at 04:14:45PM -0400, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: ISTM theat the easiest way would be to introduce a sort of predicate like so: SELECT * FROM foo, bar WHERE pg_selectivity(foo.a = bar.a, 0.1); The one saving grace of Florian's

Re: [HACKERS] An Idea for planner hints

2006-08-08 Thread Mark Dilger
Tom Lane wrote: The thing I object to about the I want to decorate my queries with planner hints mindset is that it's coming at it from the wrong direction. You should never be thinking in terms of fix this one query, because that just leads back into the same dead end that your fix doesn't

Re: [HACKERS] An Idea for planner hints

2006-08-08 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: My main problem is that selectivity is the wrong measurement. What users really want to be able to communicate is: 1. If you join tables a and b on x, the number of resulting rows will be the number of roows selected from b (since b.x id a