Marc Cousin <cousinm...@gmail.com> writes: > On 27/02/2015 15:08, Tom Lane wrote: >> That's right, and as you say, the planning-speed consequences of doing >> otherwise would be disastrous.
> What do you mean by disastrous ? Let's assume you have ten fact tables, so ten join conditions (11 tables altogether). As-is, the planner considers ten 2-way joins (all between the fact table and one or another dimension table). At the level of 3-way joins, there are 45 possible join relations each of which has just 2 possible sets of input relations. At the level of 4-way joins, there are 120 join relations to consider each of which can be made in only 3 ways. And so on. It's combinatorial but the growth rate is manageable. On the other hand, if we lobotomize the must-have-a-join-condition filter, there are 11 choose 2 possible 2-way joins, or 55. At the level of 3-way joins, there are 165 possible joins, but each of these can be made in 3 different ways from a 2-way join and a singleton. At the level of 4-way joins, there are 330 possible joins, but each of these can be made in 4 ways from a 3-way join and a singleton plus 6 different ways from two 2-way joins. So at this level we're considering something like 3300 different join paths versus 360 in the existing planner. It gets rapidly worse. The real problem is that in most cases, all those extra paths are utterly useless. They would not be less useless just because you're a "data warehouse" or whatever. So I'm uninterested in simply lobotomizing that filter. What would make more sense is to notice that the fact table has an index that's amenable to this type of optimization, and then use that to loosen up the join restrictions, rather than just blindly consider useless joins. I had actually thought that we'd fixed this type of problem in recent versions, and that you should be able to get a plan that would look like Nestloop -> scan dim1 -> Nestloop -> scan dim2 -> indexscan fact table using dim1.a and dim2.b which would arise from developing an indexscan on fact that's parameterized by both other tables, resolving one of those parameterizations via a join to dim2, and then the other one via a join to dim1. I'm not sure offhand why that isn't working in this example. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers