On Sun, Mar 28, 2010 at 12:19 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Robert Haas <robertmh...@gmail.com> writes: >> On Sat, Mar 27, 2010 at 4:11 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: >>> I'm not seeing how that would occur or would matter, but the worst case >>> answer is to restart the scan of the SpecialJoinInfos from scratch any >>> time you succeed in doing a join removal. > >> Well, say you have something like > >> SELECT 1 FROM A LEFT JOIN (B LEFT JOIN C ON Pbc) ON Pab > >> I think that the SpecialJoinInfo structure for the join between B and >> C will match the criteria I articulated upthread, but the one for the >> join between A and {B C} will not. If C had not been in the query >> from the begining then we'd have had: > >> SELECT 1 FROM A LEFT JOIN B ON Pab > >> ...under which circumstances the SpecialJoinInfo would match the >> aforementioned criteria. > > I experimented with this and found that you're correct: the tests on the > different SpecialJoinInfos do interact, which I hadn't believed > initially. The reason for this is that when we find out we can remove a > particular rel, we have to remove the bits for it in other relations' > attr_needed bitmaps. In the above example, we first discover we can > remove C. Whatever B vars were used in Pbc will have an attr_needed > set of {B,C}, and that C bit will prevent us from deciding that B can > be removed when we are examining the upper SpecialJoinInfo (which will > not consider C to be part of either min_lefthand or min_righthand). > So we have to remove the C bits when we remove C. > > Attached is an extremely quick-and-dirty, inadequately commented draft > patch that does it along the lines you are suggesting. This was just to > see if I could get it to work at all; it's not meant for application in > anything like its current state. However, I feel a very strong > temptation to finish it up and apply it before we enter beta. As you > noted, this way is a lot cheaper than the original coding, whether one > focuses on the cost of failing cases or the cost when the optimization > is successful. And if we hold it off till 9.1, then any bug fixes that > have to be made in the area later will need to be made against two > significantly different implementations, which will be a real PITA. > > Things that would need to be cleaned up: > > * I left join_is_removable where it was, mainly so that it was easy to > compare how much it changed for this usage (not a lot). I'm not sure > that joinpath.c is an appropriate place for it anymore, though I can't > see any obviously better place either. Any thoughts on that?
I dislike the idea of leaving it in joinpath.c. I don't even think it properly belongs in the path subdirectory since it no longer has anything to do with paths. Also worth thinking about where we would put the logic I pontificated about here: http://archives.postgresql.org/pgsql-hackers/2009-10/msg01012.php > * The removed relation has to be taken out of the set of baserels > somehow, else for example the Assert in make_one_rel will fail. > The current hack is to change its reloptkind to RELOPT_OTHER_MEMBER_REL, > which I think is a bit unclean. We could try deleting it from the > simple_rel_array altogether, but I'm worried that that could result in > dangling-pointer failures, since we're probably not going to go to the > trouble of removing every single reference to the rel from the planner > data structures. A possible compromise is to invent another reloptkind > value that is only used for "dead" relations. +1 for dead relation type. > * It would be good to not count the removed relation in > root->total_table_pages. If we made either of the changes suggested > above then we could move the calculation of total_table_pages down to > after remove_useless_joins and ignore the removed relation(s) > appropriately. Otherwise I'm tempted to just subtract off the relation > size from total_table_pages on-the-fly when we remove it. Sounds good. > * I'm not sure yet about the adjustment of PlaceHolder bitmaps --- we > might need to break fix_placeholder_eval_levels into two steps to get > it right. Not familiar enough with that code to comment. > * Still need to reverse out the now-dead code from the original patch, > in particular the NoOpPath support. Yeah. > Thoughts? I'm alarmed by your follow-on statement that the current code can't handle the two-levels of removable join case. Seems like it ought to form {B C} as a path over {B} and then {A B C} as a path over {A}. Given that it doesn't, we already have a fairly serious bug, so we've either got to put more work into the old implementation or switch to this new one - and I think at this point you and I are both fairly convinced that this is a better way going forward. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers