Re: [HACKERS] Projection while performing joins.
Anagh Lal <[EMAIL PROTECTED]> writes: > ... I am still confused by the following: > In /backend/executor/nodeMergeJoin.c > in ExecMergeJoin() > In the state (the switch case) EXEC_MJ_JOINTUPLES > we still do ExecProject(), what does this do? Well, sure. A join node *must* do a projection, no? It can't simply return either the left or the right input tuple (except in the vanishingly-small fraction of cases where you don't actually need any columns from the right or the left respectively; which are cases that we don't currently bother to optimize). To create a tuple that's not exactly one or the other you must project. > Some food for thought, > Let's ignore the attributes listed in the select > clause > and work only with the where clause (join condition) > attributes. And as a back reference store the > tupleid of the original whole tuple in the "working" > tuple. At the final output stage perform a lookup to > retrieve the select clause attributes of only the > qualifying tuple. Thus enabling us to work with really > small sized data. > worth trying out? Not sure there's a lot of traction here. In many cases, the bottom-level scan gets advanced one or more rows before the top-level nodes can pop out a result. (Consider GROUP BY as an example.) I don't see the advantage of adding bookkeeping/copying for past rows in order to avoid copying current-row data around. But feel free to prove me wrong ;-) regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Projection while performing joins.
Hi, Two parts to the mail 1) > We already do --- the scan nodes project out only > the needed columns. ok..thats great. I tried looking for what you are saying in the source code... [before and after doing a cvs update].. but I am still confused by the following: In /backend/executor/nodeMergeJoin.c in ExecMergeJoin() In the state (the switch case) EXEC_MJ_JOINTUPLES we still do ExecProject(), what does this do? (this is what prompted my incorrect assumption in the first place). I was able to trace the ExecProject() which takes place (due to the SeqScan node) in the ExecScan function of /executor/execScan.c (I made an error in assuming the tuple returned by the function passed as an argument to ExecScan was being returned directly )... so now i see what you were saying in the mail but am not clear about why we have a ExecProject() higher up in the join nodes..its there in the NestedLoop node too. part 2) > As you indicate, that's probably a net loss > when there's a Sort node directly above the scan > node. Needs more thought... Some food for thought, Let's ignore the attributes listed in the select clause and work only with the where clause (join condition) attributes. And as a back reference store the tupleid of the original whole tuple in the "working" tuple. At the final output stage perform a lookup to retrieve the select clause attributes of only the qualifying tuple. Thus enabling us to work with really small sized data. worth trying out? Thanks Anagh Lal __ Do you Yahoo!? Yahoo! Shopping - Send Flowers for Valentine's Day http://shopping.yahoo.com ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Projection while performing joins.
Anagh Lal <[EMAIL PROTECTED]> writes: > Why do we not just store the attributes required in > the join (i.e. those in the join qual conditions and > the ones in the select list) and then perform sorting > and retrieval on these tuples rather than on the > possibly larger tuples with more attributes which we > project out in the end anyway. We already do --- the scan nodes project out only the needed columns. At least, that's true in released versions. Just a few days ago I put a hack into CVS tip to skip the projection step for a scan node that's not topmost in the plan. As you indicate, that's probably a net loss when there's a Sort node directly above the scan node. Needs more thought... regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]