Re: [HACKERS] Projection while performing joins.

2003-02-11 Thread Tom Lane
Anagh Lal <[EMAIL PROTECTED]> writes:
> ... I am still confused by the following:
> In /backend/executor/nodeMergeJoin.c
> in ExecMergeJoin()
> In the state (the switch case)  EXEC_MJ_JOINTUPLES
> we still do ExecProject(), what does this do?

Well, sure.  A join node *must* do a projection, no?  It can't simply
return either the left or the right input tuple (except in the
vanishingly-small fraction of cases where you don't actually need any
columns from the right or the left respectively; which are cases that
we don't currently bother to optimize).  To create a tuple that's not
exactly one or the other you must project.

> Some food for thought,
> Let's ignore the attributes listed in the select
> clause
> and work only with the where clause (join condition)
> attributes. And as a back reference store the 
> tupleid of the original whole tuple in the "working"
> tuple. At the final output stage perform a lookup to
> retrieve the select clause attributes of only the
> qualifying tuple. Thus enabling us to work with really
> small sized data. 
> worth trying out?

Not sure there's a lot of traction here.  In many cases, the
bottom-level scan gets advanced one or more rows before the top-level
nodes can pop out a result.  (Consider GROUP BY as an example.)
I don't see the advantage of adding bookkeeping/copying for past
rows in order to avoid copying current-row data around.

But feel free to prove me wrong ;-)

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Projection while performing joins.

2003-02-11 Thread Anagh Lal
Hi,
Two parts to the mail
1)
> We already do --- the scan nodes project out only
> the needed columns.
ok..thats great.
I tried looking for what you are saying in the source
code... [before and after doing a cvs update].. but I
am still confused by the following:
In /backend/executor/nodeMergeJoin.c
in ExecMergeJoin()
In the state (the switch case)  EXEC_MJ_JOINTUPLES
we still do ExecProject(), what does this do?
(this is what prompted my incorrect assumption in the
first place).

I was able to trace the ExecProject() which takes
place (due to the SeqScan node) in the 
ExecScan function of /executor/execScan.c 
(I made an error in assuming the tuple returned
by the function passed as an argument to ExecScan was
being returned directly )...
so now i see what you were saying in the mail but
am not clear about why we have a ExecProject() higher
up in the join nodes..its there in the NestedLoop node
too. 

part 2)
>  As you indicate, that's probably a net loss
> when there's a Sort node directly above the scan
> node.  Needs more thought...

Some food for thought,
Let's ignore the attributes listed in the select
clause
and work only with the where clause (join condition)
attributes. And as a back reference store the 
tupleid of the original whole tuple in the "working"
tuple. At the final output stage perform a lookup to
retrieve the select clause attributes of only the
qualifying tuple. Thus enabling us to work with really
small sized data. 
worth trying out?

Thanks
Anagh Lal







__
Do you Yahoo!?
Yahoo! Shopping - Send Flowers for Valentine's Day
http://shopping.yahoo.com

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Projection while performing joins.

2003-02-10 Thread Tom Lane
Anagh Lal <[EMAIL PROTECTED]> writes:
>  Why do we not just store the attributes required in
> the join (i.e. those in the join qual conditions and
> the ones in the select list) and then perform sorting
> and retrieval on these tuples rather than on the
> possibly larger tuples with more attributes which we
> project out in the end anyway.

We already do --- the scan nodes project out only the needed columns.

At least, that's true in released versions.  Just a few days ago I put
a hack into CVS tip to skip the projection step for a scan node that's
not topmost in the plan.  As you indicate, that's probably a net loss
when there's a Sort node directly above the scan node.  Needs more
thought...

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]