Re: [HACKERS] Implementing a new Join Algorithm

2003-01-05 Thread Sailesh Krishnamurthy
> "Tom" == Tom Lane <[EMAIL PROTECTED]> writes:

Tom> Anagh Lal <[EMAIL PROTECTED]> writes:
>> I am trying to test a new join algorithm by implementing it on
>> Postgresql.  It would be great if you could give me some start
>> off pointers so as to where all in the source code I will have
>> to make changes.

Tom> Lots of places ;-).

Tom> You will find that a full-text indexing tool is an essential
Tom> aid for working with the Postgres source code.  I am partial

I've had great success with cscope. Especially with the XEmacs
interfaces. 

I think in fact the easy part is the executor stuff because it's
nicely localized. Getting the planner to choose a new option is a
little messier. 

As part of the TelegraphCQ project we've implemented what smells like
a symmetric hash join. Actually it's more like an N-way symmetric hash
join. This is done using the new SteM operator which can work in a
classical iterator model plan. With the Eddy operator we can avoid
static dataflows.

Since we were a bit chary of jumping headlong into the optimizer, we
"cheated". What we do is let the postgres optimizer do its thing and
produce a plan with Hash Joins. Then we walk the plan and replace Hash
Join nodes with our new SteM nodes. We call our conversion code in
pg_exec_query_string() right after pg_plan_query()

It's more complicated than that because we're actually implementing
the Eddy operator. But if your goal is just to try out your nice new
join algorithm, this would probably work and be a quick fix to get you
started.

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Implementing a new Join Algorithm

2003-01-05 Thread Tom Lane
Anagh Lal <[EMAIL PROTECTED]> writes:
> I am trying to test a new join algorithm by
> implementing it on Postgresql. 
> It would be great if you could give me some start off
> pointers so as to where all in the source code I will
> have to make changes.

Lots of places ;-).

You will find that a full-text indexing tool is an essential aid for
working with the Postgres source code.  I am partial to "glimpse" but
others use other things (searching for "glimpse" in the pghackers list
archives should turn up previous discussions of useful tools).

Once you've got one, looking for files mentioning both "merge" and
"hash" should locate all the incidental places you will need to hit.

The primary work will certainly be in adding an executor/nodeNewjoin.c
file and in teaching the planner how to plan the join type.  I think
most of the incidental work will be associated with creating new plan,
executor-state, and path node types and adding support code for these
where needed.  This should be largely boilerplate that you can produce
by emulating the existing join types.

Frankly I'd consider a new join type to be an overly ambitious target
for a person's first venture into the Postgres backend.  You'd probably
find it a good idea to tackle some smaller project(s) first ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster