Hi Gurjeet,
I include pgsql-hackers in this discussion ...
Am 01.11.2006 um 17:38 schrieb Gurjeet Singh:
Hi Kai,
I am working with Simon at EnterpriseDB, and am currently
working on porting
your patch to 8.2 sources. I have done a quick hack to make it work
on 8.2;
please find the modified patch attached (remember it's a quick-n-
dirty hack; it
still needs a cleanup).
The only changes (as yet) that I have done are:
(1) Changed the code according to the change in the linked-
list (List*)
handling across 7.4 and 8.2.
(2) Added support for BitmapAnd, BitmapOr, BitmapHeapScan and
BitmapIndexScan plan-nodes in scan_plan().
The outstanding issues, as of now, as I see them, are:
(1) There are left-over dependencies in pg_depends, that
stops the
table-being-analyzed from getting dropped (probably it'll affect many
other DDLs, I haven't tested though). I am investigating this.
(2) The intermediate indexes that are created are 'real'
indexes, in the
sense that they are actiually created on the disk before
planning/analyzing and are then dropped. I assume this since, you are
calling index_create() which in turn calls index_build(), which in
turn
calls the index's Access-method's build method, which, I assume, will
create the index on disk. Please point out if this is not the case.
You are right - at least an empty index is created. I'm not sure if
the index appears on disk, but this was the easiest way in 7.4 to do
this. However, I agree - it has performance drawbacks and raises
concurreny issues.
This, real-index build, can be counter productive where the
underlying
table is huge. We were thinking of creating actual 'virtual'
indexes, by
utilizing the 'skip_build' parameter of index_create() (introduced in
index.c:1.229). But that would entail calculating the number-of-
pages, average
record-size, and probably some more stats, on our own. And then
putting these
values in the catalog.
Actually, we did exactly this - if you have only an empty index then
you have to estimate this values.
I see that you have declred following two functions in
src/include/catalog/index.h:
extern int2 get_attrType(Oid attrTypId, int2 attrSize, int4
attrTypMod);
extern int4 estimate_indexPages(Relation heapRelation,
IndexInfo *indexInfo);
Looking at these, I suppose that you also worked on some such
calculations.
If still with you, can you share this code with us?
It should be part of the patch - but let me check this.
(3) (If you've lost track, this is the third in the list of
outstanding
issues :). I am concerned about the visibility of the virtual
indexes. If
these indexes are immediately visible to other sessions, then there
is a
strong possibilty that other backends that are in the planning
stage of a
non-explain query, will pickup these indexes and develop their plan
and send
for execution. And you know what hell will break loose if that
happens;
there won't be any data in these indexes!!
Right - that's what I meant above by concurrency issues. Honestly, we
had not enough knowledge about pgsql at this time to do this. I
suppose the right way would be to add a session or transaction id to
the virtual index and let the planner use only virtual indexes from
the same session as the query.
One more thing, we are looking at ways to make it easier for
others too, to
develop their own advisors. So we are looking at the possibility of
making
it plugin based arch, similar to how edb-debugger for pl/pgsql is
being
developed. This will allow others to develop and use their own
advisors for
various Select/DML statements, in the least invasive way.
That's a great idea - it could be helpful for other kind of virtual
objects too, e.g. materialized views, partitions etc. There are
several exits for plugins: the set off indexes which should be
created virtually, the profit assignment to the individual indexes as
well as the way the recommendation is used. For example, we have the
prototype of an online advisor which collects the recommendations
continuously and tries to adapt the current set of real indexes (at
least as an alerter for the DBA).
Lastly, and most importantly, can we move this discussion to
pgsql-hackers?
Done.
So, let me know if there is anything that I can do.
Best regards,
-- [EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail |
yahoo }.com
Best,
Kai
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings