There are certainly problems with this idea. Figuring out the cost of either creating an index or just doing a table scan would be a pain, but not impossible. The cost of building an in-memory index on a table, I suspect, would not be that bad compared to doing hundreds to thousands of loops with table scans.

I also see room for more query planner options in the config file to control ad hoc indexes.

Now to Tom's points. The point of an ad hoc index is that it is only alive for the period of time the query is running, and it lives only in memory. Once the query completes, it dies a horrible death.

These temporary indexes would never take a table lock or a row lock; they only need to read the records to build the index. So the same problems that affect 100 table scans would affect creating a temp index. I think this would help concurrency, since the table is not being scanned thousands of times.

The idea of creating one-time-use indexes has been around for a long time and has been shown to be a benefit when dealing with large data sets where a query is run once or rarely, and it is too much labor and cost to figure out how to make it faster. This would also reduce disk IO a lot if the table can't fit in memory but the index can, because it is relatively small in comparison.

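To make the scan-versus-index cost argument concrete, here is a minimal Python sketch (hypothetical data, not PostgreSQL internals): N lookups done by rescanning cost on the order of N times the table size, while one build pass over a throwaway in-memory index drops each subsequent probe to roughly constant time.

```python
# Hypothetical illustration: probing a table N times by full scans
# versus building a throwaway in-memory index once and probing it.

def repeated_scans(table, keys, col):
    # One full scan per lookup: O(len(keys) * len(table)) row visits.
    return [[row for row in table if row[col] == k] for k in keys]

def adhoc_index_probe(table, keys, col):
    # One scan to build a hash index, then O(1) probes:
    # O(len(table) + len(keys)) row visits overall.
    index = {}
    for row in table:
        index.setdefault(row[col], []).append(row)
    return [index.get(k, []) for k in keys]

table = [{"id": i, "val": i % 10} for i in range(1000)]
keys = [3, 7, 7, 2] * 250  # a thousand probes

# Both strategies return the same rows; only the cost differs.
assert repeated_scans(table, keys, "val") == adhoc_index_probe(table, keys, "val")
```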
Tom Lane wrote:
"Stephen Denne" <[EMAIL PROTECTED]> writes:
The improvement wasn't to the part of the query that had the bad cost estimate, 
it was to the part that was being performed hundreds of times instead of the 
one time the planner estimated.

The planner still thought it was only going to perform a sequential scan of 
your wooper table once. So even if there had been any Ad Hoc Index creation 
code that had been used to consider creating indexes as part of a plan cost 
estimate, it wouldn't have bothered creating any indexes on wooper.

Right.  And even more to the point, if it had gotten the estimate right
and known that the subquery would have been repeated, that would have
(ideally) prompted it to shift to a different plan structure.

As Peter pointed out upthread, the existing hash join logic seems to be
a pretty decent facsimile of an "ad hoc index" --- in fact, the hash
table *is* a hash index for all intents and purposes.  If you tilt your
head at the right angle, a merge join could be seen as comparable to
constructing a couple of ad-hoc btree indexes.  Plus the join code is
not burdened by any overhead that actual index code would be likely to
have, such as niggling worries about crash-safety or concurrent access
to the index.
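Tom's observation that the hash join's hash table effectively *is* an ad hoc hash index can be sketched like this (a toy build/probe hash join in Python, not the actual executor code): the build phase constructs a transient index on the inner relation's join key, and the probe phase does index lookups instead of rescanning the inner side.

```python
# Toy hash join: the build phase creates what is, for all intents
# and purposes, a throwaway hash index on the inner join key; the
# probe phase then looks rows up instead of rescanning the inner side.

def hash_join(outer, inner, outer_key, inner_key):
    # Build: one pass over the inner relation.
    hashed = {}
    for row in inner:
        hashed.setdefault(row[inner_key], []).append(row)
    # Probe: one pass over the outer relation, O(1) lookup per row.
    result = []
    for orow in outer:
        for irow in hashed.get(orow[outer_key], []):
            result.append({**orow, **irow})
    return result

orders = [{"order_id": 1, "cust": "a"}, {"order_id": 2, "cust": "b"}]
custs = [{"cust": "a", "name": "Alice"}, {"cust": "b", "name": "Bob"}]
joined = hash_join(orders, custs, "cust", "cust")
assert len(joined) == 2
```

Unlike a real index, this structure needs no crash-safety or concurrent-access handling, which is exactly the overhead Tom notes the join code avoids.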

So in my mind the issue here is not why don't we have ad hoc indexes,
it's why the planner didn't choose a more appropriate join method.
It's clear that faulty rowcount estimation was one part of the answer.
Now it may also be that there are some outright structural limitations
involved due to the "join" arising from a sub-select --- I'm not sure
that the planner *can* generate a bulk join plan from a query expressed
like this one.  But the TODO items that this suggests to me do not
include implementing anything I'd call an ad-hoc index.

                        regards, tom lane
