On Aug 15, 2006, at 10:40 AM, Jim C. Nasby wrote:
On Mon, Aug 14, 2006 at 11:41:29PM +0300, Hannu Krosing wrote:
??hel kenal p??eval, E, 2006-08-14 kell 18:21, kirjutas Peter
I thought, from watching the list for a while, that the planner
statistics needed were known but that how to gather the statistics
I think over the course of the discussion we have figured out
would like to have cross-column correlation statistics. The precise
mathematical incarnation hasn't been determined yet, as far as I can
see. Collecting the statistics thereafter isn't that hard, but
needs to be a way to not collect an exponential volume of
all column combinations.
I understood that the proposal was to collect only the stats where
needed (determined by user/dba) and use some rule-of-thumb values
collected stats were available.
Yeah, unless someone comes up with some kind of 'magic', I think
to handle every cross-column possibility is a non-starter. IIRC, that
argument is what's stalled cross-column stats every time in the
makes a lot more sense to allow defining what combinations of
need stats for.
After that's done, it'd be easy to then write a script that will tell
the database to collect stats on all multi-column indexes, RI, etc.
the road, the planner could even be made to log (in a machine-readable
format) every time it needs cross-column stats, and that data could be
used to add stats that are needed.
If we're talking about my random neuron firing then I think the
responses have gone off
a bit. My thought was to just tell the planner the statistics that
are of interest.
An example of what I'm thinking would probably be helpful. Let's say
DBA knows, through whatever means at his/her disposal (heck! the
magic you mention)
that column a & column b have some sort of correlation that the
planner can't determine
on its own but can use if it had it. The DBA therefore pokes the
right information into
the planner's statistical tables (or, perhaps, a more human-
manageable one that gets
"compiled" into the planner's stats).
For this to work we'd have to
1. Define the types of statistics that the planner could use in its
it cannot currently (or ever) collect itself. Cross-column
join selectivity, anything that would be useful to the planner.
2. Create a table or other data structure to contain this planner
the planner to use this information.
3. Document what these stats are, and the influence they have in a
for use by DBAs, and how to add the stats to the above table. Mere
mortals can tinker
with this feature at their own peril :-)
Now, when a DBA has information that could steer the planner in the
he/she has a mechanism to do so that does not involve hinting the
specific query. My
hope would be that this information wouldn't go stale as fast as a
query hint would.
Furthermore, the DBA can improve an application's performance without
having to go
into every query it executes.
The planner would look in that table and say Ah! there's information
in here that says
that when a is joined to be it's going to eliminate 90% of my I/O.
Seems to me that such a feature would be a cool knob and address most/
all of the need for
One other possibility for the above information would be just to have
a place for
the planner to save information for itself when it finds a plan to be
over-optimistic or pessimistic.
Hope this blathering makes some kind of sense...
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly