Re: [PERFORM] Use my (date) index, darn it!

2004-01-05 Thread Christopher Browne
In the last exciting episode, [EMAIL PROTECTED] (John Siracusa) wrote:
> What column(s) should I increase?  Do I have to do anything after increasing
> the statistics, or do I just wait for the stats collector to do its thing?

You have to ANALYZE the table again, to force in new statistics.

And if the index in question is on _just_ the date column, then it is
probably only that date column where the "SET STATISTICS" needs to be
increased.
-- 
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://www.ntlug.org/~cbbrowne/sap.html
Faith  is the  quality that  enables you  to eat  blackberry jam  on a
picnic without looking to see whether the seeds move. -- DeMara Cabrera

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Use my (date) index, darn it!

2004-01-05 Thread John Siracusa
On 1/5/04 11:45 AM, Christopher Browne wrote:
> It sounds to me as though the statistics that are being collected
> aren't "good enough."  That tends to be a sign that the quantity of
> statistics (e.g. - bins in the histogram) are insufficient.
> 
> This would be resolved by changing the number of bins (default of 10)
> via "ALTER TABLE FOO ALTER COLUMN BAR SET STATISTICS 100" (or some
> other value higher than 10).

I did that, but I wasn't sure what value to use and what column to increase.
I believe I increased the date column itself to 50 or something, but then I
wasn't sure what to do next.  I re-analyzed the table with the date column
set to 50 but it didn't seem to help, so I resorted to clustering.

> Clustering would rearrange the contents of the table, and perhaps make
> the histogram 'more representative.'  Increasing the "SET STATISTICS"
> value will quite likely be even more helpful, and is a lot less
> expensive than clustering the table...

What column(s) should I increase?  Do I have to do anything after increasing
the statistics, or do I just wait for the stats collector to do its thing?

-John


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Use my (date) index, darn it!

2004-01-05 Thread Christopher Browne
After a long battle with technology, [EMAIL PROTECTED] (John Siracusa), an earthling, 
wrote:
> On 1/5/04 1:55 AM, Tom Lane wrote:
>> John Siracusa <[EMAIL PROTECTED]> writes:
>>> Obviously the planner is making some bad choices here.
>> 
>> A fair conclusion ...
>> 
>>> I know that it is trying to avoid random seeks or other scary things
>>> implied by a "correlation" statistic that is not close to 1 or -1, but
>>> it is massively overestimating the hit caused by those seeks and
>>> seemingly not taking into account the size of the table!
>> 
>> You haven't given any evidence to support these conclusions, though.
>
> Well here's what I was basing that theory on: before clustering, the
> correlation for the date column was around 0.3.  After clustering, it was 1,
> and the index was always used.  Does clustering change any other statistics
> other that correlation?  I ran analyze immediately before and after the
> cluster operation.
>
>> Could we see some table schemas, EXPLAIN ANALYZE output, and relevant
>> pg_stats entries for the various cases?
>
> Well, the table is clustered now, so I can't reproduce the situation.  Is
> there any way to "uncluster" a table?  Should I just cluster it on a
> different column?

That would presumably work...

It sounds to me as though the statistics that are being collected
aren't "good enough."  That tends to be a sign that the quantity of
statistics (e.g. - bins in the histogram) are insufficient.

This would be resolved by changing the number of bins (default of 10)
via "ALTER TABLE FOO ALTER COLUMN BAR SET STATISTICS 100" (or some
other value higher than 10).

Clustering would rearrange the contents of the table, and perhaps make
the histogram 'more representative.'  Increasing the "SET STATISTICS"
value will quite likely be even more helpful, and is a lot less
expensive than clustering the table...
-- 
If this was helpful,  rate me
http://www.ntlug.org/~cbbrowne/nonrdbms.html
Rules of  the Evil Overlord  #158. "I will  exchange the labels  on my
folder of  top-secret plans and  my folder of family  recipes. Imagine
the  hero's  surprise when  he  decodes  the  stolen plans  and  finds
instructions for Grandma's Potato Salad."


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Use my (date) index, darn it!

2004-01-05 Thread Tom Lane
John Siracusa <[EMAIL PROTECTED]> writes:
> Is there any way to "uncluster" a table?  Should I just cluster it on a
> different column?

That should work, if you choose one that's uncorrelated with the
previous clustering attribute.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Use my (date) index, darn it!

2004-01-05 Thread John Siracusa
On 1/5/04 1:55 AM, Tom Lane wrote:
> John Siracusa <[EMAIL PROTECTED]> writes:
>> Obviously the planner is making some bad choices here.
> 
> A fair conclusion ...
> 
>> I know that it is trying to avoid random seeks or other scary things
>> implied by a "correlation" statistic that is not close to 1 or -1, but
>> it is massively overestimating the hit caused by those seeks and
>> seemingly not taking into account the size of the table!
> 
> You haven't given any evidence to support these conclusions, though.

Well here's what I was basing that theory on: before clustering, the
correlation for the date column was around 0.3.  After clustering, it was 1,
and the index was always used.  Does clustering change any other statistics
other that correlation?  I ran analyze immediately before and after the
cluster operation.

> Could we see some table schemas, EXPLAIN ANALYZE output, and relevant
> pg_stats entries for the various cases?

Well, the table is clustered now, so I can't reproduce the situation.  Is
there any way to "uncluster" a table?  Should I just cluster it on a
different column?

-John


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Use my (date) index, darn it!

2004-01-05 Thread Tom Lane
John Siracusa <[EMAIL PROTECTED]> writes:
> Obviously the planner is making some bad choices here.

A fair conclusion ...

> I know that it is trying to avoid random seeks or other scary things
> implied by a "correlation" statistic that is not close to 1 or -1, but
> it is massively overestimating the hit caused by those seeks and
> seemingly not taking into account the size of the table!

You haven't given any evidence to support these conclusions, though.
Could we see some table schemas, EXPLAIN ANALYZE output, and relevant
pg_stats entries for the various cases?

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html