Dan Harris <[EMAIL PROTECTED]> writes:
> I wonder why the estimates were so far off the first time? This table
> has been ANALYZED regularly ever since creation.
Probably just that you need a bigger sample size for such a large table.
We've been arguing ever since 7.2 about what the default statistics target ought to be.
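For anyone following along, the per-column sample can be enlarged and the table re-analyzed like so; the target of 100 is just an illustrative value, not a recommendation from the thread:

    -- raise the sample for just this column (300 * 100 = 30000 sampled rows)
    ALTER TABLE eventactivity ALTER COLUMN incidentid SET STATISTICS 100;
    ANALYZE eventactivity;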
Tom Lane wrote:
What the stats entry is saying is that the most common entries occur
about 75000 times apiece (0.0017 * 45e6), which is what's scaring
the planner here ;-). I think those frequencies are artificially high
though. The default statistics sample size is 3000 rows (300 * statistics target).
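Spelling out the arithmetic (the 5-hits figure below is my illustration of how the sample can inflate a frequency, not something measured in the thread):

    0.0017 * 45,000,000 = 76,500    -- the ~75000-per-value estimate above
    5 / 3,000           = 0.00167   -- ~0.0017: just 5 hits in a 3000-row
                                    -- sample is enough to produce it

So a value that shows up only a handful of times in the sample gets credited with tens of thousands of rows in a 45-million-row table.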
Dan Harris <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> What does the pg_stats entry for eventactivity.incidentid
>> contain?
> most_common_vals:  {P043190299,P051560740,P052581036,P052830218,P053100679,P053190889,P060370845,P042070391,P042690319,P043290117}
>
> most_common_freqs: {0.0017,0.0017,0.0017,0.001 ...
Tom Lane wrote:
So it's estimating 5775 cost units per probe into eventactivity, which
is pretty high --- it must think that a lot of rows will be retrieved by
the index (way more than the 20 or so it thinks will get past the filter
condition).
What does the pg_stats entry for eventactivity.incidentid contain?
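For reference, the entry can be pulled straight from the standard stats view:

    SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
      FROM pg_stats
     WHERE tablename = 'eventactivity'
       AND attname   = 'incidentid';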
Dan Harris <[EMAIL PROTECTED]> writes:
> Yes, eventactivity.incidentid is indexed. The datatype is varchar(40).
> Although, by checking this, I noticed that k_h.incidentid was
> varchar(100). Perhaps the difference in length between the keys caused
> the planner to not use the fastest method?
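The length mismatch alone shouldn't matter to the planner (varchar(40) and varchar(100) are the same type, just with different length limits), but if one wanted to align them anyway, a sketch using 8.0-and-later syntax, assuming no k_h value exceeds 40 characters:

    -- 8.0+ only; fails if any existing value is longer than 40 chars
    ALTER TABLE k_h ALTER COLUMN incidentid TYPE varchar(40);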
Tom Lane wrote:
That's very strange --- the estimated cost of the seqscan is high enough
that the planner should have chosen a nestloop with inner indexscan on
the big table. I'm not sure about the join-order point, but the hash
plan for the first join seems wrong in any case.
Um, you do have an index on eventactivity.incidentid?
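A quick way to test that theory from psql (the settings are session-local, so there is nothing to clean up beyond the session):

    SET enable_hashjoin = off;  -- push the planner away from the hash plan
    SET enable_seqscan  = off;  -- and away from the big-table seqscan
    EXPLAIN ANALYZE <the problem query here>;
    RESET enable_hashjoin;
    RESET enable_seqscan;

If the forced nestloop-with-indexscan plan comes out much cheaper in actual time, the problem is the cost estimates, not the available plans.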
Dan Harris <[EMAIL PROTECTED]> writes:
> I have a query that is intended to select from multiple "small tables"
> to get a limited subset of "incidentid" and then join with a "very
> large" table. One of the operations will require a sequential scan, but
> the planner is doing the scan on the very large table.
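The query itself wasn't quoted above; a hypothetical sketch of the shape being described (table k_h and column incidentid appear in the thread, everything else is invented for illustration):

    SELECT e.*
      FROM k_h                                -- one of the "small tables"
      JOIN eventactivity e USING (incidentid)
     WHERE k_h.some_filter_column = 'x';      -- hypothetical filter

The intent is that the small-table filter runs first and each surviving incidentid probes the big table's index, rather than the big table being seqscanned up front.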