Yesterday Jignesh Shah presented his extensive benchmark results comparing 8.4-beta1 with 8.3.7 at PGCon: http://blogs.sun.com/jkshah/entry/pgcon_2009_performance_comparison_of

While most cases were dead even or a modest improvement, his dbt-2 results suggest a 15-20% regression in 8.4. Changing the default_statistics_taget to 100 was responsible for about 80% of that regression. The remainder was from the constraint_exclusion change. That 80/20 proportion was mentioned in the talk but not in the slides. Putting both those back to the 8.3 defaults swapped things where 8.4b1 was ahead by 5% instead. (Note that all of the later benchmarks in his slides continued to use the default parameters, that change was only tested with that specific workload)

The situation where the stats target being so low hurts things the most are the data warehouse use cases. Josh Berkus tells me that his latest DW testing suggests that the 10->100 increase turns out to be insufficient anyway; 400+ is the range you really need that to be in. I did a quick survey of some other community members who work in this space and that experience is not unique. Josh has some early tools that tackle this problem by adjusting the stats target only when it's critical--on indexed columns for example. I'm going to work with him to help get those polished, and to see if we can replicate some of those cases via a public benchmark.

The bump from 10 to 100 was supported by microbenchmarks that suggested it would be tolerable. That doesn't seem to be reality here though, and it's questionable whether this change really helps the people who need to fool with the value the most. This sort of feedback is exactly why it made sense to try this out during the beta cycle. But unless someone has some compelling evidence to the contrary, it looks like the stats target needs to go back to a lower value. I think the best we can do here is to improve the documentation about this parameter and continue to work on tuning guides and tools to help people set it correctly.

As for the change to constraint_exclusion, the regression impact there is much less severe and the downside of getting it wrong is pretty bad. Rather than reverting it, the ideal response to that might be to see if it's possible to improve the "partition" code path. But as I'm not going to volunteer to actually do that, I really don't get a vote here anyway.

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to