On Thu, 2003-09-18 at 20:20, Tom Lane wrote:
> Jenny Zhang <[EMAIL PROTECTED]> writes:
> > ... It seems to me that small
> > effective_cache_size favors the choice of nested loop joins (NLJ)
> > while the big effective_cache_size is in favor of merge joins (MJ).
>
> No, I wouldn't think that, because a nestloop plan will involve repeated
> fetches of the same tuples whereas a merge join doesn't (at least not
> when it sorts its inner input, as this plan does). Larger cache
> improves the odds of a repeated fetch not having to do I/O. In practice
> a larger cache area would also have some effects on access costs for the
> sort's temp file, but I don't think the planner's cost model for sorting
> takes that into account.

I think there is some misunderstanding here. What I meant to say is:
From the plans we got, the optimizer favors nested loop joins (NLJ)
when effective_cache_size is small, while a big effective_cache_size
favors merge joins (MJ), which we think is not appropriate. We verified
that sort_mem has no impact on the plans, though it would be nice to
take that into account.

> As Matt Clark points out nearby, the real question is whether these
> planner estimates have anything to do with reality. EXPLAIN ANALYZE
> results would be far more interesting than plain EXPLAIN.
>
> > However, within the same run set consist of 6 runs, we see 2-3%
> > standard deviation for the run metrics associated with the multiple
> > stream part of the test (as opposed to the single stream part).
>
> <python> Och, laddie, we useta *dream* of 2-3% variation </python>

BTW, I am a she :-)

> > We would like to reduce the variation to be less than 1% so that a
> > 2% change between two different kernels would be significant.
>
> I think this is a pipe dream. Variation in where the data gets laid
> down on your disk drive would alone create more than that kind of delta.
> I'm frankly amazed you could get repeatability within 2-3%.

Greg is right.
The repeatability comes from aggregating the results over a whole test run. For individual queries, the power test (single stream) is very consistent, but in the throughput test (multiple streams) the execution time of any given query varies by up to 15%, even when no swapping occurs. If we set sort_mem too high and swapping occurs, the variation is bigger.
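
To illustrate why a whole-run aggregate can look steadier than any single query, here is a minimal Python sketch. It assumes each run is recorded as a mapping from query name to elapsed seconds. The helper names (relative_stddev, summarize) and the timings are hypothetical, invented only for illustration. They are not our measurements, and a plain sum stands in for the real run metric.

```python
import statistics


def relative_stddev(samples):
    """Return the sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(samples)
    return 100.0 * statistics.stdev(samples) / mean


def summarize(runs):
    """Compare per-query variation with whole-run variation.

    runs: list of dicts mapping query name -> elapsed seconds, one per run.
    Returns (per_query, aggregate), both as relative stddev in percent.
    """
    per_query = {
        name: relative_stddev([run[name] for run in runs])
        for name in runs[0]
    }
    # Assumption: the run-level figure is just the sum of query times.
    aggregate = relative_stddev([sum(run.values()) for run in runs])
    return per_query, aggregate


# Hypothetical timings in seconds, made up for this example only.
runs = [
    {"q1": 100.0, "q2": 230.0, "q3": 50.0},
    {"q1": 115.0, "q2": 200.0, "q3": 56.0},
    {"q1": 90.0, "q2": 215.0, "q3": 47.0},
]

per_query, aggregate = summarize(runs)
for name, pct in per_query.items():
    print(f"{name}: {pct:.1f}%")
print(f"whole run: {aggregate:.1f}%")
```

When the individual queries do not all speed up or slow down together, their deviations partly cancel in the sum, so the run-level figure varies less than the per-query figures.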
Jenny