Jeff Janes <jeff.ja...@gmail.com> writes: > On Dec 3, 2017 15:31, "Tom Lane" <t...@sss.pgh.pa.us> wrote: >> Jeff Janes <jeff.ja...@gmail.com> writes: >>> But I do see that ties within the logical order of the column values are >>> broken to agree with the physical order. That is wrong, right? Is there >>> any argument that this is desirable?

## Advertising

>> Uh ... what do you propose doing instead? We'd have to do something with >> ties, and it's not so obvious this way is wrong. > Let them be tied. If there are 10 distinct values, number the values 0 to > 9, and all rows of a given distinct values get the same number for the > logical order axis. > Calling the correlation 0.8 when it is really 0.0 seems obviously wrong to > me. Although if we switched btree to store duplicate values with tid as a > tie breaker, then maybe it wouldn't be as obviously wrong. I thought some more about this. What we really want the correlation stat to do is help us estimate how randomly an index-ordered scan will access the heap. If the values we've sampled are all unequal then there's no particular issue. However, if we have some group of equal values, we do not really know what order an indexscan will visit them in. The existing correlation calculation is making the *most optimistic possible* assumption, that such a group will be visited exactly in heap order --- and that assumption isn't too defensible. IIRC, a freshly built b-tree will behave that way, because the initial sort of a btree breaks ties using heap TIDs; but we don't maintain that property during later insertions. In any case, given that we do this calculation without regard to any specific index, we can't reasonably expect to model exactly what the index will do. It would be best to adopt some middle-of-the-road assumption about what the heap visitation order will be for a set of duplicate values: not exactly heap order, but I think we should not use a worst-case assumption either, since the btree may retain some amount of its initial ordering. BTW, I disagree that "correlation = zero" is the right answer for this particular example. If the btree is freshly built, then an index-order scan would visit all the heap pages in sequence to fetch "f" rows, and then visit them all in sequence again to fetch "t" rows, which is a whole lot better than the purely random access that zero correlation implies. So I think 0.8 or so is actually a perfectly reasonable answer when the index is fresh. The trouble is just that it'd behoove us to derate that answer somewhat for the probability that the index isn't fresh. My first thought for a concrete fix was to use the mean position of a group of duplicates for the purposes of the correlation calculation, but on reflection that's clearly wrong. We do have an idea, from the data we have, whether the duplicates are close together in the heap or spread all over. Using only mean position would fail to distinguish those cases, but really we'd better penalize the spread-all-over case. I'm not sure how to do that. Or we could leave this calculation alone and try to move towards keeping equal values in TID order in btrees. Not sure about the downsides of that, though. regards, tom lane