Robert Haas <robertmh...@gmail.com> writes:
> On Tue, Dec 26, 2023 at 10:23 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
>> I think it's a fool's errand to even try to separate different sort
>> column orderings by cost. We simply do not have sufficiently accurate
>> cost information. The previous patch in this thread got reverted because
>> of that (well, also some implementation issues, but mostly that), and
>> nothing has happened to make me think that another try will fare any
>> better.
> I'm late to the party, but I'd like to better understand what's being
> argued here.

What I am saying is that we don't have sufficiently accurate cost
information to support the sort of logic that got committed and
reverted before. I did not mean to imply that it's not possible
to have such info, only that it is not present today. IOW, what
I'm saying is that if you want to write code that tries to make
a cost-based preference of one sorting over another, you *first*
need to put in a bunch of legwork to create more accurate cost
numbers. Trying to make such logic depend on the numbers we have
today is just going to result in garbage in, garbage out.

Sadly, that's not a small task:

* We'd need to put effort into assigning more realistic procost
values --- preferably across the board, not just comparison functions.
As long as all the comparison functions have procost 1.0, you're
just flying blind.

* As you mentioned, there'd need to be some accounting for the
likely size of varlena inputs, and especially whether they might
be toasted.

* cost_sort knows nothing of the low-level sort algorithm improvements
we've made in recent years, such as abbreviated keys.

That's a lot of work, and I think it has to be done before we try
to build infrastructure on top, not afterwards.

			regards, tom lane
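
To illustrate the procost point above, here is a minimal sketch of how a cost-based choice between two sort column orderings can flip depending on the per-function cost inputs. This is a hypothetical toy model, not PostgreSQL's cost_sort: the functions `per_comparison_cost` and `sort_cost`, the 1/ndistinct tie probability, the column statistics, and the 20x cost assumed for a text comparison are all assumptions made for illustration. Only the 0.0025 default for cpu_operator_cost and the flat procost of 1.0 come from PostgreSQL itself.

```python
import math

CPU_OPERATOR_COST = 0.0025  # PostgreSQL's default cpu_operator_cost setting


def per_comparison_cost(columns):
    """Expected cost of comparing two tuples (hypothetical toy model).

    columns: list of (procost, ndistinct) pairs in sort-key order.
    A later column is only compared when all earlier columns tie, which
    this model assumes happens with probability 1/ndistinct per column.
    """
    cost = 0.0
    p_reached = 1.0  # probability that the comparison gets to this column
    for procost, ndistinct in columns:
        cost += p_reached * procost * CPU_OPERATOR_COST
        p_reached /= ndistinct
    return cost


def sort_cost(ntuples, columns):
    """Toy CPU cost: N * log2(N) comparisons at the expected per-comparison cost."""
    return ntuples * math.log2(ntuples) * per_comparison_cost(columns)


N = 1_000_000

# Column a: integer, 10 distinct values. Column b: text, 1000 distinct values.
# Flat inputs: both comparison functions carry procost 1.0.
a_flat, b_flat = (1.0, 10), (1.0, 1000)
# Assumed "realistic" inputs: the text comparison is taken to cost 20x more.
a_real, b_real = (1.0, 10), (20.0, 1000)

for label, a, b in (("flat procost", a_flat, b_flat),
                    ("assumed realistic procost", a_real, b_real)):
    print(label,
          "| ORDER BY a, b:", round(sort_cost(N, [a, b]), 1),
          "| ORDER BY b, a:", round(sort_cost(N, [b, a]), 1))
```

With flat inputs the model prefers sorting on b first, purely because b has more distinct values. With the assumed more expensive text comparison it prefers a first. The preferred ordering is therefore only as trustworthy as the per-function cost numbers fed into it.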