Tom Lane wrote:
> But note that barring backend crash, once all the scans are done it is
> guaranteed that the hint will be removed --- somebody will be last to
> update the hint, and therefore will remove it when they do heap_endscan,
> even if others are not quite done. This is good in the sense that
> later-starting backends won't be fooled into starting at what is
> guaranteed to be the most pessimal spot, but it's got a downside too,
> which is that there will be windows where seqscans are in process but
> a newly started scan won't see them. Maybe that's a killer objection.
I think the way the patch is now is better than trying to remove the
hints, but I don't feel strongly either way.
However, I don't think we should try hard to mask the issue. Masking it
just means people are more likely to miss it in testing and run into it
in production. It's better to find out sooner rather than later.
It might be a good idea to preserve the order within a transaction,
though that means more code.
> When exactly is the hint updated? I gathered from something Heikki said
> that it's set after processing X amount of data, but I think it might be
> better to set it *before* processing X amount of data. That is, the
> hint means "I'm going to be scanning at least <threshold> blocks
> starting here", not "I have scanned <threshold> blocks ending here",
> which seems like the interpretation that's being used at the moment.
> What that would mean is that successive "LIMIT 1000" calls would in fact
> all start at the same place, barring interference from other backends.
I don't see how it makes any difference whether you update the hint
before or after processing. Running a LIMIT 1000 query repeatedly will
start from the same place in any case, assuming 1000 tuples fit in the
"report interval", which is 128KB currently.
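A rough way to check the point about the report interval is to simulate both interpretations of the hint. The sketch below is an assumption-laden toy, not the patch: run_scan() and repeated_starts() are hypothetical helpers, the 8KB block size is assumed to be the default, table wraparound and hint removal are ignored, and there is no interference from other backends.

```python
BLOCK_SIZE_KB = 8            # assumption: default PostgreSQL block size
REPORT_INTERVAL_KB = 128     # the "report interval" mentioned in the thread
REPORT_INTERVAL_BLOCKS = REPORT_INTERVAL_KB // BLOCK_SIZE_KB  # 16 blocks


def run_scan(hint, blocks_needed, report_before):
    """Run one toy scan from the hinted block; return (start, new_hint)."""
    start = hint
    pos = start
    done = 0
    while done < blocks_needed:
        if report_before and done % REPORT_INTERVAL_BLOCKS == 0:
            # "I'm going to scan at least one interval starting here."
            hint = pos
        pos += 1
        done += 1
        if not report_before and done % REPORT_INTERVAL_BLOCKS == 0:
            # "I have scanned one interval ending here."
            hint = pos
    return start, hint


def repeated_starts(blocks_needed, report_before, runs=3):
    """Start blocks of several identical scans run back to back."""
    hint = 0
    starts = []
    for _ in range(runs):
        start, hint = run_scan(hint, blocks_needed, report_before)
        starts.append(start)
    return starts
```

For a scan that stops within one interval, say 10 blocks, repeated_starts(10, True) and repeated_starts(10, False) both give [0, 0, 0] in this model, so the repeated short query starts from the same place either way. The two interpretations only diverge in the toy when a scan ends exactly on an interval boundary, for example 16 blocks.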