Hi,

On 11/15/18 6:41 AM, Alexander Kuzmenkov wrote:
But having this logic inside _bt_next means that it will make a non-skip index
only scan a bit slower, am I right?

Correct.

Well, it depends on how the skip scan is implemented. We don't have to make normal scans slower, because skip scan is just a separate thing.

My main point was that current implementation is good as a proof of concept, but it is inefficient for some data and needs some unreliable planner logic to work around this inefficiency. And now we also have execution-time fallback because planning-time fallback isn't good enough. This looks overly complicated. Let's try to design an algorithm that works well regardless of the particular data and doesn't need these heuristics. It should be possible to do so for btree.

Of course, I understand the reluctance to implement an entire new type of btree traversal. Anastasia Lubennikova suggested a tweak for the current method that may improve the performance for small groups of equal values. When we search for the next unique key, first check if it is contained on the current btree page using its 'high key'. If it is, find it on the current page. If not, search from the root of the tree like we do now. This should improve the performance for small equal groups, because there are going to be several such groups on the page. And this is exactly where we have the regression compared to unique + index scan.


Robert suggested something similar in [1]. I'll try and look at that when I'm back from my holiday.

By the way, what is the data for which we intend this feature to work? Obviously a non-unique btree index, but how wide are the tuples, and how big the equal groups? It would be good to have some real-world examples.


Although my primary use-case is int I agree that we should test the different data types, and tuple widths.

[1] https://www.postgresql.org/message-id/CA%2BTgmobb3uN0xDqTRu7f7WdjGRAXpSFxeAQnvNr%3DOK5_kC_SSg%40mail.gmail.com

Best regards,
 Jesper

Reply via email to