Hi All,

Since it looks like there is not much interest in the patch I will try to 
provide some background to explain why I think it is needed.

We are in the process of migration from an old db platform to PostgreSQL.
Our database is around 10TB big and contains around 10 billion financial 
transactions in a single table.
Each transaction is assigned to an account (column acc_number).

We have partitioned the table BY HASH (acc_number).

A client can query transactions belonging to his accounts using several 
criteria - among them is te xt search.
Queries are of type TOP N (ie ORDER BY … LIMIT ).

The list of accounts that we are querying is provided as a parameter to the 
query.

We have decided to use a single Gist index supporting all queries (reasons 
described in [1]).

There are several problems with Gist usage (but I still think we have no other 
choice) but the most important is
that we cannot use SAOP in our queries - since Gist does not support it the 
planner decides to perform Bitmap Scan
which in turn does not support ORDER BY … LIMIT well because requires Sort.

So when we use “= ANY (array of account numbers) … ORDER BY ...” the plan 
requires reading all records meeting
search criteria and then sort.

As a workaround we have to perform LATERAL joins:

unnest(list of account numbers) AS a(n) LATERAL JOIN (SELECT * FROM … WHERE 
account_number = a.n ORDER BY … LIMIT …) ORDER BY … LIMIT …

It is still bad because requires multiple scans of the same partition if 
account number hashes are the same.

What we really need is for Gist to support “= ANY (…)”.
As Gist index is extensible in terms of queries it supports it is quite easy to 
implement an operator class/family with operator:

||= (text, text[])

that has semantics the same as “= ANY (…)”

With this operator we can write our queries like:

account_number ||= [list of account numbers] AND
account_number = ANY ([list of account numbers]) — redundant for partition 
pruning as it does not understand ||=

and have optimal plans:

Limit
— Merge Append
—— Index scan of relevant partitions

The problem is that now each partition scan is for the same list of accounts.
The “consistent” function cannot assume anything about contents of the table so 
it has to check all elements of the list
and that in turn causes reading unnecessary index pages for accounts not in 
this partition.

What we need is a way for the “consistent” function to be able to pre-process 
input query array and remove elements
that are not relevant for this scan. To do that it is necessary to have enough 
information to read necessary metadata from the catalog.

The proposed patch addresses this need and seems (to me) largely 
uncontroversial as it does not break any existing extensions.

Attached is a patch with consolidated changes (I am pretty new to managing 
patches so previous two were partial and not something shareable, I am afraid).

—
Michal

Attachment: 0001-Pass-key-sk_attno-to-consistent-function-in-gistinde.patch
Description: Binary data

Reply via email to