On Mon, Jan 9, 2017 at 8:05 PM, Patrick B <patrickbake...@gmail.com> wrote:

> ​3,581​ individual pokes into the heap to confirm tuple visibility and
>> apply the deleted filter - that could indeed take a while.
>> David J.
>
>
> I see.. The deleted column is:
>
> deleted boolean
>
> Should I create an index for that? How could I improve this query?
>
>
> Does it execute as slowly when you run it for a 2nd time?
>
>
> No, it doesn't. I think it's because of cache?
>

​Correct - your first execution swallows disk I/O​.


>
>
> I would think because of the NOT "deleted" clause. Which is interesting,
>> because that's a column which you conveniently didn't include in the
>> definition below.
>
>
> My mistake.
>
>
> Would an Index be sufficient to solve the problem?
>
>
​A lot would depend on the selectivity of "deleted"... you are tossing less
than 1/3rd of the rows here so probably not that helpful - though a partial
index over deleted = false would benefit this specific query.

If this query dominates your non-PK usage of this table you could consider
clustering the table on client_id - that would reduce the I/O hit at the
expense of increased maintenance.

Really, though, the query you are showing is poorly written - so I am
assuming that it is standing in for something else.  If that's so you have
provided zero context for meaningful opinions to be rendered.

Queries against the supposedly large customer table, on the
high-cardinality client_id field, which result in many matching rows is
going to be a physical problem if the data doesn't remain in cache.
CLUSTER can help a bit in that situation.  Otherwise your solutions are
more likely to by physical and not logical (model/SQL).

David J.

Reply via email to