Estimating bloat for very large tables: what is the state of the art?

2022-09-13 Thread Dmitry Astapov
Hi!

I am trying to solve the problem of estimating the table bloat (and index
bloat, though I am mostly focusing on tables at the moment).

After searching far and wide, it seems that the choice is to be made
between two methods:
1. Slow, but very precise pgstattuple
2. Fast, but somewhat imprecise "bloat query", which is attributed to the
check_postgres project, though numerous variations of it exist.

pgstattuple is beautiful and accurate, but rather slow. On large tables,
pgstattuple_approx can easily take 5-10 minutes, and in such cases
pgstattuple itself can easily take 30-60 minutes on the same table.
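
(For reference, both functions come from the pgstattuple extension; a
minimal usage sketch, with a hypothetical table name:)

    -- Both functions ship in the pgstattuple extension:
    CREATE EXTENSION IF NOT EXISTS pgstattuple;

    -- Exact, but reads every page of the table:
    SELECT * FROM pgstattuple('some_big_table');         -- hypothetical name

    -- Approximate: uses the visibility map to skip all-visible pages:
    SELECT * FROM pgstattuple_approx('some_big_table');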

"Bloat query", on the other hand, is wonderfully fast, but rather
imprecise. It tries to estimate the table data size as pg_class.reltuples *
row_width, where row_width is taken, roughly, to be (24 bytes for the
header + size of NULL map + (sum( (1 - null_frac)*avg_width ) for all
columns in the table, as reported by pg_statistics)).
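
(Roughly, the per-row estimate boils down to something like this; a
simplified sketch only, the real query also accounts for TOAST, fillfactor
and so on:)

    -- Simplified sketch of the row-width estimate behind the "bloat query":
    SELECT 24                                   -- tuple header
           + ceil(count(*) / 8.0)               -- NULL bitmap, 1 bit per column
           + sum((1 - null_frac) * avg_width)   -- average data bytes per row
           AS est_row_width
      FROM pg_stats
     WHERE schemaname = 'public' AND tablename = 'some_big_table';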

This, of course, completely ignores alignment padding, and so on tables
with a large number of columns the query tends to underestimate the size
of live data by some 10-20% (unless the schema was explicitly created to
minimize padding).
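
(The effect of padding is easy to see with pg_column_size on ad-hoc rows;
the column order below is purely illustrative:)

    -- Alignment padding in action: same four columns, different order.
    SELECT pg_column_size(ROW(1::int2, 1::int8, 1::int2, 1::int8)) AS interleaved,
           pg_column_size(ROW(1::int8, 1::int8, 1::int2, 1::int2)) AS packed;
    -- interleaved comes out larger: each int8 is aligned to an 8-byte
    -- boundary, wasting 6 bytes after each int2.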

I'd like to ask you:
1. Are these two approaches indeed the only options on the table, or am I
missing something?

2. I am considering my own approach: after looking at pg_attribute and
pg_stats, I construct "an example row from this table with no NULLs" (so,
maximum amount of data + maximum amount of padding) and "an example row
from the table with all NULLs" (so, as little padding as possible), call
pg_column_size() on both rows (so that pg_column_size computes size +
padding for me), and then take an average of the two, perhaps weighted
somehow by examining the null_frac of the table's columns. Quick
experiments show that this yields a more accurate estimate of row size for
tables with many columns than what the "bloat query" does. Question: can I
do anything better/easier here without sacrificing speed?
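
(A minimal sketch of the idea, for a hypothetical table t(a int8, b int4,
c text); real code would build the two rows from pg_attribute/pg_stats:)

    -- Hypothetical table: CREATE TABLE t (a int8, b int4, c text);
    SELECT
      -- upper bound: every column populated, c padded to its avg_width
      pg_column_size(ROW(0::int8, 0::int4, repeat('x', 10))::t) AS max_row,
      -- lower bound: all columns NULL, just header + NULL bitmap
      pg_column_size(ROW(NULL, NULL, NULL)::t)                  AS min_row;
    -- The estimate is then a weighted average of the two, e.g. by null_frac.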

-- 
D. Astapov


Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-13 Thread Dmitry Astapov
On Wed, May 12, 2021 at 4:54 PM Tom Lane wrote:

> Dmitry Astapov writes:
> > I am trying to understand the behaviour of the query planner regarding
> > the push-down of the conditions "through" the join.
>
> I think your mental model is wrong.  What's actually happening here is
> that the planner uses equivalence classes to deduce implied conditions.
> That is, we have the join condition a.adate = b.bdate and then you've
> added the where condition a.adate = '2021-05-12'.  Transitivity implies
> that b.bdate = '2021-05-12', so we deduce that condition and are able
> to apply it at the relation scan of b.  Furthermore, having restricted
> both a.adate and b.bdate to the same constant value at the scan level,
> we no longer need to apply the join condition a.adate = b.bdate at all.
> This is important not only to avoid the (probably minor) inefficiency
> of rechecking the join condition, but because if we believed that all
> three conditions were independently applicable, we'd come out with a
> serious underestimate of the size of the join result.
>

Thank you very much, my mental model was indeed incorrect, and the above is
very helpful.
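
(To make the deduction concrete with my original tables; plan shape
paraphrased, not verbatim EXPLAIN output:)

    -- Both scans end up with the derived constant filter, and the join
    -- qual itself disappears:
    EXPLAIN SELECT * FROM a JOIN b ON a.adate = b.bdate
     WHERE a.adate = '2021-05-12';
    --   Seq Scan on a   Filter: (adate = '2021-05-12'::date)
    --   Seq Scan on b   Filter: (bdate = '2021-05-12'::date)
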
Am I right in thinking that eliminating the join condition is actually
quite an important part of the process?
Could it possibly be the main reason for =ANY / (x IN (...)) not being
optimized the same way?
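
(For instance, this is what I observe; same tables, sketch only:)

    -- An =ANY qual is not propagated to the other side of the join:
    EXPLAIN SELECT * FROM a JOIN b ON a.adate = b.bdate
     WHERE a.adate = ANY ('{2021-05-12,2021-05-13}'::date[]);
    -- The filter is applied only at the scan of a; b gets no date qual.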


>
> > In my experiments, I was never able to get an execution plan that "pushes
> > down" any condition apart from (=) through to the right side of the join,
>
> None of the argument sketched above works for non-equality conditions.
> There are some situations where you could probably figure out how to
> use transitivity to deduce some implied condition, but cleaning things
> up so that you don't have redundant conditions fouling up the join
> size estimates seems like a hard problem.
>

I agree about inequality conditions; this problem seems rather hard to
tackle in the general case.

Is it still hard when one thinks about =ANY or (column IN (val1, val2,
val3, ...)) as well?
I am thinking that =ANY would be a decent workaround for (x BETWEEN a AND
b) in quite a lot of cases, if it were propagated to all the columns in
the equivalence class.
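
(What I have in mind, sketched for dates; enumerating the range as an
array is only feasible when the domain is discrete and reasonably small:)

    -- A range rewritten as an explicit list, which could in principle be
    -- applied on both sides of the join if such propagation existed:
    SELECT * FROM v
     WHERE adate = ANY (ARRAY(
             SELECT d::date
               FROM generate_series('2021-05-01'::timestamp,
                                    '2021-05-12'::timestamp,
                                    interval '1 day') AS d));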



> > Equally surprising is that I was unable to find documentation or past
> > mailing list discussions of this or similar topic, which leads me to
> > believe that I am just not familiar with the proper terminology and can't
> > come up with the right search terms.
>
> src/backend/optimizer/README has a discussion of equivalence classes.
>
Thank you, this gives me a plethora of keywords for further searches.

I realize that it is possibly off-topic here, but what about workarounds
for inequality constraints, joins and views? Maybe you could give me some
pointers here as well?

My tables are large to huge (think OLAP, not OLTP). I found out that when
I have a view that joins several (2 to 10) tables on a column that is
semantically the same in all of them (let's say it is ID and we join on
ID), I do not have many avenues to efficiently select from such a view for
a list of IDs at the same time.

I could:
1) Do lots of fast queries and union them:
select * from vw where id=ID1 union all select * from vw where id=ID2 ...,
which is only really feasible if the query is generated by a program.

2) Expose all the ID columns from all the tables used in the view body and
do (see the sketch after this list):
select * from vw where id=ANY(...) and id1=ANY(...) and id2=ANY(...) and
id3=ANY(...)
This only works well if the view hierarchy is flat (no views on views). If
there are other views that use this view, re-exporting the extra columns
quickly snowballs, and you might need column renaming if the same view
ends up being used more than once through two different dependency paths.
Plus, people not familiar with the problem tend to omit the "clearly
superfluous" columns from the new views they build on top.

3) Forbid views that join tables larger than a certain size / dismantle
views that become inefficient (this only works if the problem is detected
quickly enough and the view has not become popular yet).
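
(Sketch of workaround 2, with hypothetical table and column names:)

    -- Each table's copy of the join key is exposed, so a caller can
    -- constrain every scan directly:
    CREATE VIEW vw AS
    SELECT t1.id, t2.id AS id2, t3.id AS id3, t1.x, t2.y, t3.z
      FROM t1
      JOIN t2 ON t2.id = t1.id
      JOIN t3 ON t3.id = t1.id;

    SELECT * FROM vw
     WHERE id  = ANY ('{1,2,3}'::int[])
       AND id2 = ANY ('{1,2,3}'::int[])
       AND id3 = ANY ('{1,2,3}'::int[]);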

So all of the workarounds I see in front of me right now are somewhat sad,
but they are necessary, as not doing them means that queries would take
hours or days instead of minutes.

Is there anything better that I have not considered in terms of workarounds?


-- 
D. Astapov


Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-12 Thread Dmitry Astapov
Hi!
I am trying to understand the behaviour of the query planner regarding the
push-down of the conditions "through" the join.

Let's say that I have tables a(adate date, aval text) and b(bdate date,
bval text), and I create a view:

create view v as
   select a.adate, a.aval, b.bval from a join b on (a.adate = b.bdate);

Now, when I do (explain select * from v where adate='2021-05-12'), I can
see that the condition (= '2021-05-12') is used by the planner for table
access to both a and b.

However, if I use a range-like condition (this is probably not the correct
terminology, but I am not familiar with the right one) like BETWEEN or
(>= '2021-05-21'), I see that the planner uses this condition to access a,
but not b. The type of join (inner or left) does not seem to matter.
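
(Sketch of the plan shapes, paraphrased rather than verbatim EXPLAIN
output:)

    -- Equality: the constant is applied to both sides of the join.
    EXPLAIN SELECT * FROM v WHERE adate = '2021-05-12';
    --   Seq Scan on a   Filter: (adate = '2021-05-12'::date)
    --   Seq Scan on b   Filter: (bdate = '2021-05-12'::date)

    -- Range: the qual stays on a only; b gets no date filter at all.
    EXPLAIN SELECT * FROM v WHERE adate >= '2021-05-21';
    --   Seq Scan on a   Filter: (adate >= '2021-05-21'::date)
    --   Seq Scan on b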

DB fiddle that illustrates this:
https://www.db-fiddle.com/f/pT2PwUkhJWuX9skWiBWXoL/0

In my experiments, I was never able to get an execution plan that "pushes
down" any condition apart from (=) through to the right side of the join,
which is rather surprising and leads to suboptimal planner estimates and
execution plans whenever a view like the above is part of a bigger query
with more joins on top.

Equally surprising is that I was unable to find documentation or past
mailing list discussions of this or a similar topic, which leads me to
believe that I am just not familiar with the proper terminology and cannot
come up with the right search terms.

Can you please tell me what is the proper way to describe this
behaviour/phenomenon (so that I can use it as search terms) and/or point
me to the parts of the source code that determine which conditions are
"pushed down" and which are not?

PS As far as I can see, this behaviour is consistent across versions 9.5,
10, 11, 12 and 13.

-- 
D. Astapov