On 23/9/2025 12:20, Frédéric Yhuel wrote:
On 9/22/25 23:15, Andrei Lepikhov wrote:
It may solve at least one issue with the 'dependencies' statistics: a
single number describing the dependency between any two values in the
columns often leads to incorrect estimations, as far as I can see.
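To make the limitation concrete, here is a minimal sketch of a functional-dependency statistics object; the table and column names are hypothetical, and the catalog query shows where the single per-pair degree is stored:

```sql
-- Hypothetical schema: the planner keeps one "degree of dependency"
-- number per column pair in such a statistics object, which is the
-- single-number limitation discussed above.
CREATE STATISTICS addr_dep (dependencies) ON city, zip FROM addresses;
ANALYZE addresses;

-- Inspect the stored per-pair degrees:
SELECT s.stxname, d.stxddependencies
FROM pg_statistic_ext s
JOIN pg_statistic_ext_data d ON d.stxoid = s.oid;
```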
For
to get precise statistics.
--
regards, Andrei Lepikhov
On 19/9/2025 03:05, David Rowley wrote:
On Thu, 18 Sept 2025 at 23:55, Andrei Lepikhov wrote:
Perhaps we should start working on introducing this type of
callback/hook?
There are certainly places where you could add a hook that would just
add an unacceptable overhead that we couldn'
On 18/9/2025 13:35, David Rowley wrote:
On Thu, 18 Sept 2025 at 19:55, Andrei Lepikhov wrote:
Imagine if we had a hook within the ExecProcNode. In that scenario, we
could create a trivial extension that would stop the query after, let's
say, 10 minutes of execution and display the current state. This would
give us more reliable data on estimation and the state of the plan tree.
What are your thoughts?
On 5/12/25 16:04, Maxim Boguk wrote:
On Mon, May 12, 2025 at 4:48 PM Andrei Lepikhov wrote:
If I'm not mistaken, it will work with all PG versions that are
currently in support. What do you think?
Such an extension would be very useful (and in general, the solution based
on the a
rts drastically.
On 12/5/2025 16:04, Maxim Boguk wrote:
On Mon, May 12, 2025 at 4:48 PM Andrei Lepikhov wrote:
It is not hard to write such a tiny extension. As I see it, the only extra
stored "C" procedure needed is one that sets up the force-plan-type flag
employing FetchPreparedStatement(). The rest o
If I'm not mistaken, it will work with all PG versions that are
currently in support. What do you think?
a possibility of changing a single code line and rebuilding
your DB instance to check a conjecture?
and total cost.
But to be sure, could you send the results of EXPLAIN ANALYZE VERBOSE?
If you also send the data to reproduce the case, we can pinpoint the
source of the problem more precisely.
ivo.substack.com/p/probing-indexes-to-survive-data-skew
[2] https://danolivo.substack.com/p/designing-a-prototype-postgres-plan
[3] https://postgrespro.com/docs/enterprise/16/sr-plan
[4] https://github.com/ossc-db/pg_hint_plan
ve it. But first, if you execute the ANALYZE
command on these problematic tables, does it fix your issue? Could you
live with a manual vacuum/analyze each time after a batch insertion?
If not, could you provide a synthetic reproduction of the case?
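As a sketch of the two options above (the table name is hypothetical, and the threshold values are illustrative only):

```sql
-- Manual fix right after the batch load:
ANALYZE batch_target;

-- Or make autovacuum re-analyse the table sooner by lowering its
-- per-table thresholds:
ALTER TABLE batch_target SET (
  autovacuum_analyze_scale_factor = 0.01,
  autovacuum_analyze_threshold = 1000
);
```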
-bbd3-78b2ec65f16c%40enterprisedb.com
the forgotten feature [1], which enables extended statistics in join
clause estimations and may push development efforts in that direction.
[1] using extended statistics to improve join estimates
https://www.postgresql.org/message-id/flat/c8c0ff31-3a8a-7562-bbd3-78b2ec65f16c%40enterprisedb.com
ing queries that
need only fractional results.
I may be wrong, but is this a problem with the Append node?
grade area, you can pick the sr_plan extension, which
is designed to store the plan for a specific query (you can choose
parameterisation on your own) and spread it globally across all
instances' backends.
puts NestLoop+Memoize at the place of the best path, which is chosen later.
Unfortunately, we can't see the predicted number of groups in
Memoize and can only guess at the issue.
, but we still don't have a method practical enough to kick
it out of the trap of a local optimum.
Attachments: tpcds-1.sql, year_total.sql (application/sql)
ge for PostgreSQL to estimate such a filter
because of the absent information on the joint column distribution.
Could you explore this route by building extended statistics on these
clauses? It could move the plan in a more optimal direction.
statistics will be built over any possible combination of
(relation, type), so you don't need to call CREATE STATISTICS more than
once.
On 11/8/24 09:45, Ed Sabol wrote:
On Nov 7, 2024, at 9:27 PM, Andrei Lepikhov wrote:
Postgres didn't want Materialize in this example because of the low estimation on its outer subquery. AFAIC, by increasing the *_page_cost values, you added extra weight to the inner subquery
Wh
((relation = 'located'::text) AND (type = 'document'::text))
You can create extended statistics on the columns 'relation' and 'type'.
These statistics can reduce estimation errors and enable the optimiser
to find a better plan without changing the cost balance.
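A sketch of such a statistics object, using the columns from the filter above (the table name is hypothetical):

```sql
-- MCV-list statistics directly improve equality clauses such as
-- relation = 'located' AND type = 'document';
-- 'dependencies' helps when the columns are functionally correlated.
CREATE STATISTICS relation_type_stats (dependencies, mcv)
  ON relation, type FROM documents;
ANALYZE documents;
```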
p, we need to prove beforehand that the subquery returns a single result.
Also, playing with AQO, as usual, I found two alternative query plans
that the optimiser can find in the case of more or less correct
cardinality prediction. See these plans in the attachment. I hope they
can be useful for your f
ate_id to e.exec_date_id >= 20241021
Not sure it would be easy (and make sense) to implement it as a core
feature. But the idea of the extensibility of the clause deduction
system looks spectacular to me.
, to perform partition pruning on the table 'e', you need to add this
redundant clause.
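A sketch of the rewrite being suggested; the schema is hypothetical, while the constant 20241021 and the column e.exec_date_id come from the thread:

```sql
-- The planner does not deduce the range condition on e.exec_date_id from
-- the clause on the joined table, so it is repeated explicitly:
SELECT *
FROM events e
JOIN calendar c ON c.date_id = e.exec_date_id
WHERE c.date_id >= 20241021
  AND e.exec_date_id >= 20241021;  -- redundant, but lets the planner
                                   -- prune partitions of "e"
```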
[1]
https://www.postgresql.org/message-id/flat/CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A%40mail.gmail.com
hs, use the -DOPTIMIZER_DEBUG compilation flag.
n why NestLoop wasn't chosen.
Maybe there is some kind of early selectivity estimation error, or
something even deeper: a specific tuple distribution across the blocks
of the heap table.
nnecessary partitions.
Of course, you can also try pg_hint_plan and force the planner to use
MergeJoin or HashJoin in that suspicious case.
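A minimal sketch of forcing a join method with pg_hint_plan (the table aliases are hypothetical; the hint comment must precede the query):

```sql
LOAD 'pg_hint_plan';

/*+ HashJoin(a b) */
EXPLAIN SELECT * FROM a JOIN b ON a.id = b.a_id;
```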
TWEEN or
>= is not?
https://www.postgresql.org/message-id/flat/CAFQUnFhqkWuPCwQ1NmHYrisHJhYx4DoJak-dV%2BFcjyY6scooYA%40mail.gmail.com
n to see whether it is a hard transformation limit or a mistake in the
cost estimation.
oned table.
ave different
connection settings for internal and external connections, i.e. from
different networks?
rks with IndexScan.
can change the
corresponding startup and tuple costs to force such a plan.
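A sketch of nudging the cost balance at the session level; the values are illustrative only, not recommendations, and the table name is hypothetical:

```sql
SET random_page_cost = 1.1;  -- cheaper random I/O favours IndexScan
SET cpu_tuple_cost = 0.02;   -- pricier per-tuple work penalises SeqScan
EXPLAIN SELECT * FROM t WHERE id BETWEEN 1 AND 100;
```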
I considered that, but it doesn't apply to this query as there are no
range quals.
David
Don't forget about extended statistics as well; they could also be used here.
the ideal query plan will include parameterised
NestLoop JOINs. Unfortunately, parameterisation in PostgreSQL can't pass
inside a subquery. That could be a reason for new development, because
MSSQL can do such a trick, but it is a long way off.
You can try to rewrite your schema and query to avoid subqueries in
expressions altogether.
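One common shape of such a rewrite, sketched on a hypothetical schema: a scalar subquery in the target list is turned into a LATERAL join, which the planner can treat as an ordinary join:

```sql
-- Before: scalar subquery inside an expression.
SELECT o.id,
       (SELECT max(p.price) FROM prices p WHERE p.item = o.item)
FROM orders o;

-- After: the same logic as a LATERAL join.
SELECT o.id, mp.max_price
FROM orders o
LEFT JOIN LATERAL (
    SELECT max(p.price) AS max_price
    FROM prices p
    WHERE p.item = o.item
) mp ON true;
```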
I hope this message gave you some insights.
[1] https://github.com/postgrespro/aqo
se of that need to be estimated more precisely.
I hope this will be helpful for you.
--
regards,
Andrei Lepikhov
Postgres Professional
ral recommendations to resolve this issue, but this
case should be investigated by the core developers.
[1] https://github.com/postgrespro/aqo
EXPLAIN (1)
===
Unique (cost=10170.87..94163004.90 rows=6400 width=24) (actual
time=1062.7