Re: a wrong index choose when statistics is out of date

2024-04-27 Thread Andy Fan
Andy Fan writes: > Hello everyone, > >> After some more thoughts about the diference of the two ideas, then I >> find we are resolving two different issues, just that in the wrong index >> choose cases, both of them should work generally. > > Here is the formal version for the attribute

Re: a wrong index choose when statistics is out of date

2024-03-30 Thread Andy Fan
Hello everyone, > After some more thoughts about the diference of the two ideas, then I > find we are resolving two different issues, just that in the wrong index > choose cases, both of them should work generally. Here is the formal version for the attribute reloptions direction. commit

Re: a wrong index choose when statistics is out of date

2024-03-13 Thread Andy Fan
> > Having had the same problem for a long time, I've made an attempt and > invented a patch that probes an index to determine whether the estimated > constant is within statistics' scope. > I remember David's remark on the overhead problem, but I don't argue it > here. This patch is on the

Re: a wrong index choose when statistics is out of date

2024-03-12 Thread Andrei Lepikhov
On 8/3/2024 18:53, Andy Fan wrote: I just reviewed the bad queries plan for the past half years internally, I found many queries used the Nested loop which is the direct cause. now I think I find out a new reason for this, because the missed optimizer statistics cause the rows in outer relation

Re: a wrong index choose when statistics is out of date

2024-03-08 Thread Andy Fan
After some more thoughts about the diference of the two ideas, then I find we are resolving two different issues, just that in the wrong index choose cases, both of them should work generally. Your idea actually adding some rule based logic named certainty_factor, just the implemenation is

Re: a wrong index choose when statistics is out of date

2024-03-08 Thread Andrei Lepikhov
On 7/3/2024 17:32, David Rowley wrote: On Thu, 7 Mar 2024 at 21:17, Andrei Lepikhov wrote: I would like to ask David why the var_eq_const estimator doesn't have an option for estimation with a histogram. Having that would relieve a problem with skewed data. Detecting the situation with

Re: a wrong index choose when statistics is out of date

2024-03-08 Thread Andy Fan
David Rowley writes: > On Thu, 7 Mar 2024 at 23:40, Andy Fan wrote: >> >> David Rowley writes: >> > If you don't want the planner to use the statistics for the column why >> > not just do the following? >> >> Acutally I didn't want the planner to ignore the statistics totally, I >> want the

Re: a wrong index choose when statistics is out of date

2024-03-07 Thread David Rowley
On Thu, 7 Mar 2024 at 23:40, Andy Fan wrote: > > David Rowley writes: > > If you don't want the planner to use the statistics for the column why > > not just do the following? > > Acutally I didn't want the planner to ignore the statistics totally, I > want the planner to treat the "Const" which

Re: a wrong index choose when statistics is out of date

2024-03-07 Thread Andy Fan
Andrei Lepikhov writes: > On 5/3/2024 19:56, Andy Fan wrote: >> I think it is OK for a design review, for the implementaion side, the >> known issue includes: >> 1. Support grap such infromation from its parent for partitioned table >> if the child doesn't have such information. >> 2. builtin

Re: a wrong index choose when statistics is out of date

2024-03-07 Thread Andy Fan
David Rowley writes: > On Wed, 6 Mar 2024 at 02:09, Andy Fan wrote: >> This patch introduces a new attoptions like this: >> >> ALTER TABLE t ALTER COLUMN col set (force_generic=true); >> >> Then selfunc.c realizes this and ignore the special Const value, then >> average

Re: a wrong index choose when statistics is out of date

2024-03-07 Thread David Rowley
On Thu, 7 Mar 2024 at 21:17, Andrei Lepikhov wrote: > I would like to ask David why the var_eq_const estimator doesn't have an > option for estimation with a histogram. Having that would relieve a > problem with skewed data. Detecting the situation with incoming const > that is out of the covered

Re: a wrong index choose when statistics is out of date

2024-03-07 Thread David Rowley
On Wed, 6 Mar 2024 at 02:09, Andy Fan wrote: > This patch introduces a new attoptions like this: > > ALTER TABLE t ALTER COLUMN col set (force_generic=true); > > Then selfunc.c realizes this and ignore the special Const value, then > average selectivity is chosen. This fall into

Re: a wrong index choose when statistics is out of date

2024-03-07 Thread Andrei Lepikhov
On 5/3/2024 19:56, Andy Fan wrote: I think it is OK for a design review, for the implementaion side, the known issue includes: 1. Support grap such infromation from its parent for partitioned table if the child doesn't have such information. 2. builtin document and testing. Any feedback is

Re: a wrong index choose when statistics is out of date

2024-03-05 Thread Andy Fan
Hi, > >> We should do anything like add column options in the meantime. Those >> are hard to remove once added. > > I will try it very soon. Attached is a PoC version. and here is the test case. create table t(a int, b int, c int) with (autovacuum_enabled=off); create index on t(a, b); create

Re: a wrong index choose when statistics is out of date

2024-03-04 Thread Andy Fan
David Rowley writes: > On Tue, 5 Mar 2024 at 00:37, Andy Fan wrote: >> >> David Rowley writes: >> > I don't think it would be right to fudge the costs in any way, but I >> > think the risk factor for IndexPaths could take into account the >> > number of unmatched index clauses and increment

Re: a wrong index choose when statistics is out of date

2024-03-04 Thread David Rowley
On Tue, 5 Mar 2024 at 00:37, Andy Fan wrote: > > David Rowley writes: > > I don't think it would be right to fudge the costs in any way, but I > > think the risk factor for IndexPaths could take into account the > > number of unmatched index clauses and increment the risk factor, or > >

Re: a wrong index choose when statistics is out of date

2024-03-04 Thread Tomas Vondra
On 3/4/24 06:33, David Rowley wrote: > On Sun, 3 Mar 2024 at 20:08, Andy Fan wrote: >> The issue can be reproduced with the following steps: >> >> create table x_events (.., created_at timestamp, a int, b int); >> >> create index idx_1 on t(created_at, a); >> create index idx_2 on t(created_at,

Re: a wrong index choose when statistics is out of date

2024-03-04 Thread Andy Fan
Andrei Lepikhov writes: > On 3/3/2024 14:01, Andy Fan wrote: >> 1. We can let the user define the column as the value is increased day by >> day. the syntax may be: >> ALTER TABLE x_events ALTER COLUMN created_at ALWAYS_INCREASED. >> then when a query like 'create_at op const', the

Re: a wrong index choose when statistics is out of date

2024-03-04 Thread Andy Fan
David Rowley writes: > On Sun, 3 Mar 2024 at 20:08, Andy Fan wrote: >> The issue can be reproduced with the following steps: >> >> create table x_events (.., created_at timestamp, a int, b int); >> >> create index idx_1 on t(created_at, a); >> create index idx_2 on t(created_at, b); >> >>

Re: a wrong index choose when statistics is out of date

2024-03-03 Thread David Rowley
On Mon, 4 Mar 2024 at 19:20, Andrei Lepikhov wrote: > Could we use the trick with the get_actual_variable_range() to find some > reason and extrapolate histogram data out of the boundaries when an > index shows us that we have min/max outside known statistics? > Because it would be used for the

Re: a wrong index choose when statistics is out of date

2024-03-03 Thread Andrei Lepikhov
On 4/3/2024 12:33, David Rowley wrote: [1] https://www.postgresql.org/message-id/CAApHDvo2sMPF9m%3Di%2BYPPUssfTV1GB%3DZ8nMVa%2B9Uq4RZJ8sULeQ%40mail.gmail.com Thanks for the link! Could we use the trick with the get_actual_variable_range() to find some reason and extrapolate histogram data out

Re: a wrong index choose when statistics is out of date

2024-03-03 Thread David Rowley
On Sun, 3 Mar 2024 at 20:08, Andy Fan wrote: > The issue can be reproduced with the following steps: > > create table x_events (.., created_at timestamp, a int, b int); > > create index idx_1 on t(created_at, a); > create index idx_2 on t(created_at, b); > > query: > select * from t where

Re: a wrong index choose when statistics is out of date

2024-03-03 Thread Andrei Lepikhov
On 3/3/2024 14:01, Andy Fan wrote: 1. We can let the user define the column as the value is increased day by day. the syntax may be: ALTER TABLE x_events ALTER COLUMN created_at ALWAYS_INCREASED. then when a query like 'create_at op const', the statistics module can treat it as

a wrong index choose when statistics is out of date

2024-03-02 Thread Andy Fan
The issue can be reproduced with the following steps: create table x_events (.., created_at timestamp, a int, b int); create index idx_1 on t(created_at, a); create index idx_2 on t(created_at, b); query: select * from t where create_at = current_timestamp and b = 1; index (created_at, a)