Re: pb with join plan

2023-06-27 Thread Marc Millas
On Tue, Jun 27, 2023 at 8:12 PM Kirk Wolak  wrote:

> On Wed, Jun 21, 2023 at 12:10 PM Marc Millas <marc.mil...@mokadb.com> wrote:
>
>> Marc MILLAS
>> On Wed, Jun 21, 2023 at 12:43 PM Tomas Vondra <
>> tomas.von...@enterprisedb.com> wrote:
>>
>>> On 6/21/23 00:26, Marc Millas wrote:
>>> >
>>> >
>>> > On Tue, Jun 20, 2023 at 11:19 PM David Rowley <dgrowle...@gmail.com> wrote:
>>> >
>>> > On Wed, 21 Jun 2023 at 08:34, Marc Millas <marc.mil...@mokadb.com> wrote:
>>> > >
>>> > > On Tue, Jun 20, 2023 at 10:14 PM David Rowley <dgrowle...@gmail.com> wrote:
>>> > >>
>>> > >> On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.mil...@mokadb.com> wrote:
>>> > >> > But if I do the same with clause one OR clause 2, I have to
>>> > kill the request after an hour, seeing the filesystem showing more
>>> > than 140 Mb of increased usage.
>>>
>>> It's a bit weird the "victor" table is joined seemingly without any join
>>> conditions, leading to a cross join (which massively inflates the cost
>>> for joins above it). Maybe the anonymized plan mangles it somehow.
>>>
>>
>> So I tried to simplify my problem.
>> I created a table with the result of the first 3 joins.
>> That table has 15M rows; all tables have been vacuum analyzed.
>>
>> Now, if I do an explain analyze of a simple join between that table and
>> my original table 4 using a simple = clause, I get a result in about one
>> second, and the planner's row estimates seem in line with the observed
>> values.
>> If I use substr(table1.a) = table2.b, the explain analyze gets a result
>> in 21 seconds and the planner estimates a 65M-row result set while the
>> observed result is 330k rows,
>> so here it is 20 times slower and the discrepancy between the planner's
>> row estimate and reality is a factor of 200.
>>
>> Now, if I try an explain analyze with a join on a=b or substr(c)=d or
>> e=f, then... I kill the query after a quarter of an hour without any
>> answer.
>> If I just explain the query, the planner's row estimate becomes more
>> than 2 billion.
>> The extremely simple query and plan are here, without automatic
>> obfuscation:
>> https://explain.depesz.com/s/b8Ll
>>
>
> First, I am not sure why you cannot send us the explain analyze.  But
> moving on...
>
Kirk, the explain analyze, along with the SQL query, is directly accessible
on the explain.depesz link.

>
> substr() is a function that mutilates a value such that the index becomes
> useless...
> If you are looking for the LEFT() of the value, then an INDEX can be used.
> I have COLLATION "C" and when I query:
> WHERE fld LIKE fld_b || '%'
>

There are NO indexes on those columns. One reason is that the simplest
index on one column is 380 GB on disk,
so to put an index on each criterion I would have to add around 1 TB of
disk just for ttt.
The full scan is not a problem; it's fast. The problem is the nested loop,
which compares each of the 15M rows of ttt to each of the 30K rows of
inc_pha_r.
That is an operation done 450 000 000 000 times, so if each comparison
takes 1 microsecond, the nested loop takes 125 hours.
And I am not sure that each comparison takes only 1 microsecond...
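The nested-loop arithmetic above checks out; a quick back-of-the-envelope sketch (assuming the stated row counts and the guessed 1 µs per comparison):

```python
outer_rows = 15_000_000  # rows in ttt
inner_rows = 30_000      # rows in inc_pha_r

# A plain nested loop compares every outer row against every inner row.
comparisons = outer_rows * inner_rows
print(f"{comparisons:,} comparisons")  # 450,000,000,000 comparisons

# At an assumed 1 microsecond per comparison:
hours = comparisons * 1e-6 / 3600
print(f"{hours:.0f} hours")  # 125 hours
```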

>
> The optimizer constructs a query that uses the index on "fld"...
> But when I try:
>
> WHERE fld LIKE CONCAT_WS('', fld_b, '%')
> It doesn't use the index version. (because the function call is too
> complicated to see through)
>
> When using functions in where clauses, indexes either have to be made on
> those functions, or often times the index cannot be used.
>
> BTW, I noted the COLLATION.  That turned out to be important, because my
> first DB test did NOT use that collation, and the result
> of the LIKE was the non-indexed version...
>
> I hope you find something useful in here.
>
Thanks for trying

>
> Also, WHERE fld <> 72... (unless you have a heavily skewed set of
> statistics, I read that as: SCAN everything and filter later,
> because this should filter out very few rows), whereas fld = 72 will be
> blazingly fast.
>
> Kirk
>


Re: pb with join plan

2023-06-27 Thread Kirk Wolak
On Wed, Jun 21, 2023 at 12:10 PM Marc Millas  wrote:

> Marc MILLAS
> On Wed, Jun 21, 2023 at 12:43 PM Tomas Vondra <
> tomas.von...@enterprisedb.com> wrote:
>
>> On 6/21/23 00:26, Marc Millas wrote:
>> >
>> >
>> > On Tue, Jun 20, 2023 at 11:19 PM David Rowley <dgrowle...@gmail.com> wrote:
>> >
>> > On Wed, 21 Jun 2023 at 08:34, Marc Millas <marc.mil...@mokadb.com> wrote:
>> > >
>> > > On Tue, Jun 20, 2023 at 10:14 PM David Rowley <dgrowle...@gmail.com> wrote:
>> > >>
>> > >> On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.mil...@mokadb.com> wrote:
>> > >> > But if I do the same with clause one OR clause 2, I have to
>> > kill the request after an hour, seeing the filesystem showing more
>> > than 140 Mb of increased usage.
>>
>> It's a bit weird the "victor" table is joined seemingly without any join
>> conditions, leading to a cross join (which massively inflates the cost
>> for joins above it). Maybe the anonymized plan mangles it somehow.
>>
>
> So I tried to simplify my problem.
> I created a table with the result of the first 3 joins.
> That table has 15M rows; all tables have been vacuum analyzed.
>
> Now, if I do an explain analyze of a simple join between that table and
> my original table 4 using a simple = clause, I get a result in about one
> second, and the planner's row estimates seem in line with the observed
> values.
> If I use substr(table1.a) = table2.b, the explain analyze gets a result
> in 21 seconds and the planner estimates a 65M-row result set while the
> observed result is 330k rows,
> so here it is 20 times slower and the discrepancy between the planner's
> row estimate and reality is a factor of 200.
>
> Now, if I try an explain analyze with a join on a=b or substr(c)=d or
> e=f, then... I kill the query after a quarter of an hour without any
> answer.
> If I just explain the query, the planner's row estimate becomes more
> than 2 billion.
> The extremely simple query and plan are here, without automatic obfuscation:
> https://explain.depesz.com/s/b8Ll
>

First, I am not sure why you cannot send us the explain analyze.  But
moving on...

substr() is a function that mutilates a value such that the index becomes
useless...
If you are looking for the LEFT() of the value, then an INDEX can be used.
I have COLLATION "C", and when I query:
WHERE fld LIKE fld_b || '%'

The optimizer constructs a query that uses the index on "fld"...
But when I try:

WHERE fld LIKE CONCAT_WS('', fld_b, '%')
It doesn't use the index version. (because the function call is too
complicated to see through)

When using functions in WHERE clauses, either indexes have to be made on
those functions, or oftentimes the index cannot be used.
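To illustrate the point about indexing the function itself, here is a sketch in Postgres SQL; the table and column names and the substr() arguments are hypothetical stand-ins, not taken from Marc's schema:

```sql
-- A plain index on the column cannot serve a predicate on substr(a, ...):
CREATE INDEX ttt_a_idx ON ttt (a);

-- An expression index that matches the predicate exactly can be used:
CREATE INDEX ttt_substr_a_idx ON ttt (substr(a, 1, 10));
-- ... JOIN inc_pha_r r ON substr(ttt.a, 1, 10) = r.b  -- can use the index

-- For prefix searches, text_pattern_ops lets LIKE 'abc%' use a btree
-- index even under a non-"C" collation:
CREATE INDEX ttt_a_prefix_idx ON ttt (a text_pattern_ops);
```

Note that the ~380 GB per-index cost Marc mentions still applies; an expression index only helps if that space is acceptable.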

BTW, I noted the COLLATION.  That turned out to be important, because my
first DB test did NOT use that collation, and the result
of the LIKE was the non-indexed version...

I hope you find something useful in here.

Also, WHERE fld <> 72... (unless you have a heavily skewed set of
statistics, I read that as: SCAN everything and filter later,
because this should filter out very few rows), whereas fld = 72 will be
blazingly fast.

Kirk


Re: pb with join plan

2023-06-21 Thread Marc Millas
Marc MILLAS




On Wed, Jun 21, 2023 at 12:43 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote:

> On 6/21/23 00:26, Marc Millas wrote:
> >
> >
> > On Tue, Jun 20, 2023 at 11:19 PM David Rowley <dgrowle...@gmail.com> wrote:
> >
> > On Wed, 21 Jun 2023 at 08:34, Marc Millas <marc.mil...@mokadb.com> wrote:
> > >
> > > On Tue, Jun 20, 2023 at 10:14 PM David Rowley <dgrowle...@gmail.com> wrote:
> > >>
> > >> On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.mil...@mokadb.com> wrote:
> > >> > But if I do the same with clause one OR clause 2, I have to
> > kill the request after an hour, seeing the filesystem showing more
> > than 140 Mb of increased usage.
>
> It's a bit weird the "victor" table is joined seemingly without any join
> conditions, leading to a cross join (which massively inflates the cost
> for joins above it). Maybe the anonymized plan mangles it somehow.
>

So I tried to simplify my problem.
I created a table with the result of the first 3 joins.
That table has 15M rows; all tables have been vacuum analyzed.

Now, if I do an explain analyze of a simple join between that table and my
original table 4 using a simple = clause, I get a result in about one
second, and the planner's row estimates seem in line with the observed
values.
If I use substr(table1.a) = table2.b, the explain analyze gets a result in
21 seconds and the planner estimates a 65M-row result set while the
observed result is 330k rows,
so here it is 20 times slower and the discrepancy between the planner's
row estimate and reality is a factor of 200.

Now, if I try an explain analyze with a join on a=b or substr(c)=d or e=f,
then... I kill the query after a quarter of an hour without any answer.
If I just explain the query, the planner's row estimate becomes more than
2 billion.
The extremely simple query and plan are here, without automatic obfuscation:
https://explain.depesz.com/s/b8Ll




>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Marc MILLAS


Re: pb with join plan

2023-06-21 Thread Marc Millas
On Wed, Jun 21, 2023 at 12:43 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote:

> On 6/21/23 00:26, Marc Millas wrote:
> >
> >
> > On Tue, Jun 20, 2023 at 11:19 PM David Rowley <dgrowle...@gmail.com> wrote:
> >
> > On Wed, 21 Jun 2023 at 08:34, Marc Millas <marc.mil...@mokadb.com> wrote:
> > >
> > > On Tue, Jun 20, 2023 at 10:14 PM David Rowley <dgrowle...@gmail.com> wrote:
> > >>
> > >> On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.mil...@mokadb.com> wrote:
> > >> > But if I do the same with clause one OR clause 2, I have to
> > kill the request after an hour, seeing the filesystem showing more
> > than 140 Mb of increased usage.
> > >>
> > >>
> > > link to the anonymized plan of the req with one clause:
> > https://explain.depesz.com/s/TWp4
> >
> > link to the plan with the second clause alone:
> > https://explain.depesz.com/s/byW5
> > link to the plan with both clauses ORed (the one not finishing):
> > https://explain.depesz.com/s/jHO2
> >
> >
> >
> > It's quite difficult to know what the problem is you want to fix
> here.
> > Your initial post indicated it was the query with the OR condition
> > that was causing you the problems, but the plan you've posted has no
> > OR condition?!
> >
> > You're more likely to get help here if you take time to properly
> > explain the situation and post the information that's actually
> > relevant to the problem you're having, or state the problem more
> > clearly, as there's a mismatch somewhere.
> >
> > It might also be worth having a look at
> > https://wiki.postgresql.org/wiki/Slow_Query_Questions . EXPLAIN is not
> > going to tell us what part of the query is slow. I'll let the wiki
> > page guide you into what to do instead.
> >
> >
> > I know that page. Obviously, as I have to kill the request, I cannot
> > provide an explain analyze...
> >
>
> It's a bit weird the "victor" table is joined seemingly without any join
> conditions, leading to a cross join (which massively inflates the cost
> for joins above it). Maybe the anonymized plan mangles it somehow.
>

The query does:
  SELECT blabla FROM table1 JOIN table2 ON (a list of 9 fields ANDed,
corresponding to the index of both table1 and table2)
  JOIN table3 ON table1.a = table3.a AND table1.b = table3.b
  JOIN table4 ON (two clauses like table2.d = table4.e and one clause
substr(table2.f) = table4.g, all ORed)
table1 and table2 are big (15M and 60M rows); table3 and table4 are small
(30k rows).

Basically, if I rewrite the query beginning with the join between table2
and table4, then joining table1 and then table3, postgres generates the
same plan, which doesn't finish.

If, instead of the 3 ORed clauses in the last join, I keep one equality
clause, the explain plan looks the same but executes in 45 seconds.
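One way to apply the UNION rewrite suggested elsewhere in the thread to this query shape (using the placeholder names from the description above, so only a sketch): each ORed arm becomes its own hash-joinable branch, and UNION deduplicates rows matched by more than one arm.

```sql
-- Instead of:
--   ... JOIN table4 ON table2.d = table4.e
--                   OR substr(table2.f, 1, 10) = table4.g
--                   OR table2.h = table4.i
-- write one branch per OR arm:
SELECT blabla
FROM table1
JOIN table2 ON (...)            -- the 9 ANDed fields
JOIN table3 ON table1.a = table3.a AND table1.b = table3.b
JOIN table4 ON table2.d = table4.e
UNION
SELECT blabla
FROM table1
JOIN table2 ON (...)
JOIN table3 ON table1.a = table3.a AND table1.b = table3.b
JOIN table4 ON substr(table2.f, 1, 10) = table4.g
UNION
SELECT blabla
FROM table1
JOIN table2 ON (...)
JOIN table3 ON table1.a = table3.a AND table1.b = table3.b
JOIN table4 ON table2.h = table4.i;
```

Each branch can then use a hash or merge join against the small table4 instead of the 450-billion-comparison nested loop.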


>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

 Marc MILLAS


Re: pb with join plan

2023-06-21 Thread Tomas Vondra
On 6/21/23 00:26, Marc Millas wrote:
> 
> 
> On Tue, Jun 20, 2023 at 11:19 PM David Rowley <dgrowle...@gmail.com> wrote:
>
> On Wed, 21 Jun 2023 at 08:34, Marc Millas <marc.mil...@mokadb.com> wrote:
> >
> > On Tue, Jun 20, 2023 at 10:14 PM David Rowley <dgrowle...@gmail.com> wrote:
> >>
> >> On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.mil...@mokadb.com> wrote:
> >> > But if I do the same with clause one OR clause 2, I have to
> kill the request after an hour, seeing the filesystem showing more
> than 140 Mb of increased usage.
> >>
> >>
> link to the anonymized plan of the req with one clause:
> https://explain.depesz.com/s/TWp4
> link to the plan with the second clause alone:
> https://explain.depesz.com/s/byW5
> link to the plan with both clauses ORed (the one not finishing):
> https://explain.depesz.com/s/jHO2
> 
> 
> 
> It's quite difficult to know what the problem is you want to fix here.
> Your initial post indicated it was the query with the OR condition
> that was causing you the problems, but the plan you've posted has no
> OR condition?!
> 
> You're more likely to get help here if you take time to properly
> explain the situation and post the information that's actually
> relevant to the problem you're having, or state the problem more
> clearly, as there's a mismatch somewhere.
> 
> It might also be worth having a look at
> https://wiki.postgresql.org/wiki/Slow_Query_Questions . EXPLAIN is not
> going to tell us what part of the query is slow. I'll let the wiki
> page guide you into what to do instead.
> 
>  
> I know that page. Obviously, as I have to kill the request, I cannot
> provide an explain analyze...
> 

It's a bit weird the "victor" table is joined seemingly without any join
conditions, leading to a cross join (which massively inflates the cost
for joins above it). Maybe the anonymized plan mangles it somehow.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pb with join plan

2023-06-20 Thread David Rowley
On Wed, 21 Jun 2023 at 10:26, Marc Millas  wrote:
> link to the plan with both clauses ORed (the one not finishing) 
> https://explain.depesz.com/s/jHO2

I'd go with the UNION or UNION ALL idea I mentioned earlier.

David




Re: pb with join plan

2023-06-20 Thread Marc Millas
On Tue, Jun 20, 2023 at 11:19 PM David Rowley  wrote:

> On Wed, 21 Jun 2023 at 08:34, Marc Millas  wrote:
> >
> > On Tue, Jun 20, 2023 at 10:14 PM David Rowley <dgrowle...@gmail.com> wrote:
> >>
> >> On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.mil...@mokadb.com> wrote:
> >> > But if I do the same with clause one OR clause 2, I have to  kill the
> request after an hour, seeing the filesystem showing more than 140 Mb of
> increased usage.
> >>
> >>
> > link to the anonymized plan of the req with one clause :
> https://explain.depesz.com/s/TWp4

link to the plan with the second clause alone:
https://explain.depesz.com/s/byW5
link to the plan with both clauses ORed (the one not finishing)
https://explain.depesz.com/s/jHO2

>
>
> It's quite difficult to know what the problem is you want to fix here.
> Your initial post indicated it was the query with the OR condition
> that was causing you the problems, but the plan you've posted has no
> OR condition?!
>
> You're more likely to get help here if you take time to properly
> explain the situation and post the information that's actually
> relevant to the problem you're having, or state the problem more
> clearly, as there's a mismatch somewhere.
>
> It might also be worth having a look at
> https://wiki.postgresql.org/wiki/Slow_Query_Questions . EXPLAIN is not
> going to tell us what part of the query is slow. I'll let the wiki
> page guide you into what to do instead.
>

I know that page. Obviously, as I have to kill the request, I cannot
provide an explain analyze...

>
> David
>


Re: pb with join plan

2023-06-20 Thread David Rowley
On Wed, 21 Jun 2023 at 08:34, Marc Millas  wrote:
>
> On Tue, Jun 20, 2023 at 10:14 PM David Rowley  wrote:
>>
>> On Wed, 21 Jun 2023 at 07:42, Marc Millas  wrote:
>> > But if I do the same with clause one OR clause 2, I have to  kill the 
>> > request after an hour, seeing the filesystem showing more than 140 Mb of 
>> > increased usage.
>>
>>
> link to the anonymized plan of the req with one clause : 
> https://explain.depesz.com/s/TWp4

It's quite difficult to know what the problem is you want to fix here.
Your initial post indicated it was the query with the OR condition
that was causing you the problems, but the plan you've posted has no
OR condition?!

You're more likely to get help here if you take time to properly
explain the situation and post the information that's actually
relevant to the problem you're having, or state the problem more
clearly, as there's a mismatch somewhere.

It might also be worth having a look at
https://wiki.postgresql.org/wiki/Slow_Query_Questions . EXPLAIN is not
going to tell us what part of the query is slow. I'll let the wiki
page guide you into what to do instead.

David




Re: pb with join plan

2023-06-20 Thread Marc Millas
On Tue, Jun 20, 2023 at 10:14 PM David Rowley  wrote:

> On Wed, 21 Jun 2023 at 07:42, Marc Millas  wrote:
> > But if I do the same with clause one OR clause 2, I have to  kill the
> request after an hour, seeing the filesystem showing more than 140 Mb of
> increased usage.
>
> > So, before providing the 3 explain plans (I must anonymize everything,
> so somewhat boring)  I would like to know if there is some obvious thing I
> am missing.
> > all tables have been vacuum analyzed.
>
> I believe you can anonymise the explain with https://explain.depesz.com/
>
> link to the anonymized plan of the req with one clause :
https://explain.depesz.com/s/TWp4

> It's pretty hard to say until we see the query, but having an OR in
> the join condition makes it impossible to Hash or Merge join, so
> perhaps it's slow due to Nested Loop join.
>
> You could consider rewriting the query to use a UNION or a UNION ALL
> separating out each branch of the OR into a UNION of its own.  That
> would allow Hash and Merge join to work again. However, that's all
> speculation until you provide more details.
>
> https://wiki.postgresql.org/wiki/Guide_to_reporting_problems
>
> David
>


  Marc MILLAS


Re: pb with join plan

2023-06-20 Thread David Rowley
On Wed, 21 Jun 2023 at 07:42, Marc Millas  wrote:
> But if I do the same with clause one OR clause 2, I have to  kill the request 
> after an hour, seeing the filesystem showing more than 140 Mb of increased 
> usage.

> So, before providing the 3 explain plans (I must anonymize everything, so 
> somewhat boring)  I would like to know if there is some obvious thing I am 
> missing.
> all tables have been vacuum analyzed.

I believe you can anonymise the explain with https://explain.depesz.com/

It's pretty hard to say until we see the query, but having an OR in
the join condition makes it impossible to Hash or Merge join, so
perhaps it's slow due to Nested Loop join.

You could consider rewriting the query to use a UNION or a UNION ALL
separating out each branch of the OR into a UNION of its own.  That
would allow Hash and Merge join to work again. However, that's all
speculation until you provide more details.
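The hash-vs-nested-loop difference described above can be checked with EXPLAIN on any pair of tables (the table and column names here are hypothetical, and the plan comments describe typical planner behavior, not guaranteed output):

```sql
-- A single equality join condition is hashable:
EXPLAIN SELECT * FROM big b JOIN small s ON b.x = s.x;
-- typically: Hash Join  (Hash Cond: (b.x = s.x))

-- An ORed join condition has no single hash/merge key, so the planner
-- falls back to a nested loop with a join filter:
EXPLAIN SELECT * FROM big b JOIN small s ON b.x = s.x OR b.y = s.y;
-- typically: Nested Loop  (Join Filter: ((b.x = s.x) OR (b.y = s.y)))

-- The UNION rewrite restores hashability branch by branch:
SELECT * FROM big b JOIN small s ON b.x = s.x
UNION
SELECT * FROM big b JOIN small s ON b.y = s.y;
```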

https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

David




pb with join plan

2023-06-20 Thread Marc Millas
Hi,

I have a postgres 14 on linux with a 15 TB db, with 20 cores and 150GB
RAM, all nvme ssd. Currently one user :-)

A join between 2 big tables and then another join with  a smaller third one
takes less than 1 minute and provides a result of 15 M lines. Fine.

If I add a third join, with a 30k-row table, with a simple equality as the
join clause, it works almost as fast: explain analyze indicates 45 sec.
Same if I do the very same with another equality clause: explain analyze
indicates 140 sec.

But if I do the same with clause one OR clause 2, I have to kill the
request after an hour, seeing the filesystem showing more than 140 Mb of
increased usage.

Looking at the explain plans with one clause or with the two ORed, there
are changes in the plan (of course):
with the fastest clause the estimated cost is 3 700 000, and with the
slightly slower one 3 900 000;
with both ORed, the estimated cost is 16 000 000. That seems a bit strange
to me, as ORing the join clauses should roughly add the times, not more.

So, before providing the 3 explain plans (I must anonymize everything, so
somewhat boring)  I would like to know if there is some obvious thing I am
missing.
all tables have been vacuum analyzed.

thanks



Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com