Re: [h2] Request for advice

2023-04-26 Thread Silvio
That is actually a very good idea. We never considered using a UNION here. 
Would H2 be able to use a different index for multiple queries inside a 
single UNION? I was under the impression H2 always uses one index for a 
query but that may no longer be the case. If that is true this could 
improve things a lot.

The suggestion about the temp table creation is also valuable. We have to 
account for multiple users running the same type of queries at the same 
time so we would need a separate table for each of them or complicate 
things by adding additional information to the temp table to distinguish 
the records for separate queries.

On Wednesday, 26 April 2023 at 11:12:19 UTC+2 Noel Grandin wrote:

>
>
> On 4/26/2023 11:03 AM, Silvio wrote:
> > We have some heavy queries that involve selecting records from a base 
> cached table A (~100K records) that satisfy a 
> > quite a number of conditions expressed as
> > 
> > A.PK [NOT] IN (...)
> > 
>
> You could also express those as
>
> A.PK NOT IN ( SubNotCondition1 UNION SubNotCondition2 ... )
> AND A.PK IN ( SubCondition1 UNION SubCondition2 ... )
>
> which should result in less scanning of the second table.
>
> > 
> > We are thinking of using a temporary memory table C that holds primary 
> keys of table A, evaluating the subqueries on B 
> > seperately inserting or removing keys into table C as needed and finally 
> having a single subquery
> > 
>
> That might help, and I have done similar things (on other databases).
>
> You probably want a session-local temporary table, and you probably want 
> to do
>
> CREATE IF NOT EXISTS TEMP1
> TRUNCATE TEMP1
>
> for each query, rather than creating and dropping it, because CREATE/DROP 
> is quite expensive in H2.
>
>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/h2-database/a5d488ea-efc1-4154-b9c1-128ed08408e5n%40googlegroups.com.


Re: [h2] Request for advice

2023-04-26 Thread Silvio
Unfortunately we are talking about a production situation and using a 
h2-version that is not officially released (for whatever that means) is 
something we are not allowed to do. I would be very eager to check what 
happens with any new upcoming release.

A join would naturally work but I would also expect the subquery to result 
in a similar evaluation path. If there is any reason to expect otherwise we 
would happily use the join.

We initially did a multi-join instead of the subqueries we have now. That 
was beyond slow. We never saw any of those queries come to completion.

On Wednesday, 26 April 2023 at 11:09:58 UTC+2 Andreas Reichel wrote:

> Greetings.
>
> As far as I understand it, Evgenji just committed a change regarding index 
> use in `IN()` clause.
> Maybe try the very lastest GIT Master first.
>
> Also I wonder, why you would not use a JOIN instead of a IN() when you 
> have a list of FK_A from C:
>
> select *
> from a
> inner join c
> on a.pk = c.fk_a
>
> Why use (uncorrelated?) sub-queries?
>
> Best regards
> Andreas
>
> On Wed, 2023-04-26 at 02:03 -0700, Silvio wrote:
>
> We have some heavy queries that involve selecting records from a base 
> cached table A (~100K records) that satisfy a quite a number of conditions 
> expressed as
>
> A.PK [NOT] IN (...)
>
> on a secondary cached table B (~10M) records. Although the subqueries use 
> indexed columns the overall query is very slow.
>
> We are thinking of using a temporary memory table C that holds primary 
> keys of table A, evaluating the subqueries on B seperately inserting or 
> removing keys into table C as needed and finally having a single subquery
>
> A.PK NOT IN (SELECT FK_A FROM C)
>
> Has anoyone ever tried such an approach in H2? Is there any reason to 
> expect an improvement in performance in comparison to the single large 
> query we have now?
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "H2 Database" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to h2-database...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/h2-database/6e7570af-a74c-4e83-9560-a85cfad1e8d4n%40googlegroups.com
>  
> 
> .
>
>
>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/h2-database/570f46fc-c14f-4242-8ac8-c8b3e3da9d6bn%40googlegroups.com.


Re: [h2] Request for advice

2023-04-26 Thread Noel Grandin




On 4/26/2023 11:03 AM, Silvio wrote:
We have some heavy queries that involve selecting records from a base cached table A (~100K records) that satisfy a 
quite a number of conditions expressed as


A.PK [NOT] IN (...)



You could also express those as

  A.PK NOT IN (  SubNotCondition1 UNION SubNotCondition2 ... )
  AND A.PK IN ( SubCondition1 UNION SubCondition2 ... )

which should result in less scanning of the second table.



We are thinking of using a temporary memory table C that holds primary keys of table A, evaluating the subqueries on B 
seperately inserting or removing keys into table C as needed and finally having a single subquery




That might help, and I have done similar things (on other databases).

You probably want a session-local temporary table, and you probably want to do

CREATE IF NOT EXISTS TEMP1
TRUNCATE TEMP1

for each query, rather than creating and dropping it, because CREATE/DROP is 
quite expensive in H2.

--
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/h2-database/84204ee9-6404-35a5-7766-47ddab91d855%40gmail.com.


Re: [h2] Request for advice

2023-04-26 Thread Andreas Reichel
Greetings.

As far as I understand it, Evgenji just committed a change regarding
index use in `IN()` clause.
Maybe try the very lastest GIT Master first.

Also I wonder, why you would not use a JOIN instead of a IN() when you
have a list of FK_A from C:

select *
from a
inner join c
on a.pk = c.fk_a

Why use (uncorrelated?) sub-queries?

Best regards
Andreas

On Wed, 2023-04-26 at 02:03 -0700, Silvio wrote:
> We have some heavy queries that involve selecting records from a base
> cached table A (~100K records) that satisfy a quite a number of
> conditions expressed as
> 
> A.PK [NOT] IN (...)
> 
> on a secondary cached table B (~10M) records. Although the subqueries
> use indexed columns the overall query is very slow.
> 
> We are thinking of using a temporary memory table C that holds
> primary keys of table A, evaluating the subqueries on B seperately
> inserting or removing keys into table C as needed and finally having
> a single subquery
> 
> A.PK NOT IN (SELECT FK_A FROM C)
> 
> Has anoyone ever tried such an approach in H2? Is there any reason to
> expect an improvement in performance in comparison to the single
> large query we have now?
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "H2 Database" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to h2-database+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/h2-database/6e7570af-a74c-4e83-9560-a85cfad1e8d4n%40googlegroups.com
> .

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/h2-database/798e67aa1384cc368d8af4fdcdae95b57e87435c.camel%40manticore-projects.com.


[h2] Request for advice

2023-04-26 Thread Silvio
We have some heavy queries that involve selecting records from a base 
cached table A (~100K records) that satisfy a quite a number of conditions 
expressed as

A.PK [NOT] IN (...)

on a secondary cached table B (~10M) records. Although the subqueries use 
indexed columns the overall query is very slow.

We are thinking of using a temporary memory table C that holds primary keys 
of table A, evaluating the subqueries on B seperately inserting or removing 
keys into table C as needed and finally having a single subquery

A.PK NOT IN (SELECT FK_A FROM C)

Has anoyone ever tried such an approach in H2? Is there any reason to 
expect an improvement in performance in comparison to the single large 
query we have now?

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/h2-database/6e7570af-a74c-4e83-9560-a85cfad1e8d4n%40googlegroups.com.