performance degradation using between or greater than/smaller than operator in
subselect clause
-----------------------------------------------------------------------------------------------
Key: CORE-5369
URL: http://tracker.firebirdsql.org/browse/CORE-5369
Project: Firebird Core
Issue Type: Bug
Components: Engine
Affects Versions: 2.5.5
Environment: centos 7.2.511, 2 x 6 core Xeon E5-1650, 128 GByte RAM,
running on SSD using package firebird-superclassic.x86_64 from EPEL repo for
centos 7
Reporter: Dietrich Streifert
Attachments: fbissue-core-5369.zip
We have some query statements using subselects where both the main select and
the subselect use aggregate functions like avg, count, min and max.
We have massive performance degradation if we use between in the clause of the
subselect or if we use greater than and smaller than equivalent to the between
operator.
The select and subselect involves two tables, where the master table has ~
144000 entries and the detail table ~ 560000 entries.
Here is the DDL for the two tables:
create table master (
id integer not null,
nr integer not null
);
alter table master add constraint pk_master primary key (id, nr);
create table detail (
id integer not null,
nr integer not null,
ts timestamp not null,
tday integer,
val integer
);
alter table detail add constraint pk_detail primary key (id, nr, ts);
create index idx_detail_tday on detail (tday);
create index idx_detail_val on detail (val);
And here the is query which is causing the degradation, with an execution time
of ~ 8 minutes, whereby it does not differ if a between operator is used or a
greater than / lower than pair ( d.tday >= 1 and d.tday <= 14 ):
select max( (
select avg(val) from detail d
where d.id = m.id and d.nr = m.nr and
d.tday between 1 and 14
) ) as mvalue, count(*) as num
from master m;
If the between operator is replaced by an 'in' operator like this, the
execution time drops to ~ 1 second:
select max( (
select avg(val) from detail d
where d.id = m.id and d.nr = m.nr and
d.tday in ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 )
) ) as mvalue, count(*) as num
from master m;
The DDL, Data and DML used is attached as ZIP file to this issue, containing
the test data which is needed to reproduce this behaviour.
Note that if the in set contains only few values like ( 1, 2, 3 ) there is not
difference between both versions of the query.
>From my understanding the in operator version should be slower than the
>between operator version.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://tracker.firebirdsql.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel