performance degradation using between or greater than/smaller than operator in 
subselect clause
-----------------------------------------------------------------------------------------------

                 Key: CORE-5369
                 URL: http://tracker.firebirdsql.org/browse/CORE-5369
             Project: Firebird Core
          Issue Type: Bug
          Components: Engine
    Affects Versions: 2.5.5
         Environment: centos 7.2.511, 2 x 6 core Xeon E5-1650, 128 GByte RAM, 
running on SSD using package  firebird-superclassic.x86_64 from EPEL repo for 
centos 7
            Reporter: Dietrich Streifert
         Attachments: fbissue-core-5369.zip

We have some query statements using subselects where both the main select and 
the subselect use aggregate functions like avg, count, min and max.

We have massive performance degradation if we use between in the clause of the 
subselect or if we use greater than and smaller than equivalent to the between 
operator.

The select and subselect involves two tables, where the master table has ~ 
144000 entries and the detail table ~ 560000 entries.

Here is the DDL for the two tables:

create table master (
  id integer not null,
  nr integer not null
);
alter table master add constraint pk_master primary key (id, nr);

create table detail (
  id integer not null,
  nr integer not null,
  ts timestamp not null,
  tday integer,
  val integer
);
alter table detail add constraint pk_detail primary key (id, nr, ts);
create index idx_detail_tday on detail (tday);
create index idx_detail_val on detail (val);

And here the is query which is causing the degradation, with an execution time 
of ~ 8 minutes, whereby it does not differ if a between operator is used or a 
greater than / lower than pair ( d.tday >= 1 and d.tday <= 14 ):

select max( (
  select avg(val) from detail d
  where d.id = m.id and d.nr = m.nr and
    d.tday between 1 and 14
) ) as mvalue, count(*) as num
from master m;

If the between operator is replaced by an 'in' operator like this, the 
execution time drops to ~ 1 second:

select max( (
  select avg(val) from detail d
  where d.id = m.id and d.nr = m.nr and
    d.tday in ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 )
) ) as mvalue, count(*) as num
from master m;

The DDL, Data and DML used is attached as ZIP file to this issue, containing 
the test data which is needed to reproduce this behaviour.

Note that if the in set contains only few values like ( 1, 2, 3 ) there is not 
difference between both versions of the query.


>From my understanding the in operator version should be slower than the 
>between operator version.


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://tracker.firebirdsql.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel

Reply via email to