LIKE operator: make its performance insensitive to the length of string when 
pattern is immediatelly founded
------------------------------------------------------------------------------------------------------------

                 Key: CORE-4306
                 URL: http://tracker.firebirdsql.org/browse/CORE-4306
             Project: Firebird Core
          Issue Type: Improvement
          Components: Engine
    Affects Versions: 3.0 Alpha 1
            Reporter: Pavel Zotov
            Priority: Minor


It was found that performance of LIKE operator becomes very  poor in case when 
string have big length - e.g. 32K - though the character that matches to 
pattern is just at the beginning of this string.

TEST:
=====
set stat on;
set term ^;
execute block returns(cnt int) as
declare n int = 10000000;
declare s varchar(32760);
begin
  cnt=0;
  s=rpad('q',32760,' '); -- string has huge length but BEGINS with character `q`
  while (n>0) do begin
    cnt = cnt + iif( s like '%q%', 1, 0); -- such pattern should be found 
immediatelly
    n=n-1;
  end
  suspend;
end^
set term ;^
set stat off;

output:
======
Current memory = 2450596208
Delta memory = 390320
Max memory = 2450628912
Elapsed time= 34.18 sec -- <<<<<<<<<<<<<<<<<<<<< :(((
Cpu = 0.00 sec
Buffers = 524288
Reads = 16
Writes = 0
Fetches = 37
SQL> set term ;^
SQL> set stat off;


Compare with:
============

set stat on;
set term ^;
execute block returns(cnt int) as
declare n int = 10000000;
declare s varchar(50); 
begin
  cnt=0;
  s=rpad('q',50,' '); -- same string, but of small length 
  while (n>0) do begin
    cnt = cnt + iif( s like '%q%', 1, 0);
    n=n-1;
  end
  suspend;
end^
set term ;^
set stat off;


output:
=====
Current memory = 2450530424
Delta memory = -65784
Max memory = 2450628912
Elapsed time= 6.68 sec -- <<<<<<<<<<<<<<<<< much better
Cpu = 0.00 sec
Buffers = 524288
Reads = 0
Writes = 0
Fetches = 2

PS. compare with Java, (only FYI):
==================

import static java.lang.System.*;
import java.util.regex.*;
class MatchesCount {
  public static void main(String[] args) {
    int pad = args.length > 0 ? Integer.parseInt(args[0]) : 32760;
    String s = String.format("%1$-" + pad + "s", "q");
    //out.println("s=>"+s+"<");
    Pattern p = Pattern.compile(".*?q.*?");
    Matcher m;
    int cnt=0;
    long t0 = currentTimeMillis();
    for(int i=0; i < 10000000; i++) {
      m=p.matcher(s);
      cnt += m.find() ? 1 : 0;
    }
    out.println("s.length()="+s.length()+", cnt="+cnt+", done at "+(  
currentTimeMillis() - t0 )+" ms");
  }
}

====
D:\JAVA>java MatchesCount 10
s.length()=10, cnt=10000000, done at 5562 ms

D:\JAVA>java MatchesCount
s.length()=32760, cnt=10000000, done at 5860 ms

D:\JAVA>java MatchesCount 256000
s.length()=256000, cnt=10000000, done at 5594 ms

D:\JAVA>java MatchesCount 512000
s.length()=512000, cnt=10000000, done at 5735 ms

As we can see, Java has regexp engine that is NO sensitive to the length of 
source string (in case when pattern is founded immediatelly).

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://tracker.firebirdsql.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

------------------------------------------------------------------------------
Rapidly troubleshoot problems before they affect your business. Most IT 
organizations don't have a clear picture of how application performance 
affects their revenue. With AppDynamics, you get 100% visibility into your 
Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel

Reply via email to