LIKE operator: make its performance insensitive to the length of string when pattern is immediatelly founded ------------------------------------------------------------------------------------------------------------
Key: CORE-4306 URL: http://tracker.firebirdsql.org/browse/CORE-4306 Project: Firebird Core Issue Type: Improvement Components: Engine Affects Versions: 3.0 Alpha 1 Reporter: Pavel Zotov Priority: Minor It was found that performance of LIKE operator becomes very poor in case when string have big length - e.g. 32K - though the character that matches to pattern is just at the beginning of this string. TEST: ===== set stat on; set term ^; execute block returns(cnt int) as declare n int = 10000000; declare s varchar(32760); begin cnt=0; s=rpad('q',32760,' '); -- string has huge length but BEGINS with character `q` while (n>0) do begin cnt = cnt + iif( s like '%q%', 1, 0); -- such pattern should be found immediatelly n=n-1; end suspend; end^ set term ;^ set stat off; output: ====== Current memory = 2450596208 Delta memory = 390320 Max memory = 2450628912 Elapsed time= 34.18 sec -- <<<<<<<<<<<<<<<<<<<<< :((( Cpu = 0.00 sec Buffers = 524288 Reads = 16 Writes = 0 Fetches = 37 SQL> set term ;^ SQL> set stat off; Compare with: ============ set stat on; set term ^; execute block returns(cnt int) as declare n int = 10000000; declare s varchar(50); begin cnt=0; s=rpad('q',50,' '); -- same string, but of small length while (n>0) do begin cnt = cnt + iif( s like '%q%', 1, 0); n=n-1; end suspend; end^ set term ;^ set stat off; output: ===== Current memory = 2450530424 Delta memory = -65784 Max memory = 2450628912 Elapsed time= 6.68 sec -- <<<<<<<<<<<<<<<<< much better Cpu = 0.00 sec Buffers = 524288 Reads = 0 Writes = 0 Fetches = 2 PS. compare with Java, (only FYI): ================== import static java.lang.System.*; import java.util.regex.*; class MatchesCount { public static void main(String[] args) { int pad = args.length > 0 ? Integer.parseInt(args[0]) : 32760; String s = String.format("%1$-" + pad + "s", "q"); //out.println("s=>"+s+"<"); Pattern p = Pattern.compile(".*?q.*?"); Matcher m; int cnt=0; long t0 = currentTimeMillis(); for(int i=0; i < 10000000; i++) { m=p.matcher(s); cnt += m.find() ? 1 : 0; } out.println("s.length()="+s.length()+", cnt="+cnt+", done at "+( currentTimeMillis() - t0 )+" ms"); } } ==== D:\JAVA>java MatchesCount 10 s.length()=10, cnt=10000000, done at 5562 ms D:\JAVA>java MatchesCount s.length()=32760, cnt=10000000, done at 5860 ms D:\JAVA>java MatchesCount 256000 s.length()=256000, cnt=10000000, done at 5594 ms D:\JAVA>java MatchesCount 512000 s.length()=512000, cnt=10000000, done at 5735 ms As we can see, Java has regexp engine that is NO sensitive to the length of source string (in case when pattern is founded immediatelly). -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://tracker.firebirdsql.org/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira ------------------------------------------------------------------------------ Rapidly troubleshoot problems before they affect your business. Most IT organizations don't have a clear picture of how application performance affects their revenue. With AppDynamics, you get 100% visibility into your Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro! http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel