Inefficient memory usage and speed of SIMILAR TO when handling long text BLOB ------------------------------------------------------------------------------
Key: CORE-3773 URL: http://tracker.firebirdsql.org/browse/CORE-3773 Project: Firebird Core Issue Type: Task Affects Versions: 2.5.1 Reporter: Pavel Zotov Consider the following task: a very long text (blob) have to be compared with some pattern via SIMILAR TO. The pattern can be very simple, even single character. On my PC with 1 Gb the following test will fail after blob exceeds *EXACTLY* 8 Mb of length: ---------------------- SQL> show version; ISQL Version: WI-V2.5.2.26390 Firebird 2.5 Server version: Firebird/x86/Windows NT (access method), version "WI-V2.5.2.26426 Firebird 2.5" Firebird/x86/Windows NT (remote server), version "WI-V2.5.2.26426 Firebird 2.5/XNet (BALAHA)/P12" Firebird/x86/Windows NT (remote interface), version "WI-V2.5.2.26390 Firebird 2.5/XNet (BALAHA)/P12" on disk structure version 11.2 --------------------- recreate table tmpnum(id int); recreate table tb(id int not null, s blob); commit; set term ^; execute block as declare n int = 999; declare n0 int; declare x blob; begin n0=n; while (n>=0) do begin insert into tmpnum(id) values(:n); n=n-1; end x=(select list( replace(uuid_to_char(gen_uuid()),'-','') ) from tmpnum t1, tmpnum t2 where t1.id*1000+t2.id < 254200 ); insert into tb values(1, :x); delete from tmpnum; end^ set term ;^ commit; drop table tmpnum; commit; select char_length(s) bLen, cast(char_length(s) as double precision)/1024/1024 mbLen from tb; commit; select iif( s similar to '%'||cast(left(s,1) as varchar(1))||'%', 1, 0) is_matched from tb; rollback; If I replace 'magic number' 254200 with 254201 the following error will be: IS_MATCHED ============ Statement failed, SQLSTATE = HY001 unable to allocate memory from operating system After line 34 in file filblob But I have about 700 Mb free memory. ---------------------------------------------------------------------------------------------------------------------------------------------------------- When I run the same test on linux-server with 48 Gb RAM then this limit is much greater. So I have increased the size of blob up to ~315 Mb I could not etsablish the limit like it was in my PC becase segfault occured and core dumped. Database file is about 350 Mb: ls -la t.fdb -rw-rw---- 1 firebird firebird 374865920 Feb 25 14:21 t.fdb This is the testcase for LINUX: --------------------- SQL> show version; ISQL Version: LI-V2.5.2.26390 Firebird 2.5 Server version: Firebird/linux AMD64 (access method), version "LI-V2.5.2.26390 Firebird 2.5" on disk structure version 11.2 --------------------- recreate table tmpnum(id int); recreate table tb(id int not null, s blob); commit; set term ^; execute block as declare n int = 9999; declare n0 int; declare x blob; begin n0=n; while (n>=0) do begin insert into tmpnum(id) values(:n); n=n-1; end x=(select list( replace(uuid_to_char(gen_uuid()),'-','') ) from tmpnum t1, tmpnum t2 where t1.id*10000+t2.id < 10000000 ); insert into tb values(1, :x); delete from tmpnum; end^ set term ;^ commit; -- 5min drop table tmpnum; commit; select char_length(s) bLen, cast(char_length(s) as double precision)/1024/1024 mbLen from tb; commit; select iif( s similar to '%'||cast(left(s,1) as varchar(1))||'%', 1, 0) is_matched from tb; rollback; And these line are from `top` utility - please note that memory consumption of isq is extremely huge: top - 14:21:53 up 78 days, 4:56, 3 users, load average: 1.30, 0.69, 0.42 Tasks: 278 total, 3 running, 275 sleeping, 0 stopped, 0 zombie Cpu(s): 11.4%us, 5.3%sy, 0.0%ni, 67.9%id, 15.0%wa, 0.0%hi, 0.4%si, 0.0%st Mem: 49548000k total, 49399092k used, 148908k free, 6304k buffers Swap: 51367828k total, 602964k used, 50764864k free, 12856724k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 9923 firebird 20 0 40.9g 31g 6056 R 99.2 66.4 3:57.00 isql 82 root 20 0 0 0 0 S 11.6 0.0 1:06.29 kswapd1 #################################################### [firebird@firebirdG ~]$ top top - 14:23:11 up 78 days, 4:57, 3 users, load average: 8.40, 3.19, 1.33 Tasks: 278 total, 2 running, 276 sleeping, 0 stopped, 0 zombie Cpu(s): 0.8%us, 10.7%sy, 0.0%ni, 64.9%id, 23.0%wa, 0.0%hi, 0.7%si, 0.0%st Mem: 49548000k total, 49411152k used, 136848k free, 9584k buffers Swap: 51367828k total, 724968k used, 50642860k free, 13152596k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 81 root 20 0 0 0 0 D 24.6 0.0 1:02.23 kswapd0 9923 firebird 20 0 40.9g 31g 6064 D 21.9 66.2 4:19.61 isql #################################################### top - 14:26:09 up 78 days, 5:00, 3 users, load average: 7.99, 5.26, 2.46 Tasks: 279 total, 1 running, 278 sleeping, 0 stopped, 0 zombie Cpu(s): 1.1%us, 6.0%sy, 0.0%ni, 59.4%id, 33.2%wa, 0.0%hi, 0.4%si, 0.0%st Mem: 49548000k total, 49404064k used, 143936k free, 19320k buffers Swap: 51367828k total, 2581320k used, 48786508k free, 13678892k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 9923 firebird 20 0 40.9g 30g 5960 D 13.9 65.1 4:38.79 isql 9276 root 20 0 0 0 0 S 11.9 0.0 1:39.96 vmware-rtc ------------- PS. I can not understand why SIMILAR_TO continues even when matching is establish - please note that in these tests we must to find matching blob to the first character of *this* blob. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://tracker.firebirdsql.org/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira ------------------------------------------------------------------------------ Virtualization & Cloud Management Using Capacity Planning Cloud computing makes use of virtualization - but cloud computing also focuses on allowing computing to be delivered as a service. http://www.accelacomm.com/jaw/sfnl/114/51521223/ Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel