Hi All, Been watching recently and decided to have a quick gander hoping to reproduce the OPs original issues with slower selects from a table containing a varchar(10000) versus a table containing no varchar(10000) field. Test rig:
Centos 5.3 32 bit Firebird 2.5.0 Classic Server 32 bit Procedure: - I created a database containing two tables - DESC_EMPTY (IDOBJ VARCHAR(20)) and DESC_VARCHAR (IDOBJ VARCHAR(20), COMMENT VARCHAR(10000)). - Populated DESC_VARCHAR with 50 million test records - IDOBJ populated with GEN_UUID() and COMMENT populated with random concatenated GEN_UUID()s. - Populated DESC_EMPTY using "insert into desc_empty select idobj from desc_varchar;" - Create and index on each tables idobj column. - Perform a backup and restore. Results: [root@ppdev prep]# ls -lh -rw-rw---- 1 root root 22G Mar 9 15:21 test2.gdb ---------------------------------------------------------- -A quick gstat -r to see what's going on ---------------------------------------------------------- [root@ppdev prep]# gstat -r test2.gdb Database "test2.gdb" Database header page information: Flags 0 Checksum 12345 Generation 16 Page size 8192 ODS version 11.2 Oldest transaction 1 Oldest active 2 Oldest snapshot 2 Next transaction 8 Bumped transaction 1 Sequence number 0 Next attachment ID 2 Implementation ID 19 Shadow count 0 Page buffers 0 Next header page 0 Database dialect 3 Creation date Mar 9, 2012 14:20:12 Attributes force write Variable header data: Sweep interval: 20000 *END* Database file sequence: File test2.gdb is the only file Analyzing database pages ... DESC_EMPTY (129) Primary pointer page: 153, Index root page: 154 Average record length: 25.00, total records: 50000000 Average version length: 0.00, total versions: 0, max versions: 0 Data pages: 403226, data page slots: 403226, average fill: 64% Fill distribution: 0 - 19% = 0 20 - 39% = 0 40 - 59% = 1 60 - 79% = 403225 80 - 99% = 0 Index DESC_EMPTY_IDX (0) Depth: 4, leaf buckets: 123965, nodes: 50000000 Average data length: 13.32, total dup: 0, max dup: 0 Fill distribution: 0 - 19% = 0 20 - 39% = 0 40 - 59% = 1 60 - 79% = 0 80 - 99% = 123964 DESC_VARCHAR (128) Primary pointer page: 151, Index root page: 152 Average record length: 312.99, total records: 50000000 Average version length: 0.00, total versions: 0, max versions: 0 Data pages: 2173914, data page slots: 2173914, average fill: 93% Fill distribution: 0 - 19% = 1 20 - 39% = 0 40 - 59% = 0 60 - 79% = 0 80 - 99% = 2173913 Index DESC_VARCHAR_IDX (0) Depth: 4, leaf buckets: 125733, nodes: 50000000 Average data length: 13.32, total dup: 0, max dup: 0 Fill distribution: 0 - 19% = 0 20 - 39% = 0 40 - 59% = 0 60 - 79% = 0 80 - 99% = 125733 ---------------------------------------------------------- - Perform each test select twice - initially this will populate the OS cache. The second run will eliminate disk IO - we're interested in the page reads regardless. ---------------------------------------------------------- [root@ppdev prep]# isql test2.gdb Database: test2.gdb SQL> show tables; DESC_EMPTY DESC_VARCHAR SQL> show table desc_varchar; IDOBJ VARCHAR(20) Nullable COMMENT VARCHAR(10000) Nullable SQL> show table desc_empty; IDOBJ VARCHAR(20) Nullable SQL> set stats on; SQL> set plan on; SQL> select idobj from desc_varchar where idobj = 'MONKEY'; PLAN (DESC_VARCHAR INDEX (DESC_VARCHAR_IDX)) Current memory = 981988 Delta memory = 71796 Max memory = 1067700 Elapsed time= 8.56 sec Cpu = 0.00 sec Buffers = 75 Reads = 1167 Writes = 0 Fetches = 1299 SQL> select idobj from desc_empty where idobj = 'MONKEY'; PLAN (DESC_EMPTY INDEX (DESC_EMPTY_IDX)) Current memory = 983476 Delta memory = 1488 Max memory = 1116260 Elapsed time= 1.89 sec Cpu = 0.00 sec Buffers = 75 Reads = 238 Writes = 0 Fetches = 273 SQL> select idobj from desc_varchar where idobj = 'MONKEY'; PLAN (DESC_VARCHAR INDEX (DESC_VARCHAR_IDX)) Current memory = 983480 Delta memory = 4 Max memory = 1116260 Elapsed time= 0.01 sec Cpu = 0.00 sec Buffers = 75 Reads = 1143 Writes = 0 Fetches = 1143 SQL> select idobj from desc_empty where idobj = 'MONKEY'; PLAN (DESC_EMPTY INDEX (DESC_EMPTY_IDX)) Current memory = 983476 Delta memory = -4 Max memory = 1116260 Elapsed time= 0.00 sec Cpu = 0.00 sec Buffers = 75 Reads = 221 Writes = 0 Fetches = 221 SQL> show indexes; ---------------------------------------------------------- - show indices... just to be safe ;) ---------------------------------------------------------- DESC_EMPTY_IDX INDEX ON DESC_EMPTY(IDOBJ) DESC_VARCHAR_IDX INDEX ON DESC_VARCHAR(IDOBJ) SQL> ---------------------------------------------------------- Conclusion: Baffled why a select (which could never possibly return a result) reads ~1150 pages from DESC_VARCHAR but only ~230 pages from DESC_EMPTY given that the two indices on IDOBJ columns are virtually identical (identical data, identical fills etc). I can only image that the unrelated contents of a table somehow impact on the performance of an index - no idea why. Granted, the OS cache in Linux combined with sufficient RAM means that it's more of a non-issue than an issue for me. I can still imagine that large databases with a high number of concurrent users querying a random set of values would see higher disk utilisation than they should. Anyhow, just thought I'd share the results of the quick test I did. Hope this helps someone. Regards, David