Hi All,

Been watching recently and decided to have a quick gander, hoping to
reproduce the OP's original issue of slower selects from a table
containing a varchar(10000) field versus a table without one.
 
Test rig:

Centos 5.3 32 bit
Firebird 2.5.0 Classic Server 32 bit

Procedure:

- Created a database containing two tables - DESC_EMPTY (IDOBJ
VARCHAR(20)) and DESC_VARCHAR (IDOBJ VARCHAR(20), COMMENT VARCHAR(10000)).
- Populated DESC_VARCHAR with 50 million test records - IDOBJ populated with
GEN_UUID() and COMMENT with randomly concatenated GEN_UUID()s.
- Populated DESC_EMPTY using "insert into desc_empty select idobj from
desc_varchar;"
- Created an index on each table's IDOBJ column.
- Performed a backup and restore.
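For anyone wanting to reproduce this, the setup can be sketched roughly as
below. This is a sketch of my steps as described above, not the exact script
I ran - in particular the EXECUTE BLOCK loop and the fixed two-UUID COMMENT
are illustrative (my COMMENT values were a random number of concatenations),
and in practice you'd want to commit in batches rather than insert 50M rows
in one transaction:

```sql
-- Create the two test tables
CREATE TABLE DESC_VARCHAR (IDOBJ VARCHAR(20), COMMENT VARCHAR(10000));
CREATE TABLE DESC_EMPTY (IDOBJ VARCHAR(20));

-- Populate DESC_VARCHAR: IDOBJ from GEN_UUID(), COMMENT from
-- concatenated GEN_UUID()s (sketch - batch the commits for real runs)
SET TERM ^ ;
EXECUTE BLOCK AS
  DECLARE i BIGINT = 0;
BEGIN
  WHILE (i < 50000000) DO
  BEGIN
    INSERT INTO DESC_VARCHAR (IDOBJ, COMMENT)
      VALUES (GEN_UUID(), GEN_UUID() || GEN_UUID());
    i = i + 1;
  END
END^
SET TERM ; ^

-- Copy the key column into the table with no COMMENT field
INSERT INTO DESC_EMPTY SELECT IDOBJ FROM DESC_VARCHAR;

-- Index the IDOBJ column on each table
CREATE INDEX DESC_VARCHAR_IDX ON DESC_VARCHAR (IDOBJ);
CREATE INDEX DESC_EMPTY_IDX ON DESC_EMPTY (IDOBJ);

-- Then back up and restore with gbak, e.g.:
--   gbak -b test2.gdb test2.fbk
--   gbak -c test2.fbk test2.gdb
```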

Results:

[root@ppdev prep]# ls -lh
-rw-rw---- 1 root root  22G Mar  9 15:21 test2.gdb

----------------------------------------------------------
-A quick gstat -r to see what's going on
----------------------------------------------------------

[root@ppdev prep]# gstat -r test2.gdb

Database "test2.gdb"
Database header page information:
        Flags                   0
        Checksum                12345
        Generation              16
        Page size               8192
        ODS version             11.2
        Oldest transaction      1
        Oldest active           2
        Oldest snapshot         2
        Next transaction        8
        Bumped transaction      1
        Sequence number         0
        Next attachment ID      2
        Implementation ID       19
        Shadow count            0
        Page buffers            0
        Next header page        0
        Database dialect        3
        Creation date           Mar 9, 2012 14:20:12
        Attributes              force write

    Variable header data:
        Sweep interval:         20000
        *END*


Database file sequence:
File test2.gdb is the only file

Analyzing database pages ...
DESC_EMPTY (129)
    Primary pointer page: 153, Index root page: 154
    Average record length: 25.00, total records: 50000000
    Average version length: 0.00, total versions: 0, max versions: 0
    Data pages: 403226, data page slots: 403226, average fill: 64%
    Fill distribution:
         0 - 19% = 0
        20 - 39% = 0
        40 - 59% = 1
        60 - 79% = 403225
        80 - 99% = 0

    Index DESC_EMPTY_IDX (0)
        Depth: 4, leaf buckets: 123965, nodes: 50000000
        Average data length: 13.32, total dup: 0, max dup: 0
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 0
            40 - 59% = 1
            60 - 79% = 0
            80 - 99% = 123964

DESC_VARCHAR (128)
    Primary pointer page: 151, Index root page: 152
    Average record length: 312.99, total records: 50000000
    Average version length: 0.00, total versions: 0, max versions: 0
    Data pages: 2173914, data page slots: 2173914, average fill: 93%
    Fill distribution:
         0 - 19% = 1
        20 - 39% = 0
        40 - 59% = 0
        60 - 79% = 0
        80 - 99% = 2173913

    Index DESC_VARCHAR_IDX (0)
        Depth: 4, leaf buckets: 125733, nodes: 50000000
        Average data length: 13.32, total dup: 0, max dup: 0
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 0
            40 - 59% = 0
            60 - 79% = 0
            80 - 99% = 125733


----------------------------------------------------------
- Perform each test select twice - the first run populates the OS cache,
so the second run eliminates disk IO. We're interested in the page reads
regardless.
----------------------------------------------------------

[root@ppdev prep]# isql test2.gdb
Database:  test2.gdb
SQL> show tables;
       DESC_EMPTY                             DESC_VARCHAR

SQL> show table desc_varchar;
IDOBJ                           VARCHAR(20) Nullable
COMMENT                         VARCHAR(10000) Nullable
SQL> show table desc_empty;
IDOBJ                           VARCHAR(20) Nullable
SQL> set stats on;
SQL> set plan on;
SQL> select idobj from desc_varchar where idobj = 'MONKEY';

PLAN (DESC_VARCHAR INDEX (DESC_VARCHAR_IDX))
Current memory = 981988
Delta memory = 71796
Max memory = 1067700
Elapsed time= 8.56 sec
Cpu = 0.00 sec
Buffers = 75
Reads = 1167
Writes = 0
Fetches = 1299
SQL> select idobj from desc_empty where idobj = 'MONKEY';

PLAN (DESC_EMPTY INDEX (DESC_EMPTY_IDX))
Current memory = 983476
Delta memory = 1488
Max memory = 1116260
Elapsed time= 1.89 sec
Cpu = 0.00 sec
Buffers = 75
Reads = 238
Writes = 0
Fetches = 273
SQL> select idobj from desc_varchar where idobj = 'MONKEY';

PLAN (DESC_VARCHAR INDEX (DESC_VARCHAR_IDX))
Current memory = 983480
Delta memory = 4
Max memory = 1116260
Elapsed time= 0.01 sec
Cpu = 0.00 sec
Buffers = 75
Reads = 1143
Writes = 0
Fetches = 1143
SQL> select idobj from desc_empty where idobj = 'MONKEY';

PLAN (DESC_EMPTY INDEX (DESC_EMPTY_IDX))
Current memory = 983476
Delta memory = -4
Max memory = 1116260
Elapsed time= 0.00 sec
Cpu = 0.00 sec
Buffers = 75
Reads = 221
Writes = 0
Fetches = 221
SQL> show indexes;
----------------------------------------------------------
- show indices... just to be safe ;)
----------------------------------------------------------
DESC_EMPTY_IDX INDEX ON DESC_EMPTY(IDOBJ)
DESC_VARCHAR_IDX INDEX ON DESC_VARCHAR(IDOBJ)
SQL>

----------------------------------------------------------


Conclusion:

I'm baffled as to why a select (which could never possibly return a result)
reads ~1150 pages from DESC_VARCHAR but only ~230 pages from DESC_EMPTY,
given that the two indices on the IDOBJ columns are virtually identical
(identical data, identical fills, etc.).
I can only imagine that the unrelated contents of a table somehow impact
the performance of an index - no idea why.
Granted, the OS cache in Linux combined with sufficient RAM means that it's
more of a non-issue than an issue for me. I can still imagine that large
databases with a high number of concurrent users querying a random set of
values would see higher disk utilisation than they should.

Anyhow, just thought I'd share the results of the quick test I did. Hope
this helps someone.

Regards,
David
