Just to make sure, I've prepared a simple experiment to measure if 
sub-selects in the EXISTS clause can behave differently depending on 
what and how you select in the sub-query.
If you haven't been following this conversation, please skip the rest of 
this letter as it won't probably mean much to you.

Experiment conditions.

Two tables:

create table t1 (
     f1 integer not null,
     longfield1 char(200) default 'ABCDEFGHIJ',
     primary key(f1)
);

create table t2 (
     f2 integer not null,
     longfield2 char(200) default '0123456789',
     primary key(f2)
);

The longfieldxx is added to check if selecting it in the sub-select 
matters or not. Char(200) for a table with a few million records should 
make a visible difference.

Three flavors of update with EXISTS clause were measured:

[1] (the original query, from Svein's post if I remember correctly)
update t1 a set longfield1='x'
where exists (
     select * from t2 b where a.f1 = b.f2
);

[2] (an improvement suggested by Tom)
update t1 a set longfield1='x'
where exists (
     select 1 from t2 b where a.f1 = b.f2
);

[3] (a further improvement suggested by me)
update t1 a set longfield1='x'
where exists (
     select first(1) 1 from t2 b where a.f1 = b.f2
);

The queries [1]-[3] were executed three times each. After each execution 
the transaction was rolled back.

The measurement was made for two different configurations of the 
contents of t1 and t2.

[A]

Both t1 and t2 consisted of 2 million records. t1.f1 ranged from 1 to 
2000000 and t2.f2 ranged from 2000001 to 4000000. It is easy to see, 
that the intersection of t1 and t2 was empty, so the sub-selects in 
[1]-[3] were empty.
Execution times (three executions for each query):
[1]     11.6s           11.7s           11.6s
[2]     11.6s           11.6s           11.6s
[3]     12.6s           12.6s           12.5s

[B]

Both t1 and t2 consisted of 2 million records, and both t1.f1 and t2.f2 
ranged from 1 to 2000000, so this time the intersection of t1.f1 and 
t2.f2 was maximal, i.e. consisted of all records.
Execution times (three executions for each query):
[1]     165s            162s            163s
[2]     162s            162s            162s
[3]     165s            165s            169s

Conclusion: using select 1 ... instead of select * ... in the sub-select 
doesn't improve anything, while using select first(1) ... even slows 
things down a bit. IMHO, that proves FB handles the sub-selects in an 
efficient manner, i.e. doesn't retrieve irrelevant data and exits the 
sub-select as soon as the first record arrives.

Sorry for this rather long post, but I wanted to share the results so 
that some other people don't have to reinvent the wheel.

regards
Tomasz

-- 
__--==============================--__
__--==     Tomasz Tyrakowski    ==--__
__--==        SOL-SYSTEM        ==--__
__--== http://www.sol-system.pl ==--__
__--==============================--__

Reply via email to