>Hello,
>
> thanks Set, makes my test unnecessary!
>
> But I'm thinking a little bit about my original problem.
>
> Would it make sense to add a tracker entry for optimization of
> subselects without reference to outer query?
>
> I think that they should get evaluated and transformed to something
> like a "or" connected list of simple compares.
>
> So
> delete from test where Id in (
> select min(t.Id) FROM test t
> group by t.reference, t.key
> having count(*) > 1
> )
>
> could be transformed by the engine to
>
> delete from test where Id = 5 or Id = 6 or Id = 89
>
> in case of
>
> select min(t.Id) FROM test t
> group by t.reference, t.key
> having count(*) > 1
>
> would return 5,6,89
>
> I've found several similar questions in my mail archive so I think
> I'm not the only one with such a problem.
>
> A common use case would be to remove duplicate entries for
> example to create a unique index.
>
> Any comments?
My guess is that it is too early to add this to Firebird. Sure, there are
situations where a non-correlated subselect would be benefitial, but how would
Firebird know when? In your case (since you're deleting from the same table as
you're selecting from), I agree that this could be simple. However, normally
the select would be against a different table and then I think things are not
as simple. Assuming we were talking about two different tables: I don't know
how Firebird translates a query like yours into EXISTS, but theoretically it
could be something like:
delete from DeleteFromTable t
where exists(select * from test t2
left join test t3
on t2.reference = t3.reference
and t2.key = t3.key
and t2.id > t3.id
where t.id = t2.id
and t3.id is null)
If DeleteFromTable was a smallish table, and test was large and sensible
indexes were in place, then my rewritten query would be quicker if your 'min
subquery' returned a lot of rows. On the other hand, if your subquery returned
few rows, then executing the subselect like you suggest would indeed be
quicker. If you add a WHERE clause to your subselect, it is even thinkable
that, say, WHERE MyField = 1 would be quicker using your solution, whereas
WHERE MyField = 2 would be quicker using mine. And the only way for Firebird to
know, would be to have histograms. Histograms was removed from things to be
implemented in version 3 (I think it was announced during the conference, at
least I'm certain that I've heard this), I hope they will appear in Firebird 4.
I would think that once histograms are in place, cases like yours are amongst
those that can be considered for optimization (although I don't know anything
about the source of Firebird).
You are definitely not the only one with this kind of problem, I would
appreciate it myself if it was possible for deletes and updates to use indexes
similar to what is possible with selects (e.g. allowing for JOIN). However, my
knowledge of SQL and Firebird is not good enough to formulate how to do this in
a sensible way. Also, I think that as users of Firebird it is possible to
rewrite our slow queries to more performant ones in most cases, so to me this
problem is not all too big.
I have never needed expression indexes myself, but have you looked into writing
one to see if this can make your query execute (considerably) faster?
Sorry for not agreeing with you in that this is a useful optimization just yet,
if you're lucky people knowing something about the source code of Firebird will
disagree with me.
Set