Dan Karp wrote:
I am working with Derby 10.2.2. This SQL query has an unacceptable
performance:
SELECT * FROM registryobject WHERE id IN (SELECT targetObject FROM
association WHERE UPPER(sourceobject) = UPPER(?))
https://issues.apache.org/jira/browse/DERBY-47
Fixed in 10.3.0.0.
Unfortunately, I don't think DERBY-47 is at play here. The query in question
has a subquery within an IN clause, while DERBY-47 added an optimization that
only applies to IN lists with constants and/or parameters. So my guess is that
the DERBY-47 fix will not help for this particular query.
> Is this a bug in derby or something else? Am I missing something?
I think it's "something else": i.e. an opportunity for improved performance of
subqueries in a WHERE clause. Upon quick inspection it looks like this is the
same issue that is described in DERBY-2231--at least, based on the
"testcase1.sql" script attached to that issue. The problem is that, instead of
executing the subquery once and then using the result in the IN clause, Derby
executes the subquery for every row of the outer query, which is really expensive.
> Now, if I break this query into two:
>
> SELECT targetObject FROM association WHERE UPPER(sourceobject) = UPPER(?)
>
> SELECT * FROM registryobject WHERE id IN ('targetObject.id1',
> 'targetObject.id2', ...)
>
> I get acceptable performance.
With this split query approach, you're only executing the subquery one
time--thus you should indeed see better performance. And further, if the second
query uses parameters for the values in the IN list, then that specific query
could potentially benefit from the DERBY-47 changes. But the original query
with the subquery is, I think, a different problem (DERBY-2231).
Army