Dan Karp wrote:
I am working with Derby 10.2.2.  This SQL query has an unacceptable
performance:

SELECT * FROM registryobject WHERE id IN (SELECT targetObject FROM
association WHERE UPPER(sourceobject) = UPPER(?))

https://issues.apache.org/jira/browse/DERBY-47

Fixed in 10.3.0.0.

Unfortunately, I don't think DERBY-47 is at play here. The query in question has a subquery within an IN clause, while DERBY-47 added an optimization that only applies to IN lists with constants and/or parameters. So my guess is that the DERBY-47 fix will not help for this particular query.

> Is this a bug in derby or something else?  Am I missing something?

I think it's "something else": i.e. an opportunity for improved performance of subqueries in a WHERE clause. Upon quick inspection it looks like this is the same issue that is described in DERBY-2231--at least, based on the "testcase1.sql" script attached to that issue. The problem is that, instead of executing the subquery once and then using the result in the IN clause, Derby executes the subquery for every row of the outer query, which is really expensive.

> Now, if I break this query into two:
>
> SELECT targetObject FROM association WHERE UPPER(sourceobject) = UPPER(?)
>
> SELECT * FROM registryobject WHERE id IN ('targetObject.id1',
> 'targetObject.id2', ...)
>
> I get acceptable performance.

With this split query approach, you're only executing the subquery one time--thus you should indeed see better performance. And further, if the second query uses parameters for the values in the IN list, then that specific query could potentially benefit from the DERBY-47 changes. But the original query with the subquery is, I think, a different problem (DERBY-2231).

Army

Reply via email to