2007/5/4, Army <[EMAIL PROTECTED]>:
Paulo Jesus wrote:
>
> I have this query
>
> SELECT J.i_id,J.i_thumbnail from item Jdb-derby-10.2.2.0-bin t where J.i_id
in ((SELECT
> i_related1 FROM item WHERE i_id=3556) union (SELECT i_related2 FROM
> item WHERE i_id=3556) union (SELECT i_related3 FROM item WHERE
> i_id=3556) union (SELECT i_related4 FROM item WHERE i_id=3556) union
> (SELECT i_related5 FROM item WHERE i_id=3556));
>
> ... does anybody see any other possibility on rewriting this query
> besides those presented below ?
Can you indicate which version of Derby you are using?
I'm using a derby network framework version 10.2.2.0
Does Derby use the index for the rewrites that you show below?
I can't say now. The rewrites suggestion had been from the person
witch I'm working it.
> rs = SELECT i_related1,i_related2,i_related3,i_related4 FROM item
> WHERE i_id=3556
> SELECT J.i_id,J.i_thumbnail from item J where J.i_id in (rs)
>
> or
>
> tmp_table = SELECT i_related1 as col FROM item WHERE i_id=3556 union
> SELECT i_related2 as col FROM item WHERE i_id=3556 ...
> SELECT J.i_id,J.i_thumbnail from item, tmp_table where i_id = col;
> (Column i_id has a unique index thus guaranteeing that the inner joint
> does not change the result set cardinality.)
If the above queries lead to the desired behavior (i.e. Derby uses the index),
then you might want to try something like:
SELECT J.i_id, J.i_thumbnail FROM
item J,
(SELECT i_related1 as col FROM item WHERE i_id=3556
UNION SELECT i_related2 FROM item WHERE i_id=3556
UNION SELECT i_related3 FROM item WHERE i_id=3556
UNION SELECT i_related4 FROM item WHERE i_id=3556
UNION SELECT i_related5 FROM item WHERE i_id=3556
) X (id)
WHERE (J.i_id = X.id)
This is similar to the "tmp_table" approach, except that you are putting the
"temp" query directly into the FROM list instead of using a separate statement.
With this statement the optimizer may attempt to materialize the ITEM table
into memory and then do a hash join with it. If you don't want that then you
can use optimizer directives to force use of the index and/or nested loop join
(assuming it doesn't choose that already). For ex:
SELECT J.i_id, J.i_thumbnail FROM
item J --DERBY-PROPERTIES index=I_ID
, (SELECT i_related1 as col FROM item WHERE i_id=3556
UNION SELECT i_related2 FROM item WHERE i_id=3556
UNION SELECT i_related3 FROM item WHERE i_id=3556
UNION SELECT i_related4 FROM item WHERE i_id=3556
UNION SELECT i_related5 FROM item WHERE i_id=3556
) X (id)
WHERE (J.i_id = X.id)
Your suggestion work without forcing optimizer.
But this optimizer directive will be very useful.
That said, though, I wonder if this type of query is an indication that the
database schema should be re-examined? I.e. would it make sense to store the
"related" ids in a separate table somehow so that you can just join directly
with them?
This is from TPC-W test. We are trying to understand the performance
problems in derby.
PJ
Army