I have always wondered about this, but now I am encountering a serious 
performance problem which results from this policy.

When executing a Query, OJB generates two different rounds of SQL.  The 
first round is used just to determine the Identity of any matching 
objects.  These Identities are then checked against the cache, and the 
database is queried again to materialize any missing objects.

Both rounds retrieve all of the known columns, even though the first 
round needs only the PKs.
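
To make the two rounds concrete, here is a rough Java sketch of the flow as 
I understand it. It is not OJB code: the class name, the maps standing in 
for the table and the cache, and the method names are all made up for 
illustration. The point is only that round one needs nothing but the PKs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from OJB.
public class TwoRoundSketch {
    // Stand-in for the table: PK -> full row (all columns squeezed into one string).
    static final Map<Integer, String> TABLE = new HashMap<>();
    // Stand-in for the object cache: PK -> already materialized object.
    static final Map<Integer, String> CACHE = new HashMap<>();
    // Counts how many objects round two had to load from the table.
    static int materialized = 0;

    // Round one: only identities are needed, so the duplicates produced by
    // the joins can be collapsed on the PK alone.
    static Set<Integer> findIdentities(List<Integer> joinedPks) {
        return new LinkedHashSet<>(joinedPks);
    }

    // Round two: check each identity against the cache and load only the
    // objects that are missing.
    static List<String> materialize(Set<Integer> ids) {
        List<String> result = new ArrayList<>();
        for (Integer id : ids) {
            String obj = CACHE.get(id);
            if (obj == null) {
                obj = TABLE.get(id);
                CACHE.put(id, obj);
                materialized++;
            }
            result.add(obj);
        }
        return result;
    }

    public static void main(String[] args) {
        TABLE.put(1, "row 1");
        TABLE.put(2, "row 2");
        TABLE.put(3, "row 3");
        CACHE.put(2, "row 2"); // object 2 is already cached

        // The joins return the same PK many times before 'distinct'.
        Set<Integer> ids = findIdentities(Arrays.asList(1, 1, 2, 3, 3, 3));
        List<String> objects = materialize(ids);
        System.out.println(ids.size() + " identities, "
                + materialized + " loaded from the table, "
                + objects.size() + " objects returned");
    }
}
```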

My situation: I have a table with about 20 columns, including some large 
text columns.  I'm doing a query which ends up joining to a dozen other 
tables, and I have to use 'distinct'.  Before applying 'distinct', I have 
millions of rows in my result set; afterwards, I have about 700.  If I 
query just for the PK, the query takes a few seconds to return the 700 PK 
values.  If I query for all of the columns, it takes 3 minutes to 
distinct-ify the millions of duplicate rows down to the 700 actually 
distinct ones.

Does anybody have a solution?  Why does the initial query have to retrieve 
all of the columns instead of just the PK(s)?

thanks,
-steve

Steve Clark
ECOS Development Group
[EMAIL PROTECTED]
(970)226-9291

