Werner,
form_data, page_variable, cookie, request_header and response_header are
all dependent on web_resource:
web_resource 1:m form_data
web_resource 1:m page_variable
web_resource 1:m cookie
web_resource 1:m request_header
web_resource 1:m response_header
So Castor will construct a single joined query:
SELECT ......
FROM web_resource a1
LEFT OUTER JOIN form_data ON (a1.web_resource_id=form_data.web_resource_id),web_resource a2
LEFT OUTER JOIN cookie ON (a2.web_resource_id=cookie.web_resource_id),web_resource a3
LEFT OUTER JOIN page_variable ON (a3.web_resource_id=page_variable.web_resource_id),web_resource a4
LEFT OUTER JOIN response_header ON (a4.web_resource_id=response_header.web_resource_id),web_resource a5
LEFT OUTER JOIN request_header ON (a5.web_resource_id=request_header.web_resource_id)
WHERE a2.web_resource_id=a4.web_resource_id
AND a2.web_resource_id=a1.web_resource_id
AND a2.web_resource_id=a5.web_resource_id
AND a2.web_resource_id=a3.web_resource_id
AND a2.web_resource_id = ?
The size of the result set is basically the Cartesian product of all the
dependent objects. With up to n rows per dependent table:
result size = form_data * cookie * page_variable * response_header * request_header = O(n^5)
Castor should probably use separate queries to load
the dependent objects.
-- load the object itself
select *
from web_resource
where web_resource_id = ?
-- load the dependent objects
select *
from cookie
where web_resource_id = ?
select *
from page_variable
where web_resource_id = ?
:
:
The result size = 1 + form_data + cookie + page_variable + response_header + request_header =
O(5n), i.e. linear in n.
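To make the difference concrete, the two formulas can be evaluated for a sample row count. This is only an illustrative sketch: the class name ResultSizeEstimate and the figure of 10 rows per dependent table are assumptions, not measurements from a real database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; it is not part of Castor.
final class ResultSizeEstimate {

    /** Rows returned by one query that joins every dependent table: the product of the counts. */
    static long joinedRows(Map<String, Integer> dependentCounts) {
        long rows = 1;
        for (int count : dependentCounts.values()) {
            // A LEFT OUTER JOIN still yields one row when a dependent table has no match.
            rows *= Math.max(count, 1);
        }
        return rows;
    }

    /** Rows returned by one query for the master plus one query per dependent table. */
    static long separateRows(Map<String, Integer> dependentCounts) {
        long rows = 1; // the web_resource row itself
        for (int count : dependentCounts.values()) {
            rows += count;
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed sample data: 10 rows in each dependent table for one web_resource.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("form_data", 10);
        counts.put("cookie", 10);
        counts.put("page_variable", 10);
        counts.put("response_header", 10);
        counts.put("request_header", 10);

        System.out.println("joined query rows:     " + joinedRows(counts));
        System.out.println("separate queries rows: " + separateRows(counts));
    }
}
```

With 10 rows per dependent table, the joined query returns 100000 rows while the separate queries return 51 rows in total.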
Suggested implementation:
----------------------------------------------------------------------------------------------------------------------------------------------------
Change class org.exolab.castor.persist.spi.QueryExpression.
Modify this class to be a linked list of QueryExpression objects by adding a
nextExpression field.
The first expression is the expression for the master object, and subsequent
expressions are for the dependent objects:
[master]---->[depend1]---->[depend2]---->[depend3]---->null
where depend1, depend2, and depend3 are child dependent objects of
master.
----------------------------------------------------------------------------------------------------------------------------------------------------
Change the org.exolab.castor.jdo.engine.SQLEngine.buildFinder()
method.
Modify this method to build the linked list of QueryExpression objects instead
of the joined expression.
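A rough sketch of what the chained structure might look like follows. It is not Castor's actual code: ChainedQuery is a hypothetical stand-in for QueryExpression, its buildFinder() takes plain table and column names rather than Castor's class descriptors, and the generated SQL simply mirrors the separate queries proposed earlier in this mail.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a linked QueryExpression; names and signatures are assumptions.
final class ChainedQuery {
    final String sql;
    ChainedQuery next; // expression for the next dependent object, or null at the end of the chain

    ChainedQuery(String sql) {
        this.sql = sql;
    }

    /** Builds master -> depend1 -> depend2 -> ... -> null, one simple query per table. */
    static ChainedQuery buildFinder(String masterTable, String keyColumn, List<String> dependentTables) {
        ChainedQuery head = new ChainedQuery(select(masterTable, keyColumn));
        ChainedQuery tail = head;
        for (String table : dependentTables) {
            tail.next = new ChainedQuery(select(table, keyColumn));
            tail = tail.next;
        }
        return head;
    }

    private static String select(String table, String keyColumn) {
        return "SELECT * FROM " + table + " WHERE " + keyColumn + " = ?";
    }

    public static void main(String[] args) {
        ChainedQuery finder = buildFinder("web_resource", "web_resource_id",
                Arrays.asList("form_data", "cookie", "page_variable",
                        "response_header", "request_header"));
        // Walk the chain; a real engine would execute each statement with the same id bound to "?".
        for (ChainedQuery q = finder; q != null; q = q.next) {
            System.out.println(q.sql);
        }
    }
}
```

Each node carries one simple statement, so the engine would run six small queries for web_resource instead of one five-way join.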
---------------------------------------------------------------------------------------------------------------------------------------------------
I believe the suggested approach will have a significant performance
improvement for objects that have a large number of dependent objects.
I would like to know your thoughts.
Steve
----- Original Message -----
From: "Werner Guttmann" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, June 28, 2004 3:51 PM
Subject: Re: [castor-dev] JDO object creation performance flaw
> Stephen,
>
> How exactly are the relations between web_resource and all the other classes defined? Are they indeed DEPEND relations, or plain 1:M relations?
>
> Werner
>
> --Original Message Text---
> From: Stephen Ince
> Date: Sat, 26 Jun 2004 13:06:54 -0400
>
>
> I think that the way Castor constructs the select query for object population has a performance flaw.
>
> In the example below, form_data, cookie, page_variable, response_header and request_header are all child objects of web_resource.
> The query takes considerably longer if there is data for the dependent objects. Basically, a lot of redundant information is being sent back to
> Castor.
>
>
> SELECT
> a2.web_resource_id,a2.parent_web_resource_id,a2.user_scenario_id,a2.name,a2.url,a2.response_page_size,a2.request_think_time,
> a2.response_page_time,a2.page_number,a2.payload,a2.request_content_type,a2.soap_action,a2.response_content_type,a2.accept_language,a2.method,
> a2.user_name,a2.response_result_code,a2.protocol,a2.response_valid_expr,a2.response_valid_type,a2.response_error_expr,a2.response_error_type,a2.passwd,
> a2.auth_type,form_data.form_data_id,cookie.id,page_variable.id,response_header.id,request_header.id,a2.retry,a2.page_timeout
>
> FROM web_resource a1
> LEFT OUTER JOIN form_data ON (a1.web_resource_id=form_data.web_resource_id),web_resource a2
> LEFT OUTER JOIN cookie ON (a2.web_resource_id=cookie.web_resource_id),web_resource a3
> LEFT OUTER JOIN page_variable ON (a3.web_resource_id=page_variable.web_resource_id),web_resource a4
> LEFT OUTER JOIN response_header ON (a4.web_resource_id=response_header.web_resource_id),web_resource a5
> LEFT OUTER JOIN request_header ON (a5.web_resource_id=request_header.web_resource_id)
> WHERE a2.web_resource_id=a4.web_resource_id
> AND a2.web_resource_id=a1.web_resource_id
> AND a2.web_resource_id=a5.web_resource_id
> AND a2.web_resource_id=a3.web_resource_id
> AND a2.user_scenario_id = 1
>
>
> A more efficient query would be:
>
> SELECT a2.web_resource_id
> FROM web_resource a1
> LEFT OUTER JOIN form_data ON (a1.web_resource_id=form_data.web_resource_id),web_resource a2
> LEFT OUTER JOIN cookie ON (a2.web_resource_id=cookie.web_resource_id),web_resource a3
> LEFT OUTER JOIN page_variable ON (a3.web_resource_id=page_variable.web_resource_id),web_resource a4
> LEFT OUTER JOIN response_header ON (a4.web_resource_id=response_header.web_resource_id),web_resource a5
> LEFT OUTER JOIN request_header ON (a5.web_resource_id=request_header.web_resource_id)
> WHERE a2.web_resource_id=a4.web_resource_id
> AND a2.web_resource_id=a1.web_resource_id
> AND a2.web_resource_id=a5.web_resource_id
> AND a2.web_resource_id=a3.web_resource_id
> AND a2.user_scenario_id = 1
>
>
>
> Is it possible to have Castor lazy load the fields for the parent object?
>
>
> Steve
>
>
>
> -----------------------------------------------------------
> If you wish to unsubscribe from this mailing, send mail to
> [EMAIL PROTECTED] with a subject of:
> unsubscribe castor-dev
>
