[
https://issues.apache.org/jira/browse/ODE-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karthick Sankarachary updated ODE-694:
--------------------------------------
Attachment: migration-step-c-delete-large-data.sql
migration-step-b-copy-large-data.sql
migration-step-a-upgrade-schema.sql
The upgrade path, including the scripts attached, need to be release noted.
> De-Normalizing Large Data
> -------------------------
>
> Key: ODE-694
> URL: https://issues.apache.org/jira/browse/ODE-694
> Project: ODE
> Issue Type: Improvement
> Components: Axis2 Integration, BPEL Runtime
> Affects Versions: 1.3.3
> Reporter: Karthick Sankarachary
> Assignee: Karthick Sankarachary
> Fix For: 1.3.4
>
> Attachments: denormalizing-large-data.patch,
> migration-step-a-upgrade-schema.sql, migration-step-b-copy-large-data.sql,
> migration-step-c-delete-large-data.sql
>
>
> Currently, in the hibernate implementation of the process data access object
> (DAO) interface, all of the large (read blob) values is stored not in the
> table where it belongs, but rather in a detached table called LARGE_DATA.
> Examples of such dependent tables include those that hold the state of BPEL
> instances, BPEL events, SOAP messages, WSDL partner links, and XML variables,
> among other things. Inevitably, the LARGE_DATA table ends up becoming the
> bottleneck, because it forces us to not only execute a large number of joins
> but also hold that many more locks. As a result, the (hibernate) DAO layer
> takes longer to read/write/delete process data, and may potentially deadlock
> on the LARGE_DATA table.
> The obvious way out of this mess is to move the blob column from the
> LARGE_DATA table to the table where it is currently referenced through a
> foreign key. However, care must be taken to migrate the schema and data of
> existing servers at the time of upgrade. The upgrade path is described below,
> where the dependent table refers to the table that currently has a foreign
> key reference into the parent (i.e., LARGE_DATA) table:
> a) For each such foreign key in the dependent table, add the corresponding
> blob column(s) in the dependent table.
> b) For each such foreign key in the dependent table, copy the blob value from
> the corresponding row of the parent into the corresponding column of the
> dependent that was added in step (a).
> c) Drop the foreign keys in the dependent table that refer to the LARGE_DATA
> table, and the LARGE_DATA table itself. Finally, increment the version of the
> ODE schema (to indicate that the schema has been changed).
> Needless to say, we must be prepared for scenarios wherein the server was
> upgraded but the schema wasn't (for whatever reason). We do so by checking
> the ODE schema version at the time of server startup, and failing gracefully
> if it doesn't match the expected value. Note that we consciously chose not to
> automate the upgrade path as part of the migration handler, primarily due to
> the long-running nature of the transaction.
> As a result of this change, we observed a significant improvement in the
> performance of the hibernate-based process server (between 30-40%). However,
> individual results may vary.
> Note that the downside to moving the blob column into the dependent table is
> that we may inadvertently end up reading the blob property as a side-effect
> of an unrelated query on that table. As you may have guessed, that was the
> motivation for introducing the LARGE_DATA table in the first place.
> Fortunately, there are ways to mitigate against such cases, which include (a)
> using lazy fetching of the blob properties in the problematic dependent
> table, or (b) re-introducing a large data table specifically for the
> problematic dependent table, and using join fetching to work around the N+1
> select problem. We plan on implementing such optimizations on a case-by-case
> basis, if and when required.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.