[jira] Commented: (DERBY-1482) Update triggers on tables with blob columns stream blobs into memory even when the blobs are not referenced/accessed.

Rick Hillegas (JIRA) Fri, 28 May 2010 08:11:58 -0700

    [ https://issues.apache.org/jira/browse/DERBY-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873006#action_12873006 ]


Rick Hillegas commented on DERBY-1482:
--------------------------------------

Thanks for the patch, Mamta. If I understand correctly, users should expect to 
see the following behaviors:

1) No behavior change for legacy triggers created before 10.7 is released.

2) No behavior change for triggers created in soft-upgraded databases.

3) Potential performance improvement for triggers created in new 10.7 databases.

4) Potential performance improvement for triggers created in legacy databases 
after hard-upgrade to 10.7.

Before looking into the details of this patch, I would like to explore an 
alternative solution. Maybe this solution has already been considered and 
rejected. If so, I apologize for the noise. This alternative approach would 
bring the performance improvement to more cases and would avoid the 
soft-upgrade and serialization issues. I think that it would re-use most of the 
code which you are supplying with the current patch:

A) Do not change what is stored in SYSTRIGGERS.

B) Instead, the very first time that a trigger is run, if there is a 
REFERENCING clause, re-parse the trigger text in order to find the columns that 
are actually needed.

C) Store the extra referenced column information in a transient field of the 
trigger descriptor for use by later firings.

The disadvantage of this approach is that the first firing of a trigger would 
incur an extra compilation tax. I think that this tax would not be noticed.

The advantage of this approach is that the performance improvement would be 
seen in cases (1) and (2) above and not just in cases (3) and (4). In addition, 
we would avoid the tricky serialization incompatibilities.
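
For illustration only, steps (A) through (C) might look roughly like the 
following Java sketch. The class name LazyTriggerColumns, its fields, and the 
regex-based scan are hypothetical stand-ins and not Derby's actual classes; a 
real implementation would presumably run the SQL parser over the trigger text 
rather than a regular expression.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a trigger descriptor; NOT Derby's real class. */
class LazyTriggerColumns implements Serializable {
    // Step A: the persistent state is unchanged (trigger text and metadata only).
    private final String actionText;         // triggered-SQL-action as stored
    private final String correlationName;    // from the REFERENCING clause, or null
    private final List<String> tableColumns; // trigger table's columns, in order

    // Step C: transient, so it is never written to disk and cannot cause
    // serialization incompatibilities. It is recomputed when missing.
    private transient List<Integer> neededColumns;

    // Demonstration-only counter showing how often the text was re-scanned.
    transient int parseCount = 0;

    LazyTriggerColumns(String actionText, String correlationName,
                       List<String> tableColumns) {
        this.actionText = actionText;
        this.correlationName = correlationName;
        this.tableColumns = tableColumns;
    }

    /** Step B: compute the needed columns on the first firing only. */
    synchronized List<Integer> neededColumns() {
        if (neededColumns == null) {
            neededColumns = findReferencedColumns();
        }
        return neededColumns;
    }

    /** Assumption: a regex scan stands in for re-parsing the trigger text. */
    private List<Integer> findReferencedColumns() {
        parseCount++;
        List<Integer> positions = new ArrayList<>();
        if (correlationName == null) {
            return positions; // no REFERENCING clause, nothing to look for
        }
        Pattern p = Pattern.compile(
            "\\b" + Pattern.quote(correlationName) + "\\s*\\.\\s*(\\w+)",
            Pattern.CASE_INSENSITIVE);
        Set<String> names = new HashSet<>();
        Matcher m = p.matcher(actionText);
        while (m.find()) {
            names.add(m.group(1).toUpperCase(Locale.ROOT));
        }
        // Report 1-based column positions in table order.
        for (int i = 0; i < tableColumns.size(); i++) {
            if (names.contains(tableColumns.get(i).toUpperCase(Locale.ROOT))) {
                positions.add(i + 1);
            }
        }
        return positions;
    }
}
```

With the trigger from the issue description, where the action text references 
only n_row.id, this sketch would report just the first column of t1, so the 
blob column would not need to be read. The extra scan happens once per 
in-memory descriptor, which is the compilation tax mentioned above.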

Thanks,
-Rick

> Update triggers on tables with blob columns stream blobs into memory even 
> when the blobs are not referenced/accessed.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-1482
>                 URL: https://issues.apache.org/jira/browse/DERBY-1482
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.2.1.6
>            Reporter: Daniel John Debrunner
>            Assignee: Mamta A. Satoor
>            Priority: Minor
>         Attachments: derby1482_patch1_diff.txt, derby1482_patch1_stat.txt, 
> derby1482_patch2_diff.txt, derby1482_patch2_stat.txt, 
> derby1482_patch3_diff.txt, derby1482_patch3_stat.txt, 
> derby1482DeepCopyAfterTriggerOnLobColumn.java, derby1482Repro.java, 
> derby1482ReproVersion2.java, junitUpgradeTestFailureWithPatch1.out, 
> TriggerTests_ver1_diff.txt, TriggerTests_ver1_stat.txt
>
>
> Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger 
> "tr1" defined on that table, where the triggered-SQL-action for "tr1" does 
> NOT reference any of the blob columns in the table. [ Note that this is 
> different from DERBY-438 because DERBY-438 deals with triggers that _do_ 
> reference the blob column(s), whereas this issue deals with triggers that do 
> _not_ reference the blob columns--but I think they're related, so I'm 
> creating this as a subtask of DERBY-438 ]. In such a case, if the trigger is fired, 
> the blob data will be streamed into memory and thus consume JVM heap, even 
> though it (the blob data) is never actually referenced/accessed by the 
> trigger statement.
> For example, suppose we have the following DDL:
>     create table t1 (id int, status smallint, bl blob(2G));
>     create table t2 (id int, updated int default 0);
>     create trigger tr1 after update of status on t1 referencing new as n_row
>         for each row mode db2sql
>         update t2 set updated = updated + 1 where t2.id = n_row.id;
> Then if t1 and t2 both have data and we make a call to:
>     update t1 set status = 3;
> the trigger tr1 will fire, which will cause the blob column in t1 to be 
> streamed into memory for each row affected by the trigger. The result is 
> that, if the blob data is large, we end up using a lot of JVM memory when we 
> really shouldn't have to (at least, in _theory_ we shouldn't have to...).
> Ideally, Derby could figure out whether or not the blob column is referenced, 
> and avoid streaming the lob into memory whenever possible (hence this is 
> probably more of an "enhancement" request than a bug)... 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
