[ https://issues.apache.org/jira/browse/DERBY-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576651#comment-13576651 ]
Knut Anders Hatlen commented on DERBY-5752:
-------------------------------------------
I had forgotten about this...
Now when I rerun the tests, I am not able to reproduce the big difference I saw
in BlobClob4BlobTest in the first test run. I do still see a difference, but
it's more like 165 seconds vs 180 seconds for the full BlobClob4BlobTest. As
before, it looks like the entire difference is caused by
testPositionAgressive() in an encrypted database, which slowed down from 7
seconds to 23 seconds in my environment. There is no difference in that test
case on unencrypted databases.
The test case in question inserts a number of CLOBs, some of which are greater
than the 32k limit for materialization, into a table. However, the query that
reads the CLOBs is ordered on one of the non-CLOB columns, and the sorting
materializes all the columns in the result. It eventually scans through the
fetched CLOBs using Clob.position().
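
As a rough illustration of the scanning step, here is a minimal sketch of walking a CLOB with repeated Clob.position() calls. It is not the test's actual code: the class and method names are made up for this example, and the JDK's SerialClob stands in for the Clob that a Derby result set would return.

```java
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

// Sketch only: SerialClob (from the JDK) stands in for a Clob fetched from Derby.
class ClobPositionScan {

    /** Counts matches of pattern by calling Clob.position() repeatedly. */
    static int countOccurrences(Clob clob, String pattern) {
        try {
            int count = 0;
            long start = 1; // Clob offsets are 1-based
            while (start <= clob.length()) {
                long hit = clob.position(pattern, start);
                if (hit == -1) {
                    break; // no further match
                }
                count++;
                start = hit + 1; // resume just past the start of the previous hit
            }
            return count;
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical helper that builds an in-memory Clob for the demo. */
    static Clob clobOf(String s) {
        try {
            return new SerialClob(s.toCharArray());
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Clob clob = clobOf("xxNEEDLExxNEEDLExx");
        System.out.println(countOccurrences(clob, "NEEDLE"));
    }
}
```

Every pass through the loop is a separate position() call, so the cost of a single call is multiplied by the number of matches.
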
The performance difference is seen because the java.sql.Clob objects fetched
from the result set are no longer fully materialized in memory with the patch,
unless they are smaller than 32k. For the big objects, this means that each
call to Clob.position() will have to read temporary files and decrypt the
contents in order to search for the substring. Without the patch, the entire
value would live unencrypted in memory, which makes position() a much cheaper
operation.
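
The following toy model is only meant to make the cost argument concrete. It is not how Derby stores or encrypts LOBs: the class name is invented, a single-byte XOR stands in for real encryption, and for brevity the sketch reads the whole temporary file on each call where a real implementation would stream it. The point it shows is that every position() call pays for the file read and the decryption again.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Toy model: a value spilled to an "encrypted" temporary file.
class SpilledValueSketch {
    private static final byte KEY = 0x5A; // toy XOR key, a stand-in for real encryption
    private final Path file;
    long bytesDecrypted = 0; // running total of bytes read and decrypted

    SpilledValueSketch(String value) {
        try {
            file = Files.createTempFile("lob-sketch", ".tmp");
            file.toFile().deleteOnExit();
            Files.write(file, xor(value.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // XOR is its own inverse, so the same method encrypts and decrypts.
    private static byte[] xor(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (byte) (in[i] ^ KEY);
        }
        return out;
    }

    /** Returns the 1-based position of pattern at or after start, or -1. */
    long position(String pattern, long start) {
        byte[] plain;
        try {
            plain = xor(Files.readAllBytes(file)); // file I/O plus decryption on every call
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        bytesDecrypted += plain.length;
        String text = new String(plain, StandardCharsets.ISO_8859_1);
        int idx = text.indexOf(pattern, (int) start - 1);
        return idx < 0 ? -1 : idx + 1;
    }

    public static void main(String[] args) {
        SpilledValueSketch value = new SpilledValueSketch("aaaNEEDLEbbb");
        System.out.println(value.position("NEEDLE", 1));
        System.out.println(value.position("NEEDLE", 5));
        System.out.println(value.bytesDecrypted);
    }
}
```

An in-memory value would answer the same two calls without touching the file or the cipher at all, which is the difference described above.
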
I think this is an expected difference, and that it is acceptable since the
CLOB wasn't supposed to be materialized in this scenario in the first place. Of
course, the current limit for materialization might not be optimal for all
applications, as materialization indeed could improve performance of some
operations if the system has enough memory. Increasing the limit or making it
tunable might be a useful improvement, but it's outside the scope of this issue.
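
For what it's worth, a tunable limit could look roughly like the sketch below. Everything in it is an assumption made for illustration: Derby defines no such property, the property name and class are made up, and the default reads "32k" as 32 * 1024 bytes. The bufferSize() method mirrors the cap suggested in the issue description (never allocate a buffer larger than the limit).

```java
// Sketch of a configurable materialization limit; not Derby code.
class MaterializationLimitSketch {
    // Assumption: "32k" is taken to mean 32 * 1024 bytes.
    static final int DEFAULT_LIMIT = 32 * 1024;

    // Hypothetical property name; Derby does not define it.
    static final String PROP = "example.lob.materializationLimit";

    /** Reads the limit from a system property, falling back to the default. */
    static int limit() {
        int configured = Integer.getInteger(PROP, DEFAULT_LIMIT);
        return configured > 0 ? configured : DEFAULT_LIMIT;
    }

    /** In-memory buffer size for a LOB of the given length, capped at the limit. */
    static int bufferSize(long lobLength) {
        return (int) Math.min(lobLength, limit());
    }

    /** True if a LOB of this length would fit entirely in the in-memory buffer. */
    static boolean staysInMemory(long lobLength) {
        return lobLength <= limit();
    }

    public static void main(String[] args) {
        System.out.println(bufferSize(100));
        System.out.println(bufferSize(1_000_000));
        System.out.println(staysInMemory(1_000_000));
    }
}
```

Running it with -Dexample.lob.materializationLimit=65536 would raise the cap for that JVM, trading memory for cheaper operations on mid-sized LOBs.
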
> LOBStreamControl should materialize less aggressively
> -----------------------------------------------------
>
> Key: DERBY-5752
> URL: https://issues.apache.org/jira/browse/DERBY-5752
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.9.1.0
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Attachments: buffsize.diff, d5752-1a.diff
>
>
> The constructor LOBStreamControl(EmbedConnection, byte[]) always makes the
> buffer size equal to the LOB size, effectively creating an extra, fully
> materialized copy of the LOB in memory.
> I think the assumption here is that a LOB that's already materialized is a
> small one. That is, LOBs that are smaller than 32 KB and fit in a single page
> are typically materialized when read from store. However, we sometimes
> materialize LOBs that are a lot bigger than 32 KB. For example, triggers that
> access LOBs may materialize them regardless of size (see comment in
> DMLWriteResultSet's constructor for details). For these large LOBs, it sounds
> unreasonable to allocate a buffer of the same size as the LOB itself.
> I'd suggest that we change the constructor so that it never allocates a
> buffer larger than 32KB. That would mean that the behaviour is preserved for
> all LOBs fetched directly from store (only LOBs that don't fit in a single
> page will cause temporary files to be created), whereas we'll prevent large
> LOBs accessed by triggers from being duplicated in memory by overflowing to
> temporary files.