[ 
https://issues.apache.org/jira/browse/DERBY-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576651#comment-13576651
 ] 

Knut Anders Hatlen commented on DERBY-5752:
-------------------------------------------

I had forgotten about this...

Now when I rerun the tests, I am not able to reproduce the big difference I saw 
in BlobClob4BlobTest in the first test run. I do still see a difference, but 
it's more like 165 seconds vs 180 seconds for the full BlobClob4BlobTest. As 
before, it looks like the entire difference is caused by 
testPositionAgressive() in an encrypted database, which slowed down from 7 
seconds to 23 seconds in my environment. There is no difference in that test 
case on unencrypted databases.

The test case in question inserts a number of CLOBs, some of which are greater 
than the 32k limit for materialization, into a table. However, the query that 
reads the CLOBs is ordered on one of the non-CLOB columns, and the sorting 
materializes all the columns in the result. It eventually scans through the 
fetched CLOBs using Clob.position().
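
For illustration, here is a minimal JDBC sketch of that access pattern against an
embedded Derby database (the table, column names and data are made up; the 32k
figure refers to the materialization threshold mentioned above):

import java.sql.Clob;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;

public class ClobPositionScan {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:derby:memory:d5752demo;create=true")) {
            try (Statement s = conn.createStatement()) {
                s.execute("CREATE TABLE docs (id INT, body CLOB)");
            }
            try (PreparedStatement ps =
                     conn.prepareStatement("INSERT INTO docs VALUES (?, ?)")) {
                for (int i = 0; i < 10; i++) {
                    // Some values below and some well above the 32k threshold.
                    char[] chars = new char[i % 2 == 0 ? 1_000 : 100_000];
                    Arrays.fill(chars, 'x');
                    ps.setInt(1, i);
                    ps.setString(2, new String(chars));
                    ps.executeUpdate();
                }
            }
            // The ORDER BY on the non-CLOB column makes the sorter
            // materialize every column in the result, CLOBs included.
            try (Statement s = conn.createStatement();
                 ResultSet rs = s.executeQuery(
                         "SELECT id, body FROM docs ORDER BY id")) {
                while (rs.next()) {
                    Clob clob = rs.getClob("body");
                    // For values above the threshold, each position() call
                    // has to go back to the backing temporary file (and
                    // decrypt it if the database is encrypted).
                    System.out.println(rs.getInt("id") + ": "
                            + clob.position("xxx", 1));
                }
            }
        }
    }
}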

The performance difference is seen because the java.sql.Clob objects fetched 
from the result set are no longer fully materialized in memory with the patch, 
unless they are smaller than 32k. For the big objects, this means that each 
call to Clob.position() will have to read temporary files and decrypt the 
contents in order to search for the substring. Without the patch, the entire 
value would live unencrypted in memory, which makes position() a much cheaper 
operation.

I think this is an expected difference, and that it is acceptable since the 
CLOB wasn't supposed to be materialized in this scenario in the first place. Of 
course, the current limit for materialization might not be optimal for all 
applications, as materialization indeed could improve performance of some 
operations if the system has enough memory. Increasing the limit or making it 
tunable might be a useful improvement, but it's outside the scope of this issue.
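
As a side note, an application that searches the same large CLOB repeatedly can
do the materialization itself when it knows the value fits comfortably in memory.
A rough sketch (not part of the patch, and the helper name is made up):

// Not part of this change; just an illustration that the application can
// materialize the value once, instead of paying for a temp-file read (and
// decryption) on every Clob.position() call.
static int countOccurrences(java.sql.Clob clob, String needle)
        throws java.sql.SQLException {
    // One pass through the backing store, then plain in-memory searching.
    String text = clob.getSubString(1, (int) clob.length());
    int hits = 0;
    for (int i = text.indexOf(needle); i >= 0; i = text.indexOf(needle, i + 1)) {
        hits++;
    }
    return hits;
}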
                
> LOBStreamControl should materialize less aggressively
> -----------------------------------------------------
>
>                 Key: DERBY-5752
>                 URL: https://issues.apache.org/jira/browse/DERBY-5752
>             Project: Derby
>          Issue Type: Improvement
>          Components: JDBC
>    Affects Versions: 10.9.1.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>         Attachments: buffsize.diff, d5752-1a.diff
>
>
> The constructor LOBStreamControl(EmbedConnection, byte[]) always makes the 
> buffer size equal to the LOB size, effectively creating an extra, fully 
> materialized copy of the LOB in memory.
> I think the assumption here is that a LOB that's already materialized is a 
> small one. That is, LOBs that are smaller than 32 KB and fit in a single page 
> are typically materialized when read from store. However, we sometimes 
> materialize LOBs that are a lot bigger than 32 KB. For example, triggers that 
> access LOBs may materialize them regardless of size (see comment in 
> DMLWriteResultSet's constructor for details). For these large LOBs, it sounds 
> unreasonable to allocate a buffer of the same size as the LOB itself.
> I'd suggest that we change the constructor so that it never allocates a 
> buffer larger than 32KB. That would mean that the behaviour is preserved for 
> all LOBs fetched directly from store (only LOBs that don't fit in a single 
> page will cause temporary files to be created), whereas we'll prevent large 
> LOBs accessed by triggers from being duplicated in memory by overflowing to 
> temporary files.
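
For illustration only, a self-contained sketch of the buffering idea described
in the issue (this is not the actual LOBStreamControl code, and all names here
are made up; the real change is in the attached d5752-1a.diff):

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Illustrative only: keep small LOBs in memory, spill large ones to disk. */
class BoundedLobBuffer {
    private static final int MAX_BUF_SIZE = 32 * 1024; // 32 KB threshold

    private final byte[] memory;   // non-null for LOBs <= 32 KB
    private final File overflow;   // non-null for LOBs  > 32 KB

    BoundedLobBuffer(byte[] data) throws IOException {
        if (data.length <= MAX_BUF_SIZE) {
            // Small, already-materialized LOB: keep a single in-memory copy.
            memory = data.clone();
            overflow = null;
        } else {
            // Large LOB: write it to a temporary file instead of holding
            // a second full copy in memory.
            memory = null;
            overflow = File.createTempFile("lob", ".tmp");
            overflow.deleteOnExit();
            try (OutputStream out = new FileOutputStream(overflow)) {
                out.write(data);
            }
        }
    }
}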

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
