[jira] Updated: (DERBY-3769) Make LOBStoredProcedure on the server side smarter about the read buffer size

Kristian Waagan (JIRA) Tue, 07 Oct 2008 05:16:08 -0700

     [ 
https://issues.apache.org/jira/browse/DERBY-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Kristian Waagan updated DERBY-3769:
-----------------------------------

    Attachment: derby-3769-2a-clob_buffer_size_adjustment.diff

Patch 2a adjusts the maximum return size in characters for the CLOB stored 
procedure to 10890 (DB2_VARCHAR_MAXWIDTH / 3). As a result, anywhere from 
10890 to 10890*3 (32670) bytes may be returned to the client in one 
round-trip, depending on the bytes-per-char ratio (determined by the modified 
UTF-8 encoding).
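
To illustrate the arithmetic, here is a minimal sketch (not code from the 
patch). The class and method names are made up for the example, and the 
constant is assumed to mirror Limits.DB2_VARCHAR_MAXWIDTH. It computes the 
character limit and the resulting modified UTF-8 byte count for an ASCII-only 
chunk and a CJK-only chunk.

```java
import java.util.Arrays;

final class ClobChunkSize {
    // Assumption: same value as Limits.DB2_VARCHAR_MAXWIDTH in Derby.
    static final int DB2_VARCHAR_MAXWIDTH = 32672;
    // Worst case is 3 bytes per char, so cap the char count at a third.
    static final int MAX_CHARS_PER_CALL = DB2_VARCHAR_MAXWIDTH / 3;

    /** Number of bytes the string occupies in modified UTF-8. */
    static int modifiedUtf8Length(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                bytes += 1;          // plain ASCII (except NUL)
            } else if (c <= 0x07FF) {
                bytes += 2;          // includes NUL, encoded as two bytes
            } else {
                bytes += 3;          // rest of the BMP and each surrogate
            }
        }
        return bytes;
    }

    /** Hypothetical helper: a string of n copies of c. */
    static String fill(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        String ascii = fill('a', MAX_CHARS_PER_CALL);
        String cjk = fill('\u4e2d', MAX_CHARS_PER_CALL);
        System.out.println("max chars per call: " + MAX_CHARS_PER_CALL);
        System.out.println("ASCII chunk bytes:  " + modifiedUtf8Length(ascii));
        System.out.println("CJK chunk bytes:    " + modifiedUtf8Length(cjk));
    }
}
```

Even in the worst case (3 bytes per char) the chunk stays just below the 
32672 byte limit, at the cost of smaller chunks for ASCII-only data.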

Even though this fix isn't optimal, the advantages outweigh the disadvantages 
in my opinion.
I did a simple test, where I used a 32K buffer size in the client code to 
retrieve a CLOB of 32M chars consisting of CJK chars (3 bytes per char).
With the fix it took around 17 seconds; without it, it took almost 3400 
seconds! In both cases a patch for DERBY-3825 was applied.
I also did a test with a 32MB CLOB containing ASCII characters, where I saw a 
performance reduction of around 3% (the test was run on a LAN; the reduction 
will increase on higher-latency networks).

If you want to test performance yourself, you must first apply the patch for 
DERBY-3825 (2a). The problems are described under DERBY-3766.

Patch ready for review.

> Make LOBStoredProcedure on the server side smarter about the read buffer size
> -----------------------------------------------------------------------------
>
>                 Key: DERBY-3769
>                 URL: https://issues.apache.org/jira/browse/DERBY-3769
>             Project: Derby
>          Issue Type: Improvement
>          Components: Network Server
>    Affects Versions: 10.3.3.0, 10.4.1.3, 10.5.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>             Fix For: 10.4.2.1, 10.5.0.0
>
>         Attachments: derby-3769-1a-buffer_size_adjustment.diff, 
> derby-3769-1b-buffer_size_adjustment.diff, 
> derby-3769-2a-clob_buffer_size_adjustment.diff
>
>
> Derby has a max length for VARBINARY and VARCHAR, which is 32'672 bytes or 
> characters (see Limits.DB2_VARCHAR_MAXWIDTH).
> When working with LOBs represented by locators, using a read buffer larger 
> than the max value causes the server to process far more data than necessary.
> Say the read buffer is 33'000 bytes, and these bytes are requested by the 
> client. This request ends up in LOBStoredProcedure.BLOBGETBYTES.
> Assume the stream position is 64'000, and this is where we want to read from. 
> The following happens:
>  a) BLOBGETBYTES instructs EmbedBlob to read 33'000 bytes, advancing the 
> stream position to 97'000.
>  b) Derby fetches/receives the 33'000 bytes, but can only send 32'672. The 
> rest of the data (328 bytes) is discarded.
>  c) The client receives the 32'672 bytes, recalculates the position and 
> length arguments and sends another request.
>  d) BLOBGETBYTES(locator, 96672, 328) is executed. EmbedBlob detects that the 
> stream position has advanced too far, so it resets the stream to position 
> zero and skips/reads until position 96'672 has been reached.
>  e) The remaining 328 bytes are sent to the client.
> This issue deals with points b) and d), by avoiding the need to reset the 
> stream.
> Points a) and e) are also problematic if a large number of bytes are going to 
> be read, say hundreds of megabytes, but that's another issue.
> It is unfortunate that using 32 K (32 * 1024) as the buffer size is almost 
> the worst case; 32'768 - 32'672 = 96 bytes.
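
The example in steps a) to e) of the quoted description can be sketched as 
follows. This is only an illustration of the idea of never reading more than 
can be returned in one call; it is not the code from the attached patches. 
The class name, the method name and the constant are assumptions made for 
the example (the constant is assumed to mirror Limits.DB2_VARCHAR_MAXWIDTH).

```java
import java.util.ArrayList;
import java.util.List;

final class LobReadPlan {
    // Assumption: same value as Limits.DB2_VARCHAR_MAXWIDTH in Derby.
    static final int MAX_RETURN_BYTES = 32672;

    /**
     * Splits one logical read into {position, length} calls, each at most
     * MAX_RETURN_BYTES long, so every call starts where the previous ended.
     */
    static List<long[]> plan(long position, long length) {
        List<long[]> calls = new ArrayList<>();
        long pos = position;
        long remaining = length;
        while (remaining > 0) {
            long chunk = Math.min(remaining, MAX_RETURN_BYTES);
            calls.add(new long[] {pos, chunk});
            pos += chunk;
            remaining -= chunk;
        }
        return calls;
    }

    public static void main(String[] args) {
        // The case from the description: 33'000 bytes from position 64'000.
        for (long[] call : plan(64000, 33000)) {
            System.out.println("read(pos=" + call[0] + ", len=" + call[1] + ")");
        }
    }
}
```

For the case above this yields a read of 32672 bytes at position 64000 
followed by a read of 328 bytes at position 96672. Because the first read 
stops at 96672 instead of 97000, the second read continues from the current 
stream position and no reset to position zero is needed.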

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
