Hello.

The analysis in my previous mail seems to have been wrong.


What seems to be wrong is as follows.
1:
A buffer, though not so large, was already used when reading the stream from the blob in DDMWriter. Therefore, the absence of a buffer may not be the reason for the drop in performance of blob streaming.

2:
I added another buffer when reading the blob.
I found that the improvement was only slight, and streaming was not faster than before the patch.
Moreover, in some cases, adding the buffer actually dropped the performance.
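The extra buffer in this experiment was essentially a wrapper layered over the blob stream. A minimal sketch of that idea follows; the class and method names are illustrative only, not actual Derby code:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedBlobRead {
    // Hypothetical stand-in for the stream retrieved from the blob.
    static InputStream openBlobStream() {
        return new ByteArrayInputStream(new byte[64 * 1024]);
    }

    // Drain the stream in chunks and return the number of bytes read.
    static long readAll(InputStream in) throws IOException {
        byte[] chunk = new byte[8 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
        }
        in.close();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Wrap the raw blob stream in another buffer, as in the experiment.
        InputStream in = new BufferedInputStream(openBlobStream(), 32 * 1024);
        System.out.println(readAll(in)); // prints 65536
    }
}
```

If the underlying stream already delivers data in large chunks, an extra buffer layer mostly adds copying, which would be consistent with the small drops observed in some cases.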

3:
I profiled the execution before and after the patch, using PerfAnal (http://java.sun.com/developer/technicalArticles/Programming/perfanal/). I found that the DDMWriter.finalizeChain method consumes much time in the DRDAConnThread.processCommands method after the patch, and
this phenomenon was not found before the patch.

I think there is some problem here ....
I will investigate what is going on in DDMWriter.finalizeChain after the patch.


Best regards.



Tomohito Nakayama (JIRA) wrote:

[ http://issues.apache.org/jira/browse/DERBY-326?page=comments#action_12370505 ]
Tomohito Nakayama commented on DERBY-326:
-----------------------------------------

I have measured the performance of streaming.

The measurement was as follows.
The following 3 types of test execution were measured for both blob and clob, before and
after applying the patch.

type1 std
Same as shortrun in DERBY-872

type2 stability
Same as longrun in DERBY-872

type3 volume
A larger file was used in the shortrun test of DERBY-872.
The file size was 2.5 MB for clob and 5 MB for blob.
* In this test, I found a problem: the test did not finish if the file was larger
than these sizes.
  I have not yet investigated why the streaming did not finish.
  However, since this problem was found both before and after the patch, I think I
can ignore this phenomenon for now...

Next is the result:
==> ./before/blob/std/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=207ms
==> ./before/blob/stability/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=6418ms
==> ./before/blob/volume/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=806ms

==> ./after/blob/std/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=173ms
==> ./after/blob/stability/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=6453ms
==> ./after/blob/volume/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=992ms


==> ./before/clob/std/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=1511ms
==> ./before/clob/stability/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=20968ms
==> ./before/clob/volume/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=10986ms

==> ./after/clob/std/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=594ms
==> ./after/clob/stability/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=20235ms
==> ./after/clob/volume/result.txt <==
Avg time taken to read nullrows+ (ignore first run )=1963ms


Improvement was found in all of the clob streaming tests.
However, no improvement was found in blob streaming other than in the std test.
On the contrary, even small drops in performance were found in the stability and
volume tests.


I interpret this result as follows.

Before the patch, the stream retrieved from the blob was buffered entirely in memory.
After the patch, no buffer is used when the blob stream is read; the stream retrieved from the blob is read directly.

Not using a buffer when reading the blob stream may be the reason why blob
streaming was not improved.
In the case of clob, ReEncodedStream is used when the clob stream is read.
I think ReEncodedStream happens to work as a buffer, resulting in the improvement
of clob streaming.

I think buffering is needed when reading the blob stream too.

Based on this consideration, I will implement code that buffers *each segment* of the stream retrieved from the blob.
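A minimal sketch of what buffering each segment could look like; the names here are illustrative only (the real change would live in DDMWriter / EXTDTAInputStream), and only one segment-sized byte[] is held in memory at a time instead of the whole blob:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class SegmentCopy {
    // Copy the stream in fixed-size segments, reusing a single
    // segment buffer rather than materializing the entire blob.
    static long copyInSegments(InputStream in, OutputStream out,
                               int segmentSize) throws IOException {
        byte[] segment = new byte[segmentSize];
        long total = 0;
        int n;
        while ((n = in.read(segment)) != -1) {
            out.write(segment, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream blob = new ByteArrayInputStream(new byte[5 * 1024]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(copyInSegments(blob, sink, 1024)); // prints 5120
    }
}
```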
Improve streaming of large objects for network server and client
----------------------------------------------------------------

        Key: DERBY-326
        URL: http://issues.apache.org/jira/browse/DERBY-326
    Project: Derby
       Type: Improvement
 Components: Network Server, Network Client, Performance
   Reporter: Kathey Marsden
   Assignee: Tomohito Nakayama
Attachments: ClobTest.zip, DERBY-326.patch, DERBY-326_2.patch, 
DERBY-326_3.patch, DERBY-326_4.patch, DERBY-326_5.patch, 
DERBY-326_5_indented.patch, DERBY-326_6.patch, 
ReEncodedInputStream.java.modifiedForLongRun

Currently the stream writing methods in network server and client require a
length parameter. This means that we have to get the length of the stream
before sending it. For example, in the network server in EXTDTAInputStream we have
to use getString and getBytes() instead of getCharacterStream and
getBinaryStream so that we can get the length.
SQLAM Level 7 provides for the enhanced LOB processing to allow streaming 
without indicating the length, so, the writeScalarStream methods in
network server DDMWriter.java and network client Request.java can be changed to 
not require a length.
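As a rough illustration of writing a stream without knowing its length up front, one can send length-prefixed segments with an end marker. This is only a stand-in for the DSS continuation mechanism described above, not the actual DRDA wire format:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class LengthlessWrite {
    // Illustrative only: each chunk carries its own length, and a
    // zero-length chunk marks end-of-stream, so the total length
    // never needs to be known in advance.
    static void writeChunked(InputStream in, DataOutputStream out)
            throws IOException {
        byte[] buf = new byte[8 * 1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.writeInt(n);          // chunk length prefix
            out.write(buf, 0, n);
        }
        out.writeInt(0);              // end-of-stream marker
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeChunked(new ByteArrayInputStream(new byte[10000]),
                     new DataOutputStream(sink));
        // 10000 payload bytes + two 4-byte prefixes + 4-byte end marker
        System.out.println(sink.size()); // prints 10012
    }
}
```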
Code inspection of these methods seems to indicate that while the length is
never written, it is used heavily in generating the DSS. One strange thing is
that on error, the stream appears to be padded out to full length with zeros,
but an actual exception is never sent. Basically I think perhaps these methods
need to be rewritten from scratch based on the spec requirements for LOBs.
After the writeScalarStream methods have been changed, EXTDTAInputStream
can be changed to properly stream LOBs. See the TODO tags in this file for more
info. I am guessing similar optimizations are available in the client as well, but
I am not sure where that code is.


--
/*

       Tomohito Nakayama
       [EMAIL PROTECTED]
       [EMAIL PROTECTED]
       [EMAIL PROTECTED]

       Naka
       http://www5.ocn.ne.jp/~tomohito/TopPage.html

*/
