[ 
http://issues.apache.org/jira/browse/DERBY-326?page=comments#action_12362861 ] 

Kathey Marsden commented on DERBY-326:
--------------------------------------

I think this patch is a big improvement over reading the lobs into memory.
A few comments on future improvement and some questions on the code below:


Possible future improvements:

- for VARCHAR FOR BIT DATA (byte[] values)  we still call writeScalarStream 
with length and do not do layerBStreaming.  Would it make sense to stream this 
as well and get rid of all the code for handling extended length etc?

reencode() performance 
This method is called every 1024 characters of a character stream and 
-- creates a localBuffer char[1024] to read the character stream data
-- creates a new String(localBuffer) 
-- creates a new byte[] with the getBytes() call
-- creates a new ByteArrayInputStream

A simple optimization using your  current scheme might be to have global field  
byte[BUFFERED_CHAR_LEN] instead of making a new one  each time.  Perhaps also a 
single  OutputStreamWriter field wrapped around a ByteArrayOutputStream  which 
is reset on each reencode call could eliminate the String creation.

Long term we should consider the fact that Derby stores the data in UTF8 
format, getCharacterStream decodes it then network server reencodes it to UTF8.
DERBY-760 is meant to address this issue, plus hopefully eliminate the 
additional copy into the buffer in  DDMWriter.  I will add more comments to 
DERBY-760.



QUESTIONS on code:

DDMWriter - I always had a hard time understanding the stream code in this file 
, but found it much easier with your single loop vs the triple loop of the old 
code.  But I have some questions.

- Why do we need this ensureLength call for layerBstreaming in 
prepScalarStream? 
      ensureLength( layerBstreaming ? 
                    DEFAULT_BUFFER_SIZE - offset : 
                    length );

- Can you add a comment to explain the calculation for prepScalarStream return 
value, especially what the 6 and 4 are?
return DssConstants.MAX_DSS_LENGTH - 6 - 4 - nullIndicatorSize  - 
extendedLengthByteCount


writeScalarStream

Could you verify that my analysis of this code  is correct?
        while( !isLastSegment ){
                /*
                 * This loop writes and flushes the data. If there is more data 
than will
                 * fit in the remaining buffer space or will fit in the current 
DSS, the 
                 * data will be continued in the next DSS segment via 
flushScalarStreamSegment.
                 * 
                 */
                    int spareBufferLength = bytes.length - offset;
                [snip more of writeScalarStream ]
            
                        
                        

                        // Adjust spareDssLength for 2 byte continuation flags
                        //written by flushScalarStreamSegment
                        if( ! isLastSegment )
                            spareDssLength = DssConstants.MAX_DSS_LENGTH - 2;

                        }
                    
                }               

> Improve streaming of large objects for network server and client
> ----------------------------------------------------------------
>
>          Key: DERBY-326
>          URL: http://issues.apache.org/jira/browse/DERBY-326
>      Project: Derby
>         Type: Improvement
>   Components: Network Client, Performance, Network Server
>     Reporter: Kathey Marsden
>     Assignee: Tomohito Nakayama
>  Attachments: DERBY-326.patch, DERBY-326_2.patch, DERBY-326_3.patch, 
> DERBY-326_4.patch, DERBY-326_5.patch
>
> Currently the stream writing  methods in network server and client require a  
> length parameter. This means that we have to get the length of the stream 
> before sending it. For example in network server in EXTDTAInputStream we have 
> to use getString and getbytes() instead of getCharacterStream and 
> getBinaryStream so that we can get the  length.
> SQLAM Level 7 provides for the enhanced LOB processing to allow streaming 
> without indicating the length, so, the writeScalarStream methods in
> network server DDMWriter.java and network client Request.java can be changed 
> to not require a length.
> Code inspection of these methods seems to indicate that while the length is 
> never written it is used heavily in generating the DSS. One strange thing is 
> that it appears on error, the stream is padded out to full length with zeros, 
> but an actual exception is never sent.  Basically I think perhaps these 
> methods need to be rewritten from scratch based on the spec requirements for 
> lobs.
> After the writeScalarStream methods have been changed, then EXTDAInputStream 
> can be changed to properly stream LOBS. See TODO tags in this file for more 
> info.  I am guessing similar optimizations available in the client as well, 
> but am not sure where that code is.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Reply via email to