[
http://issues.apache.org/jira/browse/DERBY-326?page=comments#action_12362861 ]
Kathey Marsden commented on DERBY-326:
--------------------------------------
I think this patch is a big improvement over reading the lobs into memory.
A few comments on future improvement and some questions on the code below:
Possible future improvements:
- for VARCHAR FOR BIT DATA (byte[] values) we still call writeScalarStream
with length and do not do layerBStreaming. Would it make sense to stream this
as well and get rid of all the code for handling extended length etc?
reencode() performance
This method is called every 1024 characters of a character stream and
-- creates a localBuffer char[1024] to read the character stream data
-- creates a new String(localBuffer)
-- creates a new byte[] with the getBytes() call
-- creates a new ByteArrayInputStream
A simple optimization using your current scheme might be to have global field
byte[BUFFERED_CHAR_LEN] instead of making a new one each time. Perhaps also a
single OutputStreamWriter field wrapped around a ByteArrayOutputStream which
is reset on each reencode call could eliminate the String creation.
Long term we should consider the fact that Derby stores the data in UTF8
format, getCharacterStream decodes it then network server reencodes it to UTF8.
DERBY-760 is meant to address this issue, plus hopefully eliminate the
additional copy into the buffer in DDMWriter. I will add more comments to
DERBY-760.
QUESTIONS on code:
DDMWriter - I always had a hard time understanding the stream code in this file
, but found it much easier with your single loop vs the triple loop of the old
code. But I have some questions.
- Why do we need this ensureLength call for layerBstreaming in
prepScalarStream?
ensureLength( layerBstreaming ?
DEFAULT_BUFFER_SIZE - offset :
length );
- Can you add a comment to explain the calculation for prepScalarStream return
value, especially what the 6 and 4 are?
return DssConstants.MAX_DSS_LENGTH - 6 - 4 - nullIndicatorSize -
extendedLengthByteCount
writeScalarStream
Could you verify that my analysis of this code is correct?
while( !isLastSegment ){
/*
* This loop writes and flushes the data. If there is more data
than will
* fit in the remaining buffer space or will fit in the current
DSS, the
* data will be continued in the next DSS segment via
flushScalarStreamSegment.
*
*/
int spareBufferLength = bytes.length - offset;
[snip more of writeScalarStream ]
// Adjust spareDssLength for 2 byte continuation flags
//written by flushScalarStreamSegment
if( ! isLastSegment )
spareDssLength = DssConstants.MAX_DSS_LENGTH - 2;
}
}
> Improve streaming of large objects for network server and client
> ----------------------------------------------------------------
>
> Key: DERBY-326
> URL: http://issues.apache.org/jira/browse/DERBY-326
> Project: Derby
> Type: Improvement
> Components: Network Client, Performance, Network Server
> Reporter: Kathey Marsden
> Assignee: Tomohito Nakayama
> Attachments: DERBY-326.patch, DERBY-326_2.patch, DERBY-326_3.patch,
> DERBY-326_4.patch, DERBY-326_5.patch
>
> Currently the stream writing methods in network server and client require a
> length parameter. This means that we have to get the length of the stream
> before sending it. For example in network server in EXTDTAInputStream we have
> to use getString and getbytes() instead of getCharacterStream and
> getBinaryStream so that we can get the length.
> SQLAM Level 7 provides for the enhanced LOB processing to allow streaming
> without indicating the length, so, the writeScalarStream methods in
> network server DDMWriter.java and network client Request.java can be changed
> to not require a length.
> Code inspection of these methods seems to indicate that while the length is
> never written it is used heavily in generating the DSS. One strange thing is
> that it appears on error, the stream is padded out to full length with zeros,
> but an actual exception is never sent. Basically I think perhaps these
> methods need to be rewritten from scratch based on the spec requirements for
> lobs.
> After the writeScalarStream methods have been changed, then EXTDAInputStream
> can be changed to properly stream LOBS. See TODO tags in this file for more
> info. I am guessing similar optimizations available in the client as well,
> but am not sure where that code is.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira