Kristian Waagan wrote:
> Hello,
>
> I'm working on DERBY-1417; adding new lengthless overloads to the
> streaming API. So far, I have only been looking at implementing this in
> the embedded driver. Based on some comments in the code, I have a few
> questions and observations regarding truncation of trailing blanks in
> the various character data types.
>
> Type          Trail. blank trunc.   Where
> ====================================================================
> CHAR          allowed               SQLChar.normalize
> VARCHAR       allowed               SQLVarchar.normalize
> LONG VARCHAR  disallowed            SQLLongVarchar.normalize
> CLOB          allowed               streaming or
>                                     SQLVarchar.normalize, depending
>                                     on source.
>
> As can be seen, only data for CLOB is truncated for trailing blanks in
> the streaming class. We must still read all the data, or at least as
> much as we need to know the insertion will fail, but we don't have to
> store it all in memory.
>
> Truncation of trailing blanks is not allowed at all for LONG VARCHAR
> (according to code comments and bug 5592 - I haven't found the place
> this is stated in the specs).
>
> My question is: should we do the truncation check for CHAR and VARCHAR
> at the streaming level as well?
> If we don't add this feature, inserting a 10GB file into a VARCHAR
> field by mistake will cause 10GB to be loaded into memory even though
> the max size allowed is ~32K, possibly causing out-of-memory errors.
> The error could be generated at an earlier stage (possibly after
> reading ~32K + 1 bytes).
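Kristian's early-failure idea could be sketched as a Reader wrapper that fails as soon as the stream exceeds the column's maximum width with a non-blank character, while still tolerating excess trailing blanks (which may legally be truncated for CHAR/VARCHAR/CLOB). This is an illustrative sketch only; the class and names below are hypothetical, not Derby internals such as ReaderToUTF8.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical sketch: fail fast once the stream exceeds maxWidth,
// unless the excess characters are all blanks (truncatable).
class BoundedBlankTruncatingReader extends Reader {
    private final Reader in;
    private final int maxWidth;
    private long charsRead;

    BoundedBlankTruncatingReader(Reader in, int maxWidth) {
        this.in = in;
        this.maxWidth = maxWidth;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = in.read(cbuf, off, len);
        if (n == -1) return -1;
        for (int i = 0; i < n; i++) {
            charsRead++;
            if (charsRead > maxWidth && cbuf[off + i] != ' ') {
                // A non-blank character beyond the limit: truncation is
                // not allowed, so abort without reading the rest.
                throw new IOException(
                    "String exceeds maximum width " + maxWidth);
            }
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

With such a wrapper, inserting an oversized stream into a ~32K VARCHAR would raise the error shortly after the limit is passed instead of after buffering the whole stream.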
I would say it's a separate issue from the one you are addressing.
Applications most likely won't be inserting 10GB values into CHAR/VARCHAR
columns using streams, as it's not going to work. Maybe enter a bug, but
it doesn't seem it has to be fixed as part of this issue.

> As far as I can tell, adding this feature is a matter of modifying the
> 'ReaderToUTF8' class and the
> 'EmbedPreparedStatement.setCharacterStreamInternal' method.
> One could also optimize the reading of data into LONG VARCHAR, where
> one would abort the reading as soon as possible instead of taking it
> all into memory first. This would require some special-case handling
> in the mentioned locations.
>
> Another matter is that streams will not be checked for an exact length
> match when using the lengthless overloads, as we don't have a
> specified length to compare against.
> I have a preliminary implementation for setAsciiStream,
> setCharacterStream and setClob (all without length specifications) in
> EmbedPreparedStatement.

What's the issue with setClob()? The current method doesn't take a
length. Has java.sql.Clob been changed to not always have a length?

Dan.
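Dan's point about setClob() can be illustrated directly: a java.sql.Clob always carries its own length via Clob.length(), so an implementation of setClob(int, Clob) never needs a caller-supplied length. The snippet below is a minimal demonstration using the standard SerialClob class, not Derby code.

```java
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

public class ClobLengthDemo {
    public static void main(String[] args) throws SQLException {
        // Any java.sql.Clob knows its own length.
        Clob clob = new SerialClob("hello world".toCharArray());
        long len = clob.length();                 // always available
        Reader data = clob.getCharacterStream();  // data can be streamed
        System.out.println(len);                  // prints 11
    }
}
```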
