Kristian Waagan wrote:
> Hello,
>
> I'm working on DERBY-1417; adding new lengthless overloads to the
> streaming API. So far, I have only been looking at implementing this in
> the embedded driver. Based on some comments in the code, I have a few
> questions and observations regarding truncation of trailing blanks in
> the various character data types.
>
> Type          Trail. blank trunc.   Where
> ====================================================================
> CHAR          allowed               SQLChar.normalize
> VARCHAR       allowed               SQLVarchar.normalize
> LONG VARCHAR  disallowed            SQLLongVarchar.normalize
> CLOB          allowed               streaming or
>                                     SQLVarchar.normalize, depending
>                                     on source.
>
> As can be seen, only data for CLOB is truncated for trailing blanks in
> the streaming class. We must still read all the data, or at least as
> much as we need to know the insertion will fail, but we don't have to
> store it all in memory.
>
> Truncation of trailing blanks is not allowed at all for LONG VARCHAR
> (according to code comments and bug 5592 - I haven't found the place
> this is stated in the specs).
>
> My question is: should we do the truncation check for CHAR and VARCHAR
> at the streaming level as well?
> If we don't add this feature, inserting a 10GB file into a VARCHAR
> field by mistake will cause 10GB to be loaded into memory even though
> the max size allowed is ~32K, possibly causing out-of-memory errors.
> The error could be generated at an earlier stage (possibly after
> reading ~32K + 1 bytes).
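Kristian's early-failure idea could be sketched as a Reader wrapper that fails as soon as the stream exceeds the column's maximum width with a non-blank character, while still tolerating excess trailing blanks (which may legally be truncated for CHAR/VARCHAR/CLOB). This is an illustrative sketch only; the class and names below are hypothetical, not Derby internals such as ReaderToUTF8.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical sketch: fail fast once the stream exceeds maxWidth,
// unless the excess characters are all blanks (truncatable).
class BoundedBlankTruncatingReader extends Reader {
    private final Reader in;
    private final int maxWidth;
    private long charsRead;

    BoundedBlankTruncatingReader(Reader in, int maxWidth) {
        this.in = in;
        this.maxWidth = maxWidth;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = in.read(cbuf, off, len);
        if (n == -1) return -1;
        for (int i = 0; i < n; i++) {
            charsRead++;
            if (charsRead > maxWidth && cbuf[off + i] != ' ') {
                // A non-blank character beyond the limit: truncation is
                // not allowed, so abort without reading the rest.
                throw new IOException(
                    "String exceeds maximum width " + maxWidth);
            }
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

With such a wrapper, inserting an oversized stream into a ~32K VARCHAR would raise the error shortly after the limit is passed instead of after buffering the whole stream.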
I would say it's a separate issue from the one you are addressing.
Applications most likely won't be inserting 10GB values into CHAR/VARCHAR
columns using streams, as it's not going to work. Maybe enter a bug, but
it doesn't seem it has to be fixed as part of this issue.

> As far as I can tell, adding this feature is a matter of modifying the
> 'ReaderToUTF8' class and the
> 'EmbedPreparedStatement.setCharacterStreamInternal' method.
> One could also optimize the reading of data into LONG VARCHAR, where
> one would abort the reading as soon as possible instead of taking it
> all into memory first. This would require some special-case handling
> in the mentioned locations.
>
> Another matter is that streams will not be checked for an exact length
> match when using the lengthless overloads, as we don't have a
> specified length to compare against.
> I have a preliminary implementation for setAsciiStream,
> setCharacterStream and setClob (all without length specifications) in
> EmbedPreparedStatement.

What's the issue with setClob()? The current method doesn't take a
length. Has java.sql.Clob been changed to not always have a length?

Dan.
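Dan's point about setClob() can be illustrated directly: a java.sql.Clob always carries its own length via Clob.length(), so an implementation of setClob(int, Clob) never needs a caller-supplied length. The snippet below is a minimal demonstration using the standard SerialClob class, not Derby code.

```java
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

public class ClobLengthDemo {
    public static void main(String[] args) throws SQLException {
        // Any java.sql.Clob knows its own length.
        Clob clob = new SerialClob("hello world".toCharArray());
        long len = clob.length();                 // always available
        Reader data = clob.getCharacterStream();  // data can be streamed
        System.out.println(len);                  // prints 11
    }
}
```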
