[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-28 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Derby Info:   (was: [Patch Available])

Committed patch 7a3 to trunk with revision 738408.

I think some code in SQLChar can be removed now, but I'll wait a little while 
before I start looking at it.
As noted earlier, databases created with revision 738408 or later containing 
Clobs cannot be used with earlier revisions any more.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-2c-header_write_preparation-PREVIEW.diff, 
> derby-3907-2c-header_write_preparation-PREVIEW.stat, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff, 
> derby-3907-4a-add_getStreamWithDescriptor.diff, 
> derby-3907-4a-add_getStreamWithDescriptor.stat, 
> derby-3907-5a-use_getStreamWithDescriptor.diff, 
> derby-3907-5a-use_getStreamWithDescriptor.stat, 
> derby-3907-6a-SQLClob_stream_descriptor_sync.diff, 
> derby-3907-7a-write_new_header_format-PREVIEW.diff, 
> derby-3907-7a-write_new_header_format.diff, 
> derby-3907-7a-write_new_header_format.diff, 
> derby-3907-7a-write_new_header_format.stat, 
> derby-3907-7a1-write_new_header_format.diff, 
> derby-3907-7a2-use_new_framework.diff, derby-3907-7a2-use_new_framework.stat, 
> derby-3907-7a3-use_new_header_format.diff, 
> derby-3907-7a3-use_new_header_format.stat
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-23 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-7a3-use_new_header_format.stat
derby-3907-7a3-use_new_header_format.diff

Patch 7a3 is the patch that enables the new header format.
I plan to commit this Tuesday or Wednesday next week.

Patch ready for review.

Remaining work:
 o Determine if the database version should be kept in the Database object.
It all depends on how expensive it is to consult the data dictionary about 
the version.
 o Investigate upgrade further to understand whether long columns can easily be 
upgraded with a new header or not.
 o (stretch task) Investigate how to add the functionality to update only the 
first few bytes of a data value.

NOTE: After this patch has been committed, databases containing CLOBs written 
with a Derby revision after the commit cannot be read with a Derby revision 
before the commit.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-22 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-7a2-use_new_framework.stat
derby-3907-7a2-use_new_framework.diff

Patch 'derby-3907-7a2-use_new_framework.diff' is the second part of 7a.
It prepares the code to deal with multiple stream header formats, but doesn't 
change the current behavior regarding stream headers.

Committed to trunk with revision 736636.

I will upload the next patch, but will wait a few days before I commit it to 
see if any problems are detected with patch 7a2.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-22 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-7a1-write_new_header_format.diff

I decided to split up patch 7a due to its size.
Patch 7a1 adds the new stream header generator classes (one interface, two 
implementations).

Committed to trunk with revision 736612.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-21 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-7a-write_new_header_format.diff

Replaced patch 7a (removed two lines of debugging code).




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-21 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Derby Info: [Patch Available]




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-21 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-7a-write_new_header_format.stat
derby-3907-7a-write_new_header_format.diff

The patch 'derby-3907-7a-write_new_header_format.diff' is the second attempt at 
implementing the handling of the new stream header format.

Hopefully, the performance of reading/writing CHAR, VARCHAR and LONG VARCHAR 
hasn't suffered, but I'll run some performance tests to confirm it. The 
performance for reading the old header format for Clobs (i.e. accessing old 
databases) may suffer a little bit, because in some situations too many bytes 
are read and the stream has to be reset before reading it again. I need to test 
this as well.
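
The reset cost mentioned above can be pictured with a small sketch. It assumes a reader that first tries the five-byte format and rewinds when it finds an old two-byte header. The marker value 0xF0, the class name and the method name are illustrative assumptions, not taken from the patch.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class HeaderPeek {
    // Hypothetical marker for a new-style header; the real value is not stated in this thread.
    static final int ASSUMED_MAGIC = 0xF0;
    static final int OLD_HEADER_LEN = 2;
    static final int NEW_HEADER_LEN = 5;

    /** Consumes the header and returns its length; the stream ends up just past the header. */
    static int skipHeader(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(NEW_HEADER_LEN);
        byte[] buf = new byte[NEW_HEADER_LEN];
        int total = 0;
        while (total < NEW_HEADER_LEN) {
            int n = in.read(buf, total, NEW_HEADER_LEN - total);
            if (n < 0) {
                break; // short stream, cannot hold a new-style header
            }
            total += n;
        }
        if (total == NEW_HEADER_LEN && (buf[0] & 0xFF) == ASSUMED_MAGIC) {
            return NEW_HEADER_LEN;
        }
        // Old format: too many bytes were consumed, so rewind and skip only two.
        // A real reader must also rule out old headers that happen to start with
        // the marker byte; this sketch ignores that case.
        in.reset();
        for (int i = 0; i < OLD_HEADER_LEN; i++) {
            if (in.read() < 0) {
                throw new EOFException("truncated header");
            }
        }
        return OLD_HEADER_LEN;
    }
}
```

The extra work for old databases is the reset plus the second read of the first two bytes.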

Description of the changes:
 * EmbedResultSet
   Adjusted the usage of the ReaderToUTF8Stream constructor. Note the usage of 
'setSoftUpgradeMode', which is required because the stream can be read and 
written when the context stack hasn't been set up. This is true for updatable 
result sets. When there is no context, the context service fails to obtain the 
DatabaseContext used by the generator to get to the data dictionary.

 * EmbedPreparedStatement
   Adjusted the usage of the ReaderToUTF8Stream constructor.

 * ArrayInputStream
   Added an argument to 'readDerbyUTF'. The stream header is now read outside 
of this method.

 * StreamHeaderGenerator
   Added an interface for generating stream headers.

 * CharStreamHeaderGenerator
   New class generating old-style headers (two bytes long). Always used for 
CHAR, VARCHAR and LONG VARCHAR. In addition, it is also used for CLOB in pre 
10.5 databases (i.e. soft upgrade mode).

 * ClobStreamHeaderGenerator
   New class generating new-style headers (five bytes long). Used only for 
CLOBs written into a 10.5 database. If an old-style header is needed, the work 
is delegated to CharStreamHeaderGenerator.

 * ReaderToUTF8Stream
   Added a StreamHeaderGenerator to the constructors, and updated the header 
writing logic to use it. Also added a constant to distinguish the first 
invocation of 'fillBuffer'. The header is generated on the first invocation, 
and possibly updated again in 'checkSufficientData'.

 * StringDataValue
   Replaced method 'generateStreamHeader(long)' with 
'getStreamHeaderGenerator()'. Added method 'setSoftUpgradeMode'. The latter is 
used in situations where the generator itself is unable to determine whether 
the database being written into is in soft upgrade mode.

 * SQLChar
   Factored out code to write the modified UTF-8 format (see 'writeUTF'). 
Updated 'writeExternal', which will now only be invoked for non-CLOB data 
values. Added method 'writeClobUTF', which is used to write CLOB data values. 
It is kept in SQLChar to avoid having to make more of the internal state 
available to the subclasses. Added a second version of 'readExternal', which is 
the one doing the actual work. It takes both a byte count and a char count, 
where both can be unknown. Implemented the new method in StringDataValue.

 * SQLClob
   Added variable 'inSoftUpgradeMode', which tells if the DVD is used in a 
database being in soft upgrade mode or not. This must be known to generate the 
correct header format.  Note that this may be unknown, in which case the header 
generator itself will try to determine the mode. Implemented 'getLength', which 
will obtain the length from the stream header, delegate the work to 
'SQLChar.getLength' if the value is not a stream, or decode the stream data if 
the length is not stored in the header. The data value is not materialized. 
Added support to read both stream header formats in 'getStreamWithDescriptor'. 
Implemented 'investigateStream' to decode the header.  Added 'writeExternal', 
'readExternal' and 'readExternalFromArray'. In general, some preparation steps 
are taken and then the work is delegated to SQLChar. Added utility class 
HeaderInfo.

 * StreamHeaderHolder
   Deleted the class.

 * UTF8UtilTest
   Updated code to use the new generator class.
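
The relationship between the two generator classes described above can be sketched roughly as follows. This is not the committed code: the interface and class names are stand-ins, and the method signature, the 0xF0 marker, the byte order and the use of 0 for "unknown length" are all assumptions made for illustration.

```java
// Stand-in for the StreamHeaderGenerator interface; the method name is assumed.
interface HeaderGenerator {
    /** Writes a header for the given length (negative means unknown); returns bytes written. */
    int generateInto(byte[] buf, int offset, long length);
}

// Stand-in for CharStreamHeaderGenerator: old-style, two bytes.
final class TwoByteHeaderGenerator implements HeaderGenerator {
    @Override
    public int generateInto(byte[] buf, int offset, long length) {
        // Lengths that do not fit in two bytes are written as 0 (assumed "unknown").
        int v = (length >= 0 && length <= 0xFFFF) ? (int) length : 0;
        buf[offset] = (byte) (v >>> 8);
        buf[offset + 1] = (byte) v;
        return 2;
    }
}

// Stand-in for ClobStreamHeaderGenerator: new-style, five bytes.
final class FiveByteHeaderGenerator implements HeaderGenerator {
    static final byte ASSUMED_MAGIC = (byte) 0xF0; // hypothetical marker byte
    private final boolean softUpgradeMode;
    private final HeaderGenerator oldStyle = new TwoByteHeaderGenerator();

    FiveByteHeaderGenerator(boolean softUpgradeMode) {
        this.softUpgradeMode = softUpgradeMode;
    }

    @Override
    public int generateInto(byte[] buf, int offset, long length) {
        if (softUpgradeMode) {
            // Pre-10.5 database: delegate to the old-style generator.
            return oldStyle.generateInto(buf, offset, length);
        }
        int v = (length >= 0 && length <= Integer.MAX_VALUE) ? (int) length : 0;
        buf[offset] = ASSUMED_MAGIC;
        buf[offset + 1] = (byte) (v >>> 24);
        buf[offset + 2] = (byte) (v >>> 16);
        buf[offset + 3] = (byte) (v >>> 8);
        buf[offset + 4] = (byte) v;
        return 5;
    }
}
```

Keeping the delegation inside the CLOB generator means callers only hold one generator object and never have to pick a format themselves.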

Patch ready for review.
I have run the regression tests without failures, but due to a small last 
minute change I will run them again tonight.
I have also tried reading and writing Clob from a 10.4 database manually, both 
in soft and hard upgrade mode.

Based on Mike's suggestion, I was hoping that a table compress would update the 
old headers to the new header format after a hard upgrade. This is in principle 
correct, but the new header is written with "unknown length" encoded. I haven't 
investigated how to best solve this problem.
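
A header that encodes "unknown length" forces the reader back to decoding the data, which is the fallback described for 'getLength' above. A minimal sketch of that two-step lookup, assuming a five-byte header with a hypothetical 0xF0 marker, a big-endian character count, and 0 meaning unknown (none of which is confirmed here):

```java
final class ClobLengthSketch {
    static final int ASSUMED_MAGIC = 0xF0; // hypothetical marker byte
    static final int HEADER_LEN = 5;

    /** Returns the character count of a stored value without building a String. */
    static long charLength(byte[] stored) {
        if (stored.length < HEADER_LEN || (stored[0] & 0xFF) != ASSUMED_MAGIC) {
            throw new IllegalArgumentException("not a new-style header");
        }
        long fromHeader = ((stored[1] & 0xFFL) << 24)
                | ((stored[2] & 0xFFL) << 16)
                | ((stored[3] & 0xFFL) << 8)
                | (stored[4] & 0xFFL);
        if (fromHeader > 0) {
            return fromHeader; // cheap path: the header already knows the length
        }
        // Unknown (or empty): count characters by scanning the encoded bytes.
        // In modified UTF-8 every Java char starts with a byte that is not a
        // continuation byte (10xxxxxx), so counting those bytes counts chars.
        long chars = 0;
        for (int i = HEADER_LEN; i < stored.length; i++) {
            if ((stored[i] & 0xC0) != 0x80) {
                chars++;
            }
        }
        return chars;
    }
}
```

The scan is linear in the size of the value, which is the cost the stored length is meant to avoid.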


[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-16 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-7a-write_new_header_format-PREVIEW.diff

Patch 'derby-3907-7a-write_new_header_format-PREVIEW.diff' enables the new 
header format.

I ran into some problems with obtaining the version of the database being 
written into, so I had to change where the header is generated. To be able to 
use the context service to gain access to the data dictionary, there must be 
a context. The context is not pushed in, for instance, 
EmbedPreparedStatement.setCharacterStream. To solve this, I added the 
requirement that StringDataValue.generateStreamHeader has to be invoked when a 
context is set up.

The new approach is to generate the header when the store is asking for the 
data (during execute), which happens in ReaderToUTF8Stream.fillBuffer. The 
downside of this approach is that ReaderToUTF8Stream now takes a StringDataValue 
as an argument in the constructor. This makes a lot more code available in the 
reader, and it also makes the reader harder to test.

I also considered adding a method to tell the DVD the version of the database, 
but I think it will be hard to make Derby invoke this method in all valid 
use-cases, and it breaks with the pattern used in the existing classes.
A second option is to add a stream header generation object, which can be 
passed in to ReaderToUTF8Stream. I think this can be done by 
modifying/replacing the StreamHeaderHolder, and I think it can be done easily. 
The difference is that the header generation will be postponed.
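
The postponed generation could look something like the sketch below. It is only an illustration of the idea: the header source is modelled as a plain java.util.function.Supplier, and the class name is hypothetical. The point is that nothing is asked of the generator until the first byte is requested, by which time a context is expected to be in place.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

final class DeferredHeaderStream extends InputStream {
    // Hypothetical stand-in for a header generation object passed in by the caller.
    private final Supplier<byte[]> headerSource;
    private final InputStream payload;
    private byte[] header; // stays null until the first read
    private int headerPos;

    DeferredHeaderStream(Supplier<byte[]> headerSource, InputStream payload) {
        this.headerSource = headerSource;
        this.payload = payload;
    }

    @Override
    public int read() throws IOException {
        if (header == null) {
            // First invocation: only now is the header generated.
            header = headerSource.get();
        }
        if (headerPos < header.length) {
            return header[headerPos++] & 0xFF;
        }
        return payload.read();
    }
}
```

Because the stream depends only on a small generator object, it can be tested with a stub instead of a full data value.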

I'll continue the work on Monday, and I will most likely post another patch 
implementing the approach with a separate class generating the header.

Please comment if you think I'm heading down the wrong road.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-15 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-6a-SQLClob_stream_descriptor_sync.diff

Attached a bug fix (6a) for a problem where the stream could get out of sync 
with the descriptor.

Committed to trunk with revision 734758.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-15 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Derby Info:   (was: [Patch Available])

Committed patch 5a to trunk with revision 734630.

A quick comment on the following piece of code from 
EmbedResultSet.getCharacterStream:
-
    CharacterStreamDescriptor csd = dvd.getStreamWithDescriptor();

    if (csd == null) {
        String val = dvd.getString();
        if (lmfs > 0) {
            if (val.length() > lmfs)
                val = val.substring(0, lmfs);
        }
        java.io.Reader ret = new java.io.StringReader(val);
        currentStream = ret;
        return ret;
    }

    // See if we have to enforce a max field size.
    if (lmfs > 0) {
        csd = new CharacterStreamDescriptor.Builder().copyState(csd).
                maxCharLength(lmfs).build();
    }
    java.io.Reader ret = new UTF8Reader(csd, this, syncLock);
-

The last "if (lmfs > 0)" statement will never be reached as long as 
SQLChar.getStreamWithDescriptor() returns null, because the value is then 
materialized (SQLChar.getString()) and returned from the first branch.
We could allow non-Clob values to be treated as streams as well, but I'm not sure 
it is worth it due to their limited size (max 32700 chars).
Opinions?
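
For comparison with the substring call in the materialized branch, here is a rough sketch of how a character cap can be enforced on the streaming path. It uses a plain FilterReader subclass with a hypothetical name; in the excerpt the limit is carried by the descriptor (maxCharLength) and enforced by the reader built from it, so this only illustrates the idea.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

final class MaxCharsReader extends FilterReader {
    private long remaining; // characters that may still be returned

    MaxCharsReader(Reader in, long maxChars) {
        super(in);
        this.remaining = maxChars;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // cap reached: report end of stream
        }
        int c = super.read();
        if (c >= 0) {
            remaining--;
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }
    // Note: skip() is not overridden here, so skipped characters are not counted.
}
```

Unlike substring, the cap is applied while reading, so the full value never has to be held in memory.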




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-13 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-5a-use_getStreamWithDescriptor.diff
derby-3907-5a-use_getStreamWithDescriptor.stat

Patch 'derby-3907-5a-use_getStreamWithDescriptor.diff' puts the new 
StringDataValue.getStreamWithDescriptor() to use.

Description of changes:
 o EmbedClob
   Changed constructor to take a StringDataValue instead of a 
DataValueDescriptor.
   Updated call to the StoreStreamClob constructor.

 o EmbedResultSet
   Started using the getStreamWithDescriptor method and updated invocations of 
the UTF8Reader constructor.

 o StoreStreamClob
   Added a CharacterStreamDescriptor, and made the constructor take one as an 
argument.
   Adapted the class to use a CSD.

 o UTF8Reader
   Updated some comments.
   Fixed bug where the header length wasn't added to the byte length of the 
stream, and updated the class appropriately (adjusted utfCount, fixed the reset 
routine).
   Made sure the header bytes are skipped (either by skipping them in the 
constructor or by adjusting the position on the next reposition).

 o ResultSetStreamTest
   Added a test for maxFieldSize, where truncation has to happen.

 o Various tests
   Adjusted tests to run with the new implementation.
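
To illustrate the UTF8Reader fix described above, here is a minimal sketch of 
skipping a stream header before decoding, and of counting the header in the 
stored byte length. It is not the code from the patch: the class and method 
names are hypothetical, and standard UTF-8 stands in for Derby's own encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Illustration only; this is not Derby's UTF8Reader. */
final class HeaderSkipSketch {

    /** Consumes exactly headerLength bytes so decoding starts at the data. */
    static void skipHeader(InputStream raw, int headerLength) throws IOException {
        int remaining = headerLength;
        while (remaining > 0) {
            long skipped = raw.skip(remaining);
            if (skipped <= 0) {
                // skip() is allowed to make no progress; read one byte instead.
                if (raw.read() < 0) {
                    throw new IOException("stream ended inside the header");
                }
                skipped = 1;
            }
            remaining -= (int) skipped;
        }
    }

    /** Total bytes occupied by the stored value: header plus encoded data. */
    static long storedByteLength(int headerLength, long dataByteLength) {
        return headerLength + dataByteLength;
    }

    /** Decodes the character data that follows the header. */
    static String readData(byte[] stored, int headerLength) {
        try (InputStream raw = new ByteArrayInputStream(stored)) {
            skipHeader(raw, headerLength);
            Reader reader = new InputStreamReader(raw, StandardCharsets.UTF_8);
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[256];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A reposition would follow the same rule: seek to the header length rather than 
to byte zero before decoding resumes.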

Patch ready for review.
Regression tests ran cleanly, but due to a last-minute change I have to rerun 
them. I will post the results tomorrow.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-2c-header_write_preparation-PREVIEW.diff, 
> derby-3907-2c-header_write_preparation-PREVIEW.stat, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff, 
> derby-3907-4a-add_getStreamWithDescriptor.diff, 
> derby-3907-4a-add_getStreamWithDescriptor.stat, 
> derby-3907-5a-use_getStreamWithDescriptor.diff, 
> derby-3907-5a-use_getStreamWithDescriptor.stat
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-13 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Derby Info: [Patch Available]

Committed patch 4a to trunk with revision 734148.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-2c-header_write_preparation-PREVIEW.diff, 
> derby-3907-2c-header_write_preparation-PREVIEW.stat, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff, 
> derby-3907-4a-add_getStreamWithDescriptor.diff, 
> derby-3907-4a-add_getStreamWithDescriptor.stat
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-13 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-4a-add_getStreamWithDescriptor.diff
derby-3907-4a-add_getStreamWithDescriptor.stat

Patch 'derby-3907-4a-add_getStreamWithDescriptor.diff' adds the method 
getStreamWithDescriptor to StringDataValue.

It is intended to be used when getting a stream from a StringDataValue to be 
used with a Clob object, or with streaming of string data values in general. 
The DVD is responsible for returning a correct descriptor for the raw stream. 
The descriptor is in turn used by other classes to correctly configure 
themselves with respect to data offsets, buffering, repositioning and so on.
This patch was part of a bigger patch, but I decided to split it into two to 
make it easier to review.

Patch description:
 o CharacterStreamDescriptor
   Added a toString method and more verbose assert-messages.

 o StringDataValue
   Added method 'CharacterStreamDescriptor getStreamWithDescriptor()'.

 o SQLChar
   Made setStream non-final so it can be overridden in SQLClob.
   Added default implementation of getStreamWithDescriptor that always returns 
null. This means that all non-Clob string data types will be handled as strings 
instead of streams in situations where a stream is requested through 
getStreamWithDescriptor. I'll look into the performance implications of this a 
little later, when more of the final code is in place.
   Made throwStreamingIOException protected to access it from SQLClob.

 o SQLClob
   Implemented getStreamWithDescriptor, handling the old 2-byte format only.
   Overrode setStream to discard the stream descriptor when a new stream is set 
for the DVD.
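
The default-versus-override arrangement described above can be sketched as 
follows. This is a simplified illustration, not the patch itself: all class 
names are hypothetical stand-ins, the descriptor carries only a data offset, 
and standard UTF-8 replaces Derby's own encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified stand-in for CharacterStreamDescriptor. */
final class DescriptorSketch {
    final InputStream stream; // raw stream, header included
    final long dataOffset;    // number of header bytes before the data

    DescriptorSketch(InputStream stream, long dataOffset) {
        this.stream = stream;
        this.dataOffset = dataOffset;
    }
}

/** Stand-in for a non-Clob string type such as SQLChar. */
class CharValueSketch {
    protected final String value;

    CharValueSketch(String value) {
        this.value = value;
    }

    /** Default: no stream is offered, so callers fall back to the string. */
    DescriptorSketch getStreamWithDescriptor() {
        return null;
    }

    String getString() {
        return value;
    }
}

/** Stand-in for SQLClob, which describes its raw stream. */
class ClobValueSketch extends CharValueSketch {
    static final int OLD_HEADER_LENGTH = 2; // the old 2-byte format

    ClobValueSketch(String value) {
        super(value);
    }

    @Override
    DescriptorSketch getStreamWithDescriptor() {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        byte[] raw = new byte[OLD_HEADER_LENGTH + data.length];
        System.arraycopy(data, 0, raw, OLD_HEADER_LENGTH, data.length);
        return new DescriptorSketch(new ByteArrayInputStream(raw), OLD_HEADER_LENGTH);
    }
}

final class DescriptorCallerSketch {
    /** Shows how a caller can branch on the null default. */
    static String describe(CharValueSketch v) {
        DescriptorSketch d = v.getStreamWithDescriptor();
        return d == null ? "string" : "stream, data offset " + d.dataOffset;
    }
}
```

The point of the null default is that only the Clob type has to know how its 
raw stream is laid out; every other caller keeps working with plain strings.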

Patch ready for review.
I will commit this shortly, but since the code isn't used yet it should be 
harmless. That shouldn't stop any reviewers though!
I'll also post the next patch shortly.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-2c-header_write_preparation-PREVIEW.diff, 
> derby-3907-2c-header_write_preparation-PREVIEW.stat, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff, 
> derby-3907-4a-add_getStreamWithDescriptor.diff, 
> derby-3907-4a-add_getStreamWithDescriptor.stat
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-13 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

   Derby Info:   (was: [Patch Available])
Fix Version/s: 10.5.0.0

Committed patch 2c to trunk with revision 734065.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-2c-header_write_preparation-PREVIEW.diff, 
> derby-3907-2c-header_write_preparation-PREVIEW.stat, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-12 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-2c-header_write_preparation.diff

Attached an updated version of patch 2c. The only change is in 
ReaderToUTF8Stream.checkSufficientData, where I had used the wrong header 
holder to determine if an EOF marker was to be written to the stream.

With the exception of one failure in UTF8UtilTest, the regression tests ran 
successfully with JDK 1.6 on Solaris.
I'm rerunning with the updated patch.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-2c-header_write_preparation-PREVIEW.diff, 
> derby-3907-2c-header_write_preparation-PREVIEW.stat, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-12 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-2c-header_write_preparation.stat
derby-3907-2c-header_write_preparation.diff

Patch 'derby-3907-2c-header_write_preparation.diff' is ready for review.

A description of the changes:
 o EmbedResultSet and EmbedPreparedStatement
   Started using the new ReaderToUTF8Stream constructor, where the stream 
header is passed in. Also started to treat the DataValueDescriptor as a 
StringDataValue, which should always be the case at this point in the code.

 o ReaderToUTF8Stream
   Added field 'header', which holds a StreamHeaderHolder coming from a 
StringDataValue object. Updated the constructors with a new argument.  The 
first execution of fillBuffer now uses the header holder to obtain the header 
length, and the header holder object is consulted when checking if the header 
can be updated with the length after the application stream has been drained. 
Note that updating the header with a character count is not yet supported. This 
will be added in another patch together with the handling of the new header 
format.

 o StringDataValue
   Added new method generateStreamHeader.

 o SQLChar
   Implemented generateStreamHeader, which always returns a header for a stream 
with unknown length (see the constant). In most cases, the header will be 
updated after the stream has been drained. The exception is when the data 
values for VARCHAR and LONG VARCHAR are too long to be held in the byte buffer 
in ReaderToUTF8Stream, which can only happen when the string contains 
characters that require a two or three byte representation.

 o SQLClob
   Added a constant for a 10.5 stream header holder representing a stream with 
unknown character length. Also updated the use of the ReaderToUTF8Stream 
constructor.

 o StreamHeaderHolder
   Holder object for a stream header, containing the header itself and the 
following additional information: "instructions" on how to update the header 
with a new length, whether the length is expected to be in number of bytes or 
characters, and whether an EOF marker is expected to be appended to the stream. 
The object is considered immutable, but it does not copy the byte arrays passed 
in to the constructor. I found this unnecessary because this code is being 
called by internal code only, but should it still do defensive copying?

 o UTF8UtilTest
   Updated usage of the ReaderToUTF8Stream constructor, and replaced the 
hardcoded byte count to skip with a call to the header holder object.

 o ClobTest (jdbc4)
   Added some simple tests inserting and fetching Clobs to test the basics of 
stream header handling.

 o StreamTruncationTest
   New test testing truncation of string data values when they are inserted as 
streams. Note the not-so-elegant error reporting (see catch clause in 
insertSmall)...
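
For the defensive-copying question raised under StreamHeaderHolder, a minimal 
sketch of what a copying, immutable holder could look like is shown below. It 
is not the class from the patch: the name, the accessors and the big-endian 
length encoding are assumptions made for illustration.

```java
/** Illustration only; not Derby's StreamHeaderHolder. */
final class HeaderHolderSketch {
    private final byte[] header;
    private final boolean lengthInChars;
    private final boolean writeEof;

    HeaderHolderSketch(byte[] header, boolean lengthInChars, boolean writeEof) {
        if (header.length < 1 || header.length > 8) {
            throw new IllegalArgumentException("header must be 1 to 8 bytes");
        }
        this.header = header.clone(); // defensive copy on the way in
        this.lengthInChars = lengthInChars;
        this.writeEof = writeEof;
    }

    int headerLength() {
        return header.length;
    }

    boolean expectsCharLength() {
        return lengthInChars;
    }

    boolean writeEof() {
        return writeEof;
    }

    /** Defensive copy on the way out, so callers cannot alter the holder. */
    byte[] copyOfHeader() {
        return header.clone();
    }

    /**
     * Returns a new header with the length written big-endian (an assumed
     * encoding), or null when the length does not fit in the header.
     */
    byte[] headerWithLength(long length) {
        int n = header.length;
        boolean fits = length >= 0 && (n == 8 || (length >>> (8 * n)) == 0);
        if (!fits) {
            return null;
        }
        byte[] out = new byte[n];
        for (int i = 0; i < n; i++) {
            out[n - 1 - i] = (byte) (length >>> (8 * i));
        }
        return out;
    }
}
```

With a two-byte header, any length above 65535 is rejected, which is the 
limitation the issue description refers to. The copies cost one small array 
allocation each, which is the trade-off against trusting internal callers.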

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-2c-header_write_preparation-PREVIEW.diff, 
> derby-3907-2c-header_write_preparation-PREVIEW.stat, 
> derby-3907-2c-header_write_preparation.diff, 
> derby-3907-2c-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-09 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-2c-header_write_preparation-PREVIEW.stat
derby-3907-2c-header_write_preparation-PREVIEW.diff

Attached 'derby-3907-2c-header_write_preparation-PREVIEW.diff'.
This is an early version of revision 2c. In general, Derby should behave as 
before, but it should now have the framework it needs to handle multiple header 
formats for the streams coming in.
This is only a partial solution; Derby must also learn how to handle multiple 
header formats for non-streaming situations. I expect most of these changes 
will come in SQLChar and SQLClob.
Finally, the new header format must be added.

I'm thinking about making StreamHeaderHolder immutable again, because the 
holder for a 10.4 header for a stream with unknown length will be used a lot. 
That way we can keep an instance in SQLChar that can be shared. I'll look some 
more at it and comment further.

Feel free to have a look and give me some early feedback :)

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-2c-header_write_preparation-PREVIEW.diff, 
> derby-3907-2c-header_write_preparation-PREVIEW.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-08 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-2b-header_write_preparation.diff

Refreshed patch 2b, as the old one doesn't apply any more due to other changes.
It has a few changes from the original version. The remaining comments from 
Knut Anders will be addressed in revision 2c.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-08 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-3b-readertoutf8stream_cleanup.diff

Attaching patch 3b.
I had to pull parts of the patch out. The error reporting for truncation 
failure during insert from a stream isn't adequate, and requires additional 
work. I filed DERBY-4005 for some of it, but I expect there will be another 
Jira as well.

The patch now changes ReaderToUTF8Stream and UTF8UtilTest only. See commit 
message for more detailed description of changes.

Committed 3b to trunk with revision 732676.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat, 
> derby-3907-3b-readertoutf8stream_cleanup.diff
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-07 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-3a-readertoutf8stream_cleanup.diff

Wrong diff file... Reattaching 3a.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-07 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-3a-readertoutf8stream_cleanup.stat
derby-3907-3a-readertoutf8stream_cleanup.diff

Patch 3a cleans up ReaderToUTF8Stream and changes the behavior in one way.
It will now try to truncate CHAR and VARCHAR as well as CLOB. Only spaces are 
accepted as truncatable characters.
Truncation is disallowed for LONG VARCHAR.
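
The truncation rule above can be sketched on a plain string. The patch works 
on streams and reports errors through Derby's own exceptions; this simplified, 
hypothetical helper only shows the rule itself and uses 
IllegalArgumentException as a placeholder.

```java
/** Illustration only; the real logic lives in ReaderToUTF8Stream. */
final class TruncationSketch {

    /**
     * Shortens value to maxLength if everything past the limit is spaces.
     * typeName is the SQL type name, e.g. "CHAR", "VARCHAR" or "CLOB".
     */
    static String truncateTrailingSpaces(String value, int maxLength, String typeName) {
        if (value.length() <= maxLength) {
            return value; // nothing to truncate
        }
        if ("LONG VARCHAR".equals(typeName)) {
            throw new IllegalArgumentException(
                    "truncation is not allowed for LONG VARCHAR");
        }
        for (int i = maxLength; i < value.length(); i++) {
            if (value.charAt(i) != ' ') {
                // Report lengths only; the content may be very large.
                throw new IllegalArgumentException(
                        "cannot truncate " + typeName + " of length "
                        + value.length() + " to " + maxLength);
            }
        }
        return value.substring(0, maxLength);
    }
}
```

For example, "abc  " fits a VARCHAR(3) after truncation, while "abcd" does not.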

In addition, I added some substance to the error message when truncation fails. 
I decided to not print the stream content, as it is not easily accessible and 
it may be very large.
I also added quite a bit of JavaDoc/comments, removed the instance variable 
maximumLength and simplified the constructors.
Since the type name is used to determine if truncation is allowed or not, I 
added a debug block verifying the name. As a consequence of this, I had to 
modify UTF8UtilTest.

Regression tests ran with two failures (stressmulti and replication), and due 
to some changes I'm re-running the tests.
Patch ready for review.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat, 
> derby-3907-3a-readertoutf8stream_cleanup.diff, 
> derby-3907-3a-readertoutf8stream_cleanup.stat
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-05 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-2b-header_write_preparation.stat
derby-3907-2b-header_write_preparation.diff

'derby-3907-2b-header_write_preparation.diff' is a preparation patch to lay the 
foundation for writing the new stream header format. Essentially, the behavior 
should be the same before and after the patch.

The patch adds a new method to the interface StringDataValue, since the header 
format will only apply to string data types. The method generateStreamHdr 
creates the stream header based on the version of the dictionary and the 
character length information, and also determines if the stream has to be 
ended with a Derby-specific end-of-stream marker.
The current implementation simply returns unknown length and instructs to 
terminate the value with a Derby EOF marker. This is because Derby pre 10.5 
expects a byte count, not a char count. At this level, the byte count is 
generally unknown. The method generateStreamHdr will be overridden in SQLClob, 
and the new header format will be used there, unless we are running in 
soft-upgrade mode where the old format will be used.
The goal is that the knowledge about the exact format of the headers is 
contained in the DVD(s).
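
A rough sketch of the decision described above: pick a header based on whether 
the old format must be kept, and decide whether an EOF marker is needed. The 
new-format layout used here (one marker byte followed by a four-byte character 
count) is a placeholder for illustration only, since the exact format is still 
open in this thread; all names are hypothetical.

```java
/** Illustration only; not the generateStreamHdr implementation. */
final class StreamHeaderSketch {
    static final byte PLACEHOLDER_MARKER = (byte) 0xF0; // assumed value

    final byte[] header;
    final boolean writeEofMarker;

    private StreamHeaderSketch(byte[] header, boolean writeEofMarker) {
        this.header = header;
        this.writeEofMarker = writeEofMarker;
    }

    /**
     * @param softUpgrade true when the old on-disk format must be kept
     * @param charLength  character count; zero or negative means unknown
     */
    static StreamHeaderSketch generate(boolean softUpgrade, long charLength) {
        if (softUpgrade) {
            // Old format: two bytes meant for a byte count. That count is
            // generally unknown here, so write zeros and end with a marker.
            return new StreamHeaderSketch(new byte[2], true);
        }
        boolean known = charLength > 0 && charLength <= Integer.MAX_VALUE;
        int len = known ? (int) charLength : 0;
        byte[] h = {
            PLACEHOLDER_MARKER,
            (byte) (len >>> 24), (byte) (len >>> 16),
            (byte) (len >>> 8), (byte) len
        };
        // A known character count makes the EOF marker unnecessary.
        return new StreamHeaderSketch(h, !known);
    }
}
```

Keeping this choice in one method is what lets the knowledge about the exact 
header format stay inside the data value class.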

In ReaderToUTF8Stream I removed the instance variable maximumLength, because it 
is only needed in the constructor. Further, I added setHeader to allow the 
header to be overridden after the reader is instantiated. If the header isn't 
overridden, the reader should behave as before.

Note that things *may* have to be changed a bit if support for updating the 
header only is added (i.e. the lengthless scenario).

Patch 2b ready for review.

FYI, patch 2a is almost identical, except that it doesn't use a utility/holder 
class for the header. I think using the utility class is cleaner, but it does 
of course introduce yet another class.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff, 
> derby-3907-2b-header_write_preparation.diff, 
> derby-3907-2b-header_write_preparation.stat
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2009-01-05 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Derby Info: [Patch Available]




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2008-11-26 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: (was: derby-3907-alternative_approach.diff)

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-1a-alternative_approach.diff
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2008-11-26 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-1a-alternative_approach.diff

Attached the wrong file. Adding the correct one.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2008-11-26 Thread Kristian Waagan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-3907:
---

Attachment: derby-3907-alternative_approach.diff

I got stuck trying to implement the original solution, so I tried an 
alternative approach.

It is a lot simpler, but people might not like it. Note however, that it 
follows roughly the same pattern as Blob.
Note the patch is a quick mash-up, and I want some feedback from the community.

The alternative approach is to make all classes writing and reading data from 
store able to peek at it and determine which format it has to use to read/write 
the data.
Including my second format, we have these two byte formats:
 - current: D1_D2_DATA
 - new: D4_D3_M_D2_D1_DATA

M is a magic byte, and is used to detect the new format. It is an illegal UTF-8 
encoding, so it should not be possible to misinterpret a new-format header as 
the current format followed by data.
I have set M to 0xF0, but I'm masking out the last four bits when looking for 
the magic byte. This leaves room for more formats later (sixteen distinct 
values of M in total), should that be necessary; the main point is to keep the 
four highest bits set.
With respect to data corruption (i.e. one bit getting flipped), is this 
approach safe enough?

So if we need to be able to store huge Clobs in the future, we could change M 
and use another format:
 - future: D6_D5_M_D4_D3_D2_D1_DATA
The same approach could be used to store other meta information.
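
To make the peek-and-detect idea above concrete, here is a rough sketch. It is 
not the code in the patch: the class and method names are made up, and the byte 
order (D4 most significant in the new format, a big-endian two-byte length in 
the current format) is an assumption for the example.

```java
/** Hypothetical sketch of header detection; not taken from the patch. */
final class ClobHeaderSketch {
    /** Magic byte M; only the four highest bits are significant. */
    static final int MAGIC = 0xF0;
    static final int MAGIC_MASK = 0xF0;

    /** New format is D4 D3 M D2 D1 DATA, so M is the third byte. */
    static boolean isNewFormat(byte[] stored) {
        return stored.length >= 5 && (stored[2] & MAGIC_MASK) == MAGIC;
    }

    /** Reads the length field of whichever format is detected. */
    static long readLength(byte[] stored) {
        if (isNewFormat(stored)) {
            // Assumed order: D4 is the most significant byte.
            return ((long) (stored[0] & 0xFF) << 24)
                    | ((stored[1] & 0xFF) << 16)
                    | ((stored[3] & 0xFF) << 8)
                    | (stored[4] & 0xFF);
        }
        // Current format: two length bytes, assumed big-endian.
        return ((stored[0] & 0xFF) << 8) | (stored[1] & 0xFF);
    }

    /** Builds a new-format header for the given length. */
    static byte[] newHeader(long length) {
        return new byte[] {
            (byte) (length >>> 24), (byte) (length >>> 16),
            (byte) MAGIC,
            (byte) (length >>> 8), (byte) length
        };
    }
}
```

The third byte is the interesting one because in the current format it is the 
first data byte, which the argument above says can never have its four highest 
bits set.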

The patch 'derby-3907-alternative_approach.diff' only changes behavior for 
small Clobs. To enable a new format for a larger Clob, the streaming classes 
have to be changed (ReaderToUTF8Stream, UTF8Reader).
It should be noted that these classes are used to write other character types 
(CHAR, VARCHAR) as well, and I do not intend to change how they are 
represented. This means that I have to include enough information to be able to 
do the correct thing.

While the format can be detected on read, an informed decision must be made on 
write. Now I'm consulting the data dictionary to check the database version, 
and if it is less than 10.5 I use the old format. Is there a better way?
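
As a minimal sketch of that write-side decision (the enum and method are 
hypothetical, and how the version is actually obtained from the data dictionary 
is deliberately left out):

```java
/** Hypothetical sketch of choosing the header format on write. */
final class ClobWriteFormatSketch {
    enum HeaderFormat { TWO_BYTE, MAGIC_FIVE_BYTE }

    /**
     * The version numbers are assumed to come from the data dictionary;
     * here they are plain parameters.
     */
    static HeaderFormat chooseForWrite(int dbMajor, int dbMinor) {
        boolean atLeast10_5 = dbMajor > 10 || (dbMajor == 10 && dbMinor >= 5);
        return atLeast10_5 ? HeaderFormat.MAGIC_FIVE_BYTE : HeaderFormat.TWO_BYTE;
    }
}
```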


Regarding the original approach, I got stuck because the upper layers of Derby 
are sending down NULL values of the data types into store. The upper layers 
don't have any context information, and are unable to choose the correct 
implementation. The system doesn't seem to be set up for having multiple 
implementations of a single data type at this level.
I ended up with a series of hacks, for instance having store override the Clob 
implementation type, but it just didn't work very well. At one point I had 
normal, soft- and hard-upgraded working, but compress table failed. I'm sure 
this isn't the only path that will fail.

I might pick up the work again later, but right now I want to wait for a while 
and work on other issues.

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
> Attachments: derby-3907-alternative_approach.diff
>
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2008-11-05 Thread Mike Matrigali (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Matrigali updated DERBY-3907:
--


The updating discussion has been a bit confusing. I am concentrating on the 
store interface to update an object on disk. I realize there are a number of 
in-memory clob update paths, but that is a different issue. Currently there is 
no way to update "part" of the object in the store interface, only the whole 
object.

So without store changes your normal use case for the lengthless updates would 
require the entire clob to be rewritten to disk.

I think easiest would be to support a new store interface that allowed the 
update of the first n bytes by exactly the same number of bytes. Currently 
for long clobs the usual case would be that we used all the bytes on the 1st 
page for the first piece of the linked list, so expanding by any number of bytes 
would be a worst case. We would log at least the entire "old" portion of the 
row. If the lengths don't exactly match there is more logging complication. 
So my guess at ease of implementation, from easiest to hardest, for supporting 
an update of some subset of bytes:
o exact leading number of bytes
o shrinking leading number of bytes
o expanding leading number of bytes
o arbitrary positioned number of bytes

This probably would mean a new log record, and thus we need to be careful again 
about hard/soft upgrade. We have done log record updates (or new ones) in Derby, 
so an example exists.
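
To illustrate the easiest case in the list above (exact leading number of 
bytes), here is a rough in-memory sketch. It is not an existing store 
interface; the class and method names are invented for the example, and a 
plain byte array stands in for the stored row.

```java
import java.util.Arrays;

/** Hypothetical sketch: overwrite the first n bytes with exactly n bytes. */
final class LeadingBytesUpdateSketch {
    private final byte[] stored;

    LeadingBytesUpdateSketch(byte[] initial) {
        this.stored = initial.clone();
    }

    /**
     * Replaces the leading bytes in place and returns the old bytes,
     * i.e. the "old" portion that a log record would have to carry.
     */
    byte[] replaceLeadingBytes(byte[] replacement) {
        if (replacement.length > stored.length) {
            throw new IllegalArgumentException("replacement longer than stored data");
        }
        byte[] before = Arrays.copyOfRange(stored, 0, replacement.length);
        System.arraycopy(replacement, 0, stored, 0, replacement.length);
        return before;
    }

    byte[] contents() {
        return stored.clone();
    }
}
```

Because the length of the data never changes, nothing after the first n bytes 
has to move, which is what makes this the simplest of the four variants.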

> Save useful length information for Clobs in store
> -
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
>  Issue Type: Improvement
>  Components: JDBC, Store
>Affects Versions: 10.5.0.0
>Reporter: Kristian Waagan
>Assignee: Kristian Waagan
>
> The store should save useful length information for Clobs. This allows the 
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also 
> contains some background information: 
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be 
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2008-11-05 Thread Mike Matrigali (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Matrigali updated DERBY-3907:
--


> I'm a bit unsure how to handle the format id issue.
>
> It is not clear to me what the differences between the various clob fields in
> StoredFormatIds are:
> - CLOB_TYPE_ID
> - CLOB_COMPILATION_TYPE_ID
> - CLOB_TYPE_ID_IMPL
> - SQL_CLOB_ID
For the reading and writing of the data to disk the SQL_CLOB_ID is the one.
It is associated with the SQLClob class.

The SQL layer when it creates a table gives the format id of each of the
columns.  This format id is used by store to tell what object it should
create to interpret the bytes on disk.  So say you create a SQL_CLOB_VER2_ID,
associate it with SQLClob2, and work the code such that new db's and/or
hard upgraded db's always use the new class/id when creating new tables.
Then Store will use SQLClob2 to read data into, and count on
SQLClob2.readExternal
SQLClob2.readExternalFromArray
SQLClob2.writeExternal

to read and write your new format.

Note that for the current SQLClob these are all inherited from SQLChar.

In general what has been done in the past is to switch the format ids in the
new code such that the "current" version has the existing name, and that
the old version has the new name.  So SQLClob would get the new version id,
and a new class would get the old version id, something like SQLClob_10_4.java.

For an example check out:
java/engine/org/apache/derby/impl/store/access/btree/index/B2I.java
java/engine/org/apache/derby/impl/store/access/btree/index/B2I_10_3.java
java/engine/org/apache/derby/impl/store/access/btree/index/B2I_v10_2.java
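
A rough sketch of that naming scheme, to show the idea only. The id values, 
the StoredColumn interface and the registry are all invented for the example; 
the nested classes are stand-ins and not Derby's real SQLClob.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch of mapping format ids to versioned classes. */
final class FormatIdSketch {
    // Made-up values; the real ids live in StoredFormatIds.
    static final int OLD_CLOB_ID = 1;
    static final int NEW_CLOB_ID = 2;

    interface StoredColumn {
        String headerFormat();
    }

    /** The "current" class keeps the existing name and gets the new id. */
    static final class SQLClob implements StoredColumn {
        public String headerFormat() { return "new"; }
    }

    /** The old on-disk format moves to a versioned class, keeping the old id. */
    static final class SQLClob_10_4 implements StoredColumn {
        public String headerFormat() { return "old"; }
    }

    private static final Map<Integer, Supplier<StoredColumn>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put(NEW_CLOB_ID, SQLClob::new);
        REGISTRY.put(OLD_CLOB_ID, SQLClob_10_4::new);
    }

    /** What store would do with the format id recorded for a column. */
    static StoredColumn newInstanceFor(int formatId) {
        Supplier<StoredColumn> supplier = REGISTRY.get(formatId);
        if (supplier == null) {
            throw new IllegalArgumentException("unknown format id " + formatId);
        }
        return supplier.get();
    }
}
```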


Again this is only talking about the issue of reading/writing disk at store
level.  There may be other code with stream dependencies that thinks it
"knows" the format of the stream.  The way this kind of thing has been handled
in the past is to make the runtime version of the object always look like the
"current" format.  For past upgrades this has been easy as it usually is just
another field in the object that one can pick a default for if it is an old
version.  In this way it isolates the upgrade code to just the to/from
disk part of the code.  With streams this may be more challenging.

>
> In the current implementation, I think I need to access this information up to
> the JDBC level, i.e. in classes like StoreStreamClob, SQLClob and possibly a
> ResultSet class.
> To me it seems I need two versions of the SQL_CLOB_ID, but I'm not sure.
> Can anyone with more knowledge about the type system guide me?
>
> I do think we need a new format id, because the current format (two bytes that
> can have any values) makes it hard to add another stream header format.




[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2008-10-10 Thread Mike Matrigali (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Matrigali updated DERBY-3907:
--


I will post some more comments on your proposal, but need some clarification.

What does the following mean?  Will the changes apply to all sql which inserts 
clobs, or to only particular jdbc interfaces?
1) Clob modifications are done on a copy (i.e. TemporaryClob).

What is the expected call sequence to store, and what is the goal performance 
characteristic?  I.e., for an insert of a non-lengthed clob is it something like:
o insert the non-lengthed clob into store, calculating the length along the way
o update the clob in store, changing N leading bytes
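
For illustration, the two-step sequence asked about above could look roughly 
like this. It is a sketch under assumptions: the four-byte header is 
hypothetical, standard UTF-8 stands in for whatever encoding store uses, and a 
byte array stands in for the stored value.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: insert with a placeholder header, then patch it. */
final class LengthlessInsertSketch {
    static final int HEADER_SIZE = 4; // made-up header size

    static byte[] insertThenPatch(Reader source) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // Step 1: write a placeholder header, then stream the data,
            // counting characters along the way.
            out.write(new byte[HEADER_SIZE], 0, HEADER_SIZE);
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            char[] buf = new char[1024];
            int charCount = 0;
            int n;
            while ((n = source.read(buf)) != -1) {
                writer.write(buf, 0, n);
                charCount += n;
            }
            writer.flush();
            byte[] stored = out.toByteArray();
            // Step 2: update the N leading bytes now that the length is known.
            stored[0] = (byte) (charCount >>> 24);
            stored[1] = (byte) (charCount >>> 16);
            stored[2] = (byte) (charCount >>> 8);
            stored[3] = (byte) charCount;
            return stored;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The second step rewrites only the header, so its cost does not depend on the 
size of the clob, provided store can update leading bytes in place.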






[jira] Updated: (DERBY-3907) Save useful length information for Clobs in store

2008-10-10 Thread Mike Matrigali (JIRA)

 [ 
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Matrigali updated DERBY-3907:
--


upgrade comments:
o In keeping with the current Derby goals I believe any change of format should 
   continue to support the hard/soft upgrade paradigm.  Given that, any change 
   will need to have code that supports both the old and new format.

o So then the decision is when to use the new format.  At a high level (though 
   not sure all are feasible) the options are:
   1) per value: each clob in an existing table may be in either format
   2) only new tables, in new or hard upgraded db's, support the new format
   3) only newly created DB's support the new format.

1) Offhand I don't think we have enough per data value information to tell an 
   old format clob from a new format clob, but maybe someone else can come up 
   with a magic sequence of initial bytes.  It would be nice if the new format 
   made it possible for other future optimizations to be handled on a per row 
   basis, with some sort of format byte.
2) This option seems the most flexible and readily implemented.  By creating a 
   new format id for the new type of clob, the code can easily tell the 
   difference between a table of old clobs and a table of new clobs.  If someone 
   really wants to data upgrade an old db, then an offline compress in a hard 
   upgraded db should automatically do it, with the added benefit of 
   defragmenting and compressing the tables.
3) I think this is sort of like #2 and is not really any less code.

o Also there is a decision on what to do at hard upgrade time.  Either the 
   newly hard upgraded db supports both formats or one has to do a per row data 
   upgrade.  No Derby release has done a per row data upgrade (and it was never 
   done before the code was donated to Apache either).  My opinion would be to 
   continue to support both formats.
