[ 
https://issues.apache.org/jira/browse/CASSANDRA-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839479#action_12839479
 ] 

Jonathan Ellis edited comment on CASSANDRA-836 at 2/28/10 7:20 PM:
-------------------------------------------------------------------

it's not a bug, because we never change the size of the bitset.

if you want to add an assertion to that effect, fine, but making the 
serialization handle a situation that would be a horrible bug is bad design.
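
For illustration, an assertion of the kind mentioned above might look roughly like the following. This is only a sketch, not the actual CommitLogSegment code: the class name HeaderRewriteCheck, the method seekAndWriteHeader, and the expectedLength parameter are hypothetical, and a plain RandomAccessFile stands in for the real log writer. It throws explicitly instead of using the `assert` keyword so the check also runs when JVM assertions are disabled.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical sketch; not the actual CommitLogSegment implementation. */
public class HeaderRewriteCheck {

    /**
     * Rewrites the header at offset 0 and seeks back to the previous position.
     * Refuses to write if the header length differs from the length that was
     * originally written, since a longer header would clobber the data segment.
     */
    public static void seekAndWriteHeader(RandomAccessFile log, byte[] header, int expectedLength)
            throws IOException {
        if (header.length != expectedLength) {
            throw new AssertionError("header size changed: expected " + expectedLength
                    + " bytes, got " + header.length);
        }
        long currentPos = log.getFilePointer();
        log.seek(0);
        log.write(header);
        log.seek(currentPos);
    }
}
```

With a check like this, a change in the serialized header size fails loudly at the rewrite instead of silently corrupting the data that follows the header.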

      was (Author: jbellis):
    it's not a bug, because we never change the size of the bitset
  
> CommitLogSegment::seekAndWriteCommitLogHeader assumes header size doesn't 
> change.
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-836
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-836
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: n/a - all
>            Reporter: Ross M
>            Priority: Minor
>         Attachments: BitSetSerializer.java
>
>
> CommitLogSegment::seekAndWriteCommitLogHeader assumes header size doesn't 
> grow. there are pieces of the header (BitSet) that are serialized with java 
> serialization which makes no such promises. 
> the following code:
>     /** writes header at the beginning of the file, then seeks back to current position */
>     void seekAndWriteCommitLogHeader(byte[] bytes) throws IOException
>     {
>         long currentPos = logWriter.getFilePointer();
>         logWriter.seek(0);
>         writeCommitLogHeader(bytes);
>         logWriter.seek(currentPos);
>     }
> works fine as long as the header size doesn't change, but if it grows the new 
> header will overwrite the beginning of the data segment. the bit-set being 
> written in the header happens to serialize to the same size, but there is no 
> guarantee of this.
> i found this when looking at optimizing the serialization of data to disk 
> (thus improving write throughput/performance). i removed the 
> ObjectOutputStream serialization in BitSetSerializer and replaced it with a 
> custom serialization that omits the generic java 
> serialization/ObjectOutputStream stuff and writes only the "true" bits. 
> the custom serialization worked fine, but broke other parts of the code: when 
> the header bitset had new bits turned on, the header grew and data segment 
> bytes were overwritten.
> the serialized version of a BitSet can grow in a similar manner; no promises 
> of size/consistency are made, but with current use it luckily doesn't seem to 
> happen.
> a good fix is unclear. without forcing the header to be a fixed/constant size 
> in some manner this problem could pop up at any point. it's generally not 
> safe to rewrite headers like this without custom code that ensures the size 
> doesn't change. one fix would be to manually write all of the header data out 
> (rather than relying on java serialization and serialization code in other 
> parts of cassandra not to change.) another might be to pad the size of the 
> header so that the data inside can grow, but that seems fraught with 
> (potential) problems. (i've played around with padding the header length, but 
> that seems to cause other things to break, which i haven't been able to track 
> down yet.)
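
One way to picture the "write the header data out manually" option from the description is a BitSet encoding whose length depends only on a fixed capacity, never on which bits are set. The following is a minimal sketch under stated assumptions, not the attached BitSetSerializer.java: the class name FixedWidthBitSet and the capacityBits parameter are hypothetical, and it relies on BitSet.toByteArray() and BitSet.valueOf(), which exist only in Java 7 and later.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch of a constant-size BitSet encoding. */
public class FixedWidthBitSet {

    /**
     * Serializes the bit set into exactly ceil(capacityBits / 8) bytes,
     * zero-padded, so the output length is the same no matter which bits are set.
     */
    public static byte[] serialize(BitSet bits, int capacityBits) {
        if (bits.length() > capacityBits) {
            throw new IllegalArgumentException("bit " + (bits.length() - 1)
                    + " is set but capacity is " + capacityBits + " bits");
        }
        int numBytes = (capacityBits + 7) / 8;
        // toByteArray() returns only as many bytes as the highest set bit needs;
        // copyOf pads the remainder with zeros up to the fixed width.
        return Arrays.copyOf(bits.toByteArray(), numBytes);
    }

    /** Reads back a bit set written by serialize(); trailing zero bytes are harmless. */
    public static BitSet deserialize(byte[] bytes) {
        return BitSet.valueOf(bytes);
    }
}
```

Because the output length is fixed by capacityBits, turning on additional bits cannot grow the header, and a bit beyond the capacity is rejected up front instead of overwriting the data segment.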

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
