[
https://issues.apache.org/jira/browse/CASSANDRA-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839705#action_12839705
]
Jonathan Ellis commented on CASSANDRA-836:
------------------------------------------
That's only for bitsets where the size hasn't been explicitly requested (ours
always is).
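
The comment above appears to refer to how java.util.BitSet serializes itself. On OpenJDK-derived runtimes, a BitSet created with an explicit size keeps its backing array untrimmed when written, while one created with the default constructor is trimmed to the words in use. The sketch below is one way to check this on a given JVM. The class and method names are made up for illustration and are not part of Cassandra, and the behaviour is an implementation detail rather than a documented guarantee.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.BitSet;

public class BitSetSizeDemo {
    /** Returns the number of bytes Java serialization produces for the given BitSet. */
    public static int serializedSize(BitSet bits) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(bits);
        }
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        // Size requested explicitly: both sets are expected to serialize to the same length.
        BitSet sizedEmpty = new BitSet(128);
        BitSet sizedSet = new BitSet(128);
        sizedSet.set(100);
        System.out.println("explicit size: " + serializedSize(sizedEmpty)
                + " vs " + serializedSize(sizedSet));

        // Default constructor: the backing array is trimmed on write,
        // so turning a bit on is expected to change the serialized length.
        BitSet defaultEmpty = new BitSet();
        BitSet defaultSet = new BitSet();
        defaultSet.set(0);
        System.out.println("default size:  " + serializedSize(defaultEmpty)
                + " vs " + serializedSize(defaultSet));
    }
}
```

Note that setting a bit beyond the requested size grows the set and, as far as I can tell, drops the fixed-size behaviour, so this only holds while every index stays within the size passed to the constructor.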
> CommitLogSegment::seekAndWriteCommitLogHeader assumes header size doesn't
> change.
> ---------------------------------------------------------------------------------
>
> Key: CASSANDRA-836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-836
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: n/a - all
> Reporter: Ross M
> Priority: Minor
> Attachments: BitSetSerializer.java
>
>
> CommitLogSegment::seekAndWriteCommitLogHeader assumes header size doesn't
> grow. there are pieces of the header (BitSet) that are serialized with java
> serialization which makes no such promises.
> the following code:
> /** writes header at the beginning of the file, then seeks back to current position */
> void seekAndWriteCommitLogHeader(byte[] bytes) throws IOException
> {
>     long currentPos = logWriter.getFilePointer();
>     logWriter.seek(0);
>     writeCommitLogHeader(bytes);
>     logWriter.seek(currentPos);
> }
> works fine as long as the header size doesn't change, but if it grows the new
> header will overwrite the beginning of the data segment. the bit-set being
> written in the header happens to serialize to the same size, but there is no
> guarantee of this.
> i found this when looking at optimizing the serialization of data to disk
> (thus improving write throughput/performance). i removed the
> ObjectOutputStream serialization in BitSetSerializer and replaced it with a
> custom serialization that omits the generic java
> serialization/ObjectOutputStream stuff and writes only the "true" bits.
> the custom serialization worked fine, but broke other parts of the code when
> the header bitset had new bits turned on: the header grew, and data segment
> bytes were overwritten.
> the serialized version of a BitSet can grow in a similar manner. no promises
> of size/consistency are made, but with current use it luckily doesn't seem to
> happen.
> a good fix is unclear. without forcing the header to be a fixed/constant size
> in some manner this problem could pop up at any point. it's generally not
> safe to rewrite headers like this without custom code that ensures the size
> doesn't change. one fix would be to manually write all of the header data out
> (rather than relying on java serialization and serialization code in other
> parts of cassandra not to change.) another might be to pad the size of the
> header so that the data inside can grow, but that seems fraught with
> (potential) problems. (i've played around with padding the header length, but
> that seems to cause other things to break, which i haven't been able to track
> down yet.)
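
As a rough illustration of the "fixed/constant size" idea in the description above, the sketch below reserves a fixed region at the start of the file and refuses to write a header that would not fit, instead of silently overwriting data. It is not the Cassandra implementation: FixedHeaderSketch, HEADER_CAPACITY, and the length-prefixed, zero-padded layout are all assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class FixedHeaderSketch {
    /** Bytes reserved at the start of the file for the header (hypothetical value). */
    public static final int HEADER_CAPACITY = 64;

    /**
     * Rewrites the header inside its reserved region, then restores the file pointer.
     * Assumed layout: 4-byte payload length, then the payload zero-padded to HEADER_CAPACITY.
     */
    public static void seekAndWriteHeader(RandomAccessFile file, byte[] header) throws IOException {
        if (header.length > HEADER_CAPACITY - 4) {
            throw new IOException("header of " + header.length + " bytes exceeds the reserved region");
        }
        byte[] region = new byte[HEADER_CAPACITY - 4];
        System.arraycopy(header, 0, region, 0, header.length);

        long currentPos = file.getFilePointer();
        file.seek(0);
        file.writeInt(header.length);
        file.write(region);
        // Never leave the pointer inside the header region, even on the first write.
        file.seek(Math.max(currentPos, HEADER_CAPACITY));
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("commitlog-sketch", ".log");
        temp.deleteOnExit();
        try (RandomAccessFile file = new RandomAccessFile(temp, "rw")) {
            seekAndWriteHeader(file, new byte[] {1});
            file.write(new byte[] {9, 8, 7});                    // data segment
            seekAndWriteHeader(file, new byte[] {1, 2, 3, 4, 5}); // larger header, same region
            file.seek(HEADER_CAPACITY);
            System.out.println("first data byte after rewrite: " + file.read());
        }
    }
}
```

The trade-off is that the header can never outgrow the reserved region, so the capacity has to be chosen with the largest expected header in mind, and anything that reads the file must know to skip the padding.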