Sylvain Lebresne created CASSANDRA-5454:
-------------------------------------------
Summary: Changing column_index_size_in_kb on different nodes might
corrupt files
Key: CASSANDRA-5454
URL: https://issues.apache.org/jira/browse/CASSANDRA-5454
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 2.0
RangeTombstones require that we sometimes repeat a few markers in the data
file at index boundaries. This means that the same row written with different
column_index_size_in_kb values will not have the same data size.
This is a problem for streaming: if column_index_size_in_kb differs between
the source and the destination, the resulting row should have a different size
on the destination, but in 1.2 streaming relies on the data size not changing.
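As a rough illustration of the two paragraphs above, the following Python
sketch models a row as a sequence of atoms and repeats every still-open range
tombstone marker whenever a new index block starts. It is a toy model under
stated assumptions, not Cassandra's on-disk format: the Atom class, the
serialized_size function and all byte sizes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical stand-in for a column or a range tombstone marker."""
    name: int               # position in the row's sort order
    size: int               # serialized size in bytes (made-up numbers)
    covers_until: int = -1  # last name covered by a marker; -1 for a plain column


def serialized_size(atoms, index_block_bytes):
    """Bytes written for a row when open markers are repeated at each index boundary."""
    total = 0
    block_bytes = 0
    open_markers = []
    for atom in atoms:
        # Forget markers whose range ended before this atom.
        open_markers = [m for m in open_markers if m.covers_until >= atom.name]
        if block_bytes >= index_block_bytes:
            # A new index block starts here: repeat every marker that is still open.
            block_bytes = 0
            for marker in open_markers:
                total += marker.size
                block_bytes += marker.size
        total += atom.size
        block_bytes += atom.size
        if atom.covers_until >= 0:
            open_markers.append(atom)
    return total


# One range tombstone covering the whole row, followed by 99 plain columns.
row = [Atom(0, 20, covers_until=99)] + [Atom(i, 100) for i in range(1, 100)]

size_on_source = serialized_size(row, 4 * 1024)  # source uses a larger index block
size_on_destination = serialized_size(row, 1024)  # destination uses a smaller one
print(size_on_source, size_on_destination)
```

In this model the smaller index block produces more boundaries, hence more
repeated markers and a larger row, so a destination that assumes the source's
data size would read or write the wrong number of bytes.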
Now, while having different column_index_size_in_kb values on different nodes
is probably not very useful in the long run, you may still have temporary
discrepancies because there is no real way to change the setting on all nodes
atomically. Besides, it's not too hard to end up with different settings on
different nodes due to human error. And currently, the result is that if a file
is streamed while the setting is not consistent, we'll end up corrupting the
received file (due to the fix from CASSANDRA-5418, to be precise).
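Since the risk comes from nodes disagreeing on the setting, one possible
precaution is to compare the value across nodes before any streaming
operation. The sketch below is only an assumption about how such a check could
look: it takes the text of each node's cassandra.yaml (how that text is
collected is left out), and the function names and node names are hypothetical.

```python
import re

# Matches an uncommented "column_index_size_in_kb: <int>" line in cassandra.yaml text.
_SETTING = re.compile(
    r"^[ \t]*column_index_size_in_kb[ \t]*:[ \t]*(\d+)[ \t]*(?:#.*)?$",
    re.MULTILINE,
)


def index_sizes(configs):
    """Map each node name to its column_index_size_in_kb value, or None if unset."""
    sizes = {}
    for node, yaml_text in configs.items():
        match = _SETTING.search(yaml_text)
        # None means the node relies on the default; this sketch treats that
        # as a distinct value rather than guessing what the default is.
        sizes[node] = int(match.group(1)) if match else None
    return sizes


def is_consistent(configs):
    """True when every node reports the same value."""
    return len(set(index_sizes(configs).values())) <= 1


# Hypothetical configuration snippets for two nodes.
configs = {
    "node1": "cluster_name: test\ncolumn_index_size_in_kb: 64\n",
    "node2": "cluster_name: test\ncolumn_index_size_in_kb: 32  # changed by hand\n",
}

if not is_consistent(configs):
    print("column_index_size_in_kb differs:", index_sizes(configs))
```

A check like this only narrows the window; it does not make the change atomic
across nodes.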
I don't see a good way to fix this in 1.2, so users will have to be careful not
to have streaming happening while they change the column_index_size_in_kb
setting. But in 2.0, once CASSANDRA-4180 is committed, we won't have the
problem of having to respect the dataSize from the source on the destination
anymore. So basically we should then revert the fix from CASSANDRA-5418 (though
we may still want to avoid repeating unneeded markers, which the
tombstoneTracker can give us easily).