[
https://issues.apache.org/jira/browse/CASSANDRA-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038408#comment-15038408
]
Ariel Weisberg commented on CASSANDRA-10593:
--------------------------------------------
There seem to be two blockers. The first is that archiving existing segments at
startup fails if they have already been archived. The second is that, even with
segment recycling disabled, we recycle segments after commit log replay instead
of deleting them. Because those segments were hard linked into the archive
directory, recycling also modifies the header of the archived file: its name and
header no longer match, and the contents of the archived segment are blown away.
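
To illustrate the second blocker, here is a minimal sketch in Python, not
Cassandra code. It assumes a made-up segment layout in which the first line of
the file stands in for the header and holds the segment id; the helper names
(write_segment, recycle_in_place, demo) and the directory names are
hypothetical. Only the file name and the two ids are taken from the log in the
report. It shows why an in-place rewrite of a live segment also changes an
archive made with a hard link, but not one made with a copy.

```python
import os
import re
import shutil
import tempfile

# Hypothetical, simplified layout: line 1 is the "header" and stores the id.
NAME_RE = re.compile(r"CommitLog-(\d+)-(\d+)\.log$")


def id_from_name(path):
    """Segment id encoded in the file name."""
    return int(NAME_RE.search(os.path.basename(path)).group(2))


def id_from_header(path):
    """Segment id stored in the first line of the file."""
    with open(path, "rb") as f:
        return int(f.readline().strip())


def write_segment(path, segment_id, payload):
    with open(path, "wb") as f:
        f.write(b"%d\n" % segment_id)
        f.write(payload)


def recycle_in_place(path, new_id):
    """Stand-in for recycling: rewrite the header of the SAME file (same
    inode) and discard everything after it."""
    with open(path, "r+b") as f:
        f.write(b"%d\n" % new_id)
        f.truncate()


def demo(archive_with):
    """Archive a segment with archive_with(src, dst), recycle the live file,
    then report what the archived file looks like."""
    with tempfile.TemporaryDirectory() as root:
        live_dir = os.path.join(root, "commitlog")
        archive_dir = os.path.join(root, "backup")
        os.mkdir(live_dir)
        os.mkdir(archive_dir)

        name = "CommitLog-4-1445876822565.log"
        live = os.path.join(live_dir, name)
        archived = os.path.join(archive_dir, name)

        write_segment(live, 1445876822565, b"mutations")
        archive_with(live, archived)            # ln-style or cp-style archive
        recycle_in_place(live, 1445878452545)   # touches only the live path

        return (id_from_name(archived),
                id_from_header(archived),
                os.path.getsize(archived))


if __name__ == "__main__":
    print("hard link:", demo(os.link))
    print("copy:     ", demo(shutil.copyfile))
```

With os.link the archived file reports the new header id under the old name
and has lost its payload, because both names refer to the same inode. With
shutil.copyfile the archived file keeps its original header and contents.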
> Unintended interactions between commitlog archiving and commitlog recycling
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-10593
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10593
> Project: Cassandra
> Issue Type: Bug
> Reporter: J.B. Langston
> Assignee: Ariel Weisberg
> Attachments: cassandra.yaml, commitlog_archiving.properties,
> system.log
>
>
> Currently the comments in commitlog_archiving.properties suggest using either
> cp or ln for the archive_command.
> Using ln is problematic because commitlog recycling marks segments as
> recycled once the corresponding memtables are flushed and Cassandra will no
> longer replay them. This means it's only possible to do PITR on any records
> that were written since the last flush.
> Using cp works, and this is currently how OpsCenter does PITR. However,
> [~brandon.williams] has pointed out that this could have some performance
> impact because of the additional I/O overhead of copying the commitlog segments.
> Starting in 2.1, we can disable commit log recycling in cassandra.yaml so I
> thought this would allow me to do PITR without the extra overhead of using
> cp. However, when I disable commitlog recycling and try to do a PITR,
> Cassandra blows up when trying to replay the restored commit logs:
> {code}
> ERROR 16:56:42 Exception encountered during startup
> java.lang.IllegalStateException: Cannot safely construct descriptor for
> segment, as name and header descriptors do not match ((4,1445878452545) vs
> (4,1445876822565)): /opt/dse/backup/CommitLog-4-1445876822565.log
> at
> org.apache.cassandra.db.commitlog.CommitLogArchiver.maybeRestoreArchive(CommitLogArchiver.java:207)
> ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
> at
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:116)
> ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:352)
> ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
> at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:335)
> ~[dse-core-4.8.0.jar:4.8.0]
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:537)
> ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
> at com.datastax.bdp.DseModule.main(DseModule.java:75)
> [dse-core-4.8.0.jar:4.8.0]
> java.lang.IllegalStateException: Cannot safely construct descriptor for
> segment, as name and header descriptors do not match ((4,1445878452545) vs
> (4,1445876822565)): /opt/dse/backup/CommitLog-4-1445876822565.log
> at
> org.apache.cassandra.db.commitlog.CommitLogArchiver.maybeRestoreArchive(CommitLogArchiver.java:207)
> at
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:116)
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:352)
> at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:335)
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:537)
> at com.datastax.bdp.DseModule.main(DseModule.java:75)
> Exception encountered during startup: Cannot safely construct descriptor for
> segment, as name and header descriptors do not match ((4,1445878452545) vs
> (4,1445876822565)): /opt/dse/backup/CommitLog-4-1445876822565.log
> INFO 16:56:42 DSE shutting down...
> INFO 16:56:42 All plugins are stopped.
> ERROR 16:56:42 Exception in thread Thread[Thread-2,5,main]
> java.lang.AssertionError: null
> at
> org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1403)
> ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
> at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:196)
> ~[dse-core-4.8.0.jar:4.8.0]
> at com.datastax.bdp.server.DseDaemon.preStop(DseDaemon.java:426)
> ~[dse-core-4.8.0.jar:4.8.0]
> at com.datastax.bdp.server.DseDaemon.safeStop(DseDaemon.java:436)
> ~[dse-core-4.8.0.jar:4.8.0]
> at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:676)
> ~[dse-core-4.8.0.jar:4.8.0]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_31]
> {code}
> For the sake of completeness, I also tested using cp for the archive_command
> with commitlog recycling disabled, and PITR works as expected, but this of
> course defeats the point.
> It would be good to have some guidance on what is supported here. If ln isn't
> expected to work at all, it shouldn't be documented as an acceptable option
> for the archive_command in commitlog_archiving.properties. If it should work
> with commitlog recycling disabled, the bug causing the IllegalStateException
> needs to be fixed.
> It would also be good to do some testing and quantify the performance impact
> of enabling commitlog archiving using cp as the archive_command.
> I realize there are several different issues described here, so maybe they
> should be separate JIRAs, but first I wanted to just clarify whether we want
> to support ln at all, and we can go from there.