[jira] [Commented] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15616594#comment-15616594 ]

Arvind Nithrakashyap commented on CASSANDRA-12539:
--------------------------------------------------

There seems to be a very easy way to reproduce this condition, so Cassandra should be able to handle this case without manual intervention at each occurrence, especially when it is deployed in an enterprise setting where services are expected to auto-heal.

> Empty CommitLog prevents restart
> --------------------------------
>
>                 Key: CASSANDRA-12539
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12539
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefano Ortolani
>
> A node just crashed (known cause: CASSANDRA-11594) but to my surprise (unlike other times) restarting simply fails.
> Checking the logs showed:
> {noformat}
> ERROR [main] 2016-08-25 17:05:22,611 JVMStabilityInspector.java:82 - Exiting due to error while processing commit log during initialization.
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Could not read commit log descriptor in file /data/cassandra/commitlog/CommitLog-6-1468235564433.log
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:650) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:327) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:148) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:289) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) [apache-cassandra-3.0.8.jar:3.0.8]
> INFO  [main] 2016-08-25 17:08:56,944 YamlConfigurationLoader.java:85 - Configuration location: file:/etc/cassandra/cassandra.yaml
> {noformat}
> Deleting the empty file fixes the problem.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614667#comment-15614667 ]

Arvind Nithrakashyap edited comment on CASSANDRA-12539 at 10/28/16 7:54 AM:
----------------------------------------------------------------------------

It looks like a crash while writing the logfile can lead to a zero byte file.

The following patch, which causes a crash after creating the buffer but before writing the header, produces a commit log full of zeros. This leads to the error described above when it tries to replay the commit log.

{noformat}
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
index 0a03c3c..20cddf8 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
@@ -153,6 +153,7 @@ public abstract class CommitLogSegment
         descriptor = new CommitLogDescriptor(id, commitLog.configuration.getCompressorClass());
         logFile = new File(commitLog.location, descriptor.fileName());
 
+logger.error("location="+descriptor.fileName());
         try
         {
             channel = FileChannel.open(logFile.toPath(), StandardOpenOption.WRITE, StandardOpenOption.READ, StandardOpenOption.CREATE);
@@ -164,6 +165,9 @@ public abstract class CommitLogSegment
         }
 
         buffer = createBuffer(commitLog);
+if (true) {
+  throw new IllegalArgumentException("Here!");
+}
         // write the header
         CommitLogDescriptor.writeHeader(buffer, descriptor);
         endOfBuffer = buffer.capacity();
{noformat}
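The crash point in the patch above can be imitated outside Cassandra. The following is only a minimal sketch, assuming the segment buffer is a memory-mapped region of the file (the comment does not say which buffer type was in use); the class name ZeroFilledSegmentDemo and the 4096-byte size are made up for illustration. It opens a file with the same options as the patched constructor, maps a buffer, and stops before any header is written.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-alone demo; this is not Cassandra code. */
class ZeroFilledSegmentDemo {

    /** Creates a file and maps a buffer over it, but never writes a header. */
    static Path createSegmentWithoutHeader(int segmentSize) {
        try {
            Path logFile = Files.createTempFile("CommitLog-demo-", ".log");
            try (FileChannel channel = FileChannel.open(logFile,
                    StandardOpenOption.WRITE, StandardOpenOption.READ, StandardOpenOption.CREATE)) {
                // Mapping a READ_WRITE region past the current end of the file
                // grows the file; the new region reads back as zeros.
                MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, segmentSize);
                // Simulated crash point: nothing is ever put into the buffer.
            }
            return logFile;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if every byte of the file is zero (also true for a zero-length file). */
    static boolean isAllZeros(Path file) {
        try {
            for (byte b : Files.readAllBytes(file)) {
                if (b != 0) {
                    return false;
                }
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path segment = createSegmentWithoutHeader(4096);
        System.out.println(segment + " length=" + segment.toFile().length()
                + " allZeros=" + isAllZeros(segment));
        segment.toFile().delete();
    }
}
```

If the buffer were not memory-mapped, the same crash point would presumably leave a zero-length file instead, which is the other symptom mentioned in this thread.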
[jira] [Commented] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614667#comment-15614667 ]

Arvind Nithrakashyap commented on CASSANDRA-12539:
--------------------------------------------------

It looks like a crash while writing the logfile can lead to a zero byte file.

The following patch, which causes a crash at a specific point, reliably produces a commit log full of zeros:

{noformat}
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
index 0a03c3c..20cddf8 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
@@ -153,6 +153,7 @@ public abstract class CommitLogSegment
         descriptor = new CommitLogDescriptor(id, commitLog.configuration.getCompressorClass());
         logFile = new File(commitLog.location, descriptor.fileName());
 
+logger.error("location="+descriptor.fileName());
         try
         {
             channel = FileChannel.open(logFile.toPath(), StandardOpenOption.WRITE, StandardOpenOption.READ, StandardOpenOption.CREATE);
@@ -164,6 +165,9 @@ public abstract class CommitLogSegment
         }
 
         buffer = createBuffer(commitLog);
+if (true) {
+  throw new IllegalArgumentException("Here!");
+}
         // write the header
         CommitLogDescriptor.writeHeader(buffer, descriptor);
         endOfBuffer = buffer.capacity();
{noformat}
[jira] [Commented] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614670#comment-15614670 ]

Arvind Nithrakashyap commented on CASSANDRA-12539:
--------------------------------------------------

The following patch seems to fix the issue:

{noformat}
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
index af8efb4..8f11d13 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
@@ -226,7 +226,7 @@ public class CommitLogReplayer
         {
             if (end != 0 || filecrc != 0)
             {
-handleReplayError(false,
+handleReplayError(tolerateTruncation,
                                   "Encountered bad header at position %d of commit log %s, with invalid CRC. " +
                                   "The end of segment marker should be zero.",
                                   offset, reader.getPath());
@@ -345,6 +345,14 @@ public class CommitLogReplayer
                     return;
                 }
 
+
+int currentFilePointer = (int) reader.getFilePointer();
+if (readSyncMarker(desc, currentFilePointer, reader, true) < 0) {
+logger.info("Skipping empty logfile {}", file.getName());
+return;
+}
+reader.seek(currentFilePointer);
+
                 final long segmentId = desc.id;
                 try
                 {
{noformat}
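The idea behind the fix above (skip a segment that holds no data instead of aborting startup) can be approximated with a stand-alone check. This is only a sketch and not the patch's logic, which reuses the replayer's own readSyncMarker. The class EmptySegmentCheck, its methods, and the headerBytes parameter are hypothetical, and "first bytes are all zero" is a heuristic assumed here for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; an approximation of "skip empty segments", not Cassandra code. */
class EmptySegmentCheck {

    /** True if the file is zero-length or its first headerBytes bytes are all zero. */
    static boolean looksEmpty(Path file, int headerBytes) {
        byte[] head = new byte[headerBytes];
        int n = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // Read up to headerBytes bytes; stop early at end of file.
            while (n < headerBytes) {
                int r = in.read(head, n, headerBytes - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (int i = 0; i < n; i++) {
            if (head[i] != 0) {
                return false;
            }
        }
        return true;
    }

    /** Returns the segments worth replaying, logging the ones that are skipped. */
    static List<Path> replayable(List<Path> segments, int headerBytes) {
        List<Path> result = new ArrayList<>();
        for (Path segment : segments) {
            if (looksEmpty(segment, headerBytes)) {
                System.out.println("Skipping empty logfile " + segment.getFileName());
            } else {
                result.add(segment);
            }
        }
        return result;
    }

    /** Demo-only helper: writes the given bytes to a temporary file. */
    static Path tempFileWith(byte[] content) {
        try {
            Path p = Files.createTempFile("CommitLog-demo-", ".log");
            Files.write(p, content);
            p.toFile().deleteOnExit();
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Path> segments = new ArrayList<>();
        segments.add(tempFileWith(new byte[0]));              // zero-length file
        segments.add(tempFileWith(new byte[64]));             // file full of zeros
        segments.add(tempFileWith(new byte[] {0, 0, 7, 0}));  // file with some data
        System.out.println("Replayable segments: " + replayable(segments, 16).size());
    }
}
```

A check like this only distinguishes "nothing was ever written" from "something was written"; it says nothing about whether a non-empty segment is valid, which remains the replayer's job.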
[jira] [Updated] (CASSANDRA-12778) Tombstones not being deleted when only_purge_repaired_tombstones is enabled
[ https://issues.apache.org/jira/browse/CASSANDRA-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Nithrakashyap updated CASSANDRA-12778:
---------------------------------------------

Description:
When we enable only_purge_repaired_tombstones for compaction, tombstones are no longer deleted.

{noformat}compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'only_purge_repaired_tombstones': 'true'}{noformat}

The root cause seems to be that repair itself issues a flush, which in turn creates a new sstable that is not in the repair set. This sstable appears to contain some old data; because of this, only tombstones older than that data's timestamp are getting deleted, even though many more keys have been repaired.

Fundamentally, it looks like flush and repair can race with each other. With leveled compaction, the flush creates a new sstable at level 0 and removes the older sstable (the one that is picked for repair). Since repair itself seems to issue multiple flushes, the level 0 sstable never gets repaired and hence its tombstones never get deleted.

We have already included the fix for CASSANDRA-12703 while testing.

> Tombstones not being deleted when only_purge_repaired_tombstones is enabled
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12778
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12778
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Arvind Nithrakashyap
>            Priority: Critical
[jira] [Updated] (CASSANDRA-12778) Tombstones not being deleted when only_purge_repaired_tombstones is enabled
[ https://issues.apache.org/jira/browse/CASSANDRA-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Nithrakashyap updated CASSANDRA-12778:
---------------------------------------------

Summary: Tombstones not being deleted when only_purge_repaired_tombstones is enabled (was: Tombstones not being Deleted when only_purge_repaired_tombstones is enabled)
[jira] [Created] (CASSANDRA-12778) Tombstones not being Deleted when only_purge_repaired_tombstones is enabled
Arvind Nithrakashyap created CASSANDRA-12778:
---------------------------------------------

             Summary: Tombstones not being Deleted when only_purge_repaired_tombstones is enabled
                 Key: CASSANDRA-12778
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12778
             Project: Cassandra
          Issue Type: Bug
            Reporter: Arvind Nithrakashyap
            Priority: Critical

When we enable only_purge_repaired_tombstones for compaction, tombstones are no longer deleted.

{noformat}compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'only_purge_repaired_tombstones': 'true'}{noformat}

The root cause seems to be that repair itself issues a flush, which in turn creates a new sstable that is not in the repair set. This sstable appears to contain some old data; because of this, only tombstones older than that data's timestamp are getting deleted, even though many more keys have been repaired.

Fundamentally, it looks like flush and repair can race with each other. With leveled compaction, the flush creates a new sstable at level 0 and removes the older sstable (the one that is picked for repair). Since repair itself seems to issue multiple flushes, the level 0 sstable never gets repaired and hence its tombstones never get deleted.

We have already included the fix for CASSANDRA-12703 while testing.
[jira] [Created] (CASSANDRA-7441) Deleting an element from a list in UPDATE does not work with IF condition
Arvind Nithrakashyap created CASSANDRA-7441:
--------------------------------------------

             Summary: Deleting an element from a list in UPDATE does not work with IF condition
                 Key: CASSANDRA-7441
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7441
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Arvind Nithrakashyap
            Priority: Critical
             Fix For: 2.0.7

Deleting an element from a list with an IF condition does not take effect, even though the response reports that the change was applied. Here's a reproducible test case:

{code}
cqlsh:casstest> create table foo(id text, values list<int>, condition int, primary key(id));
cqlsh:casstest> insert into foo(id, values, condition) values ('a', [1,2,3], 0);
cqlsh:casstest> select * from foo;

 id | condition | values
----+-----------+-----------
  a |         0 | [1, 2, 3]

(1 rows)

cqlsh:casstest> update foo set values = values - [3] where id = 'a' IF condition = 0;

 [applied]
-----------
      True

cqlsh:casstest> select * from foo;

 id | condition | values
----+-----------+-----------
  a |         0 | [1, 2, 3]

(1 rows)

cqlsh:casstest> update foo set values = values - [3] where id = 'a';
cqlsh:casstest> select * from foo;

 id | condition | values
----+-----------+--------
  a |         0 | [1, 2]

(1 rows)
{code}

Addition seems to work, though:

{code}
cqlsh:casstest> update foo set values = values + [3] where id = 'a' IF condition = 0;

 [applied]
-----------
      True

cqlsh:casstest> select * from foo;

 id | condition | values
----+-----------+-----------
  a |         0 | [1, 2, 3]

(1 rows)
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
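For reference, the behaviour the reporter expects from the conditional update can be written down as a small in-memory model. This is a sketch of the expected semantics only, not of how Cassandra executes lightweight transactions; the class ConditionalListRemoval and its Row type are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory model of the test case; not Cassandra or driver code. */
class ConditionalListRemoval {

    /** Toy row holding the two non-key columns of table foo. */
    static final class Row {
        final List<Integer> values;
        final int condition;

        Row(List<Integer> values, int condition) {
            this.values = new ArrayList<>(values);
            this.condition = condition;
        }
    }

    /**
     * Models "UPDATE foo SET values = values - toRemove WHERE id = ... IF condition = expected".
     * Returns the value that would be shown in the [applied] column.
     */
    static boolean removeIf(Row row, List<Integer> toRemove, int expected) {
        if (row.condition != expected) {
            return false; // condition not met: the row must stay untouched
        }
        row.values.removeAll(toRemove); // list subtraction drops every occurrence
        return true;
    }

    public static void main(String[] args) {
        Row row = new Row(Arrays.asList(1, 2, 3), 0);
        boolean applied = removeIf(row, Arrays.asList(3), 0);
        System.out.println("[applied] " + applied + ", values = " + row.values);
    }
}
```

In this model a true [applied] result always comes with the element removed, which is the invariant the test case above shows being violated.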