Hi all,

I wanted to report back on this thread with a bit of a "win".  While investigating
ways to mitigate the "corruption on hard shutdown" issue, we came across
the Group Commitlog feature that was added in 4.0 (
https://issues.apache.org/jira/browse/CASSANDRA-13530).  We backported and
enabled this feature with "commitlog_sync_group_window_in_ms: 2" and the
results are:
- As expected, IOPS on the commitlog drive dropped drastically and no
longer scaled with the number of writes.
- Write performance did not change significantly, and there was no impact
to our application (Cassandra write latency above 2ms did not seem to be a
bottleneck).
- We've had *zero* commitlog corruption errors since we rolled this out to
our fleet 6 months ago!! Previously, with batch commitlog sync, we saw 1-2
corruptions per month.
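
To make the first bullet concrete, here is a toy model in Python of why the
sync count stops tracking write volume under group mode.  It is a sketch, not
Cassandra's implementation: the function names, the one-fsync-per-write upper
bound for the old mode, and the fixed-window bucketing are all assumptions
made for illustration.

```python
# Toy model only: the names and the bucketing below are illustrative
# assumptions, not Cassandra's actual commitlog code.

def per_write_syncs(write_times_us):
    """Upper bound for a sync-on-demand policy: one fsync per write."""
    return len(write_times_us)


def group_syncs(write_times_us, window_us):
    """Group policy: at most one fsync per window that received a write."""
    if window_us <= 0:
        raise ValueError("window_us must be positive")
    return len({t // window_us for t in write_times_us})


# 100 ms of traffic with a 2 ms (2000 us) group window.
steady = range(0, 100_000, 100)   # one write every 100 us
doubled = range(0, 100_000, 50)   # twice the write rate

print(per_write_syncs(steady), group_syncs(steady, 2000))    # 1000 50
print(per_write_syncs(doubled), group_syncs(doubled, 2000))  # 2000 50
```

In this model, doubling the write rate doubles the per-write sync count but
leaves the group count at 50, which is the shape the first bullet describes.
Real batch mode can coalesce concurrent writes into one sync, so the per-write
figure is a ceiling rather than a measurement.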

Cheers,
Leon


On Tue, Aug 3, 2021 at 11:39 PM Leon Zaruvinsky <leonzaruvin...@gmail.com>
wrote:

> Following up, I've found that we tend to encounter one of three types of
> exceptions/commitlog corruptions:
>
> 1. org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
>    Mutation checksum failure at ... in CommitLog-5-1531150627243.log
>    at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638)
>
> 2. org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
>    Could not read commit log descriptor in file CommitLog-5-1550003067433.log
>    at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638)
>
> 3. org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
>    Encountered bad header at position ... of commit log
>    CommitLog-5-1603991140803.log, with invalid CRC. The end of segment marker
>    should be zero.
>    at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:647)
>
> I believe exception (2) is mitigated by
> https://issues.apache.org/jira/browse/CASSANDRA-11995 and
> https://issues.apache.org/jira/browse/CASSANDRA-13918
>
> But it's not clear to me how (1) and (3) can be mitigated.
>
> On Mon, Jul 26, 2021 at 6:40 PM Leon Zaruvinsky <leonzaruvin...@gmail.com>
> wrote:
>
>> Thanks for the links/comments Jeff and Bowen.
>>
>> We run xfs. Not sure that we can switch to zfs, so a different solution
>> would be preferred.
>>
>> I’ll take a look through that patch – maybe I’ll try to backport and
>> replicate.  We’ve seen both cases where the commitlog is just 0s (empty)
>> and where it has had real data in it.
>>
>> Leon
>>
>> On Mon, Jul 26, 2021 at 6:38 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> The commitlog code has changed DRASTICALLY between 2.x and trunk.
>>>
>>> If it's really a bunch of trailing 0s as was suggested later, then
>>> https://issues.apache.org/jira/browse/CASSANDRA-11995 addresses at
>>> least one cause/case of that particular bug.
>>>
>>>
>>>
>>> On Mon, Jul 26, 2021 at 3:11 PM Leon Zaruvinsky <
>>> leonzaruvin...@gmail.com> wrote:
>>>
>>>> And for completeness, a sample stack trace:
>>>>
>>>> ERROR [2021-07-21T02:11:01.994Z] org.apache.cassandra.db.commitlog.CommitLog: Failed commit log replay. Commit disk failure policy is stop_on_startup; terminating thread (throwable0_message: Mutation checksum failure at 15167277 in CommitLog-5-1626828286977.log)
>>>> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Mutation checksum failure at 15167277 in CommitLog-5-1626828286977.log
>>>>    at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:647)
>>>>    at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:519)
>>>>    at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:401)
>>>>    at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143)
>>>>    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:175)
>>>>    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:155)
>>>>    at org.apache.cassandra.service.CassandraDaemon.recoverCommitlogAndCompleteSetup(CassandraDaemon.java:296)
>>>>    at org.apache.cassandra.service.CassandraDaemon.completeSetupMayThrowSstableException(CassandraDaemon.java:289)
>>>>    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:222)
>>>>    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:630)
>>>>    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:741)
>>>>
>>>>
>>>> On Mon, Jul 26, 2021 at 6:08 PM Leon Zaruvinsky <
>>>> leonzaruvin...@gmail.com> wrote:
>>>>
>>>>> Currently we're using batch commitlog sync:
>>>>>
>>>>>     commitlog_sync: batch
>>>>>     commitlog_sync_batch_window_in_ms: 2
>>>>>     commitlog_segment_size_in_mb: 32
>>>>>
>>>>> durable_writes is also true.
>>>>>
>>>>> Unfortunately we are still using Cassandra 2.2.x :( Though I'd be
>>>>> curious if much in this space has changed since then (I've looked through
>>>>> the changelogs and nothing stood out).
>>>>>
>>>>> On Mon, Jul 26, 2021 at 5:20 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>>>
>>>>>> What commitlog settings are you using?
>>>>>>
>>>>>> Default is periodic with 10s sync. That leaves you a 10s window on
>>>>>> hard poweroff/crash.
>>>>>>
>>>>>> I would also expect Cassandra to clean up and start cleanly. Which
>>>>>> version are you running?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 26, 2021 at 1:00 PM Leon Zaruvinsky <
>>>>>> leonzaruvin...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Cassandra community,
>>>>>>>
>>>>>>> We (and others) regularly run into commit log corruptions that are
>>>>>>> caused by Cassandra, or the underlying infrastructure, being hard
>>>>>>> restarted.  I suspect that this is because it happens in the middle of a
>>>>>>> commitlog file write to disk.
>>>>>>>
>>>>>>> Could anyone point me at resources / code to understand why this is
>>>>>>> happening?  Shouldn't Cassandra hold off on acking writes until the
>>>>>>> commitlog is safely written to disk?  I would expect that on startup,
>>>>>>> Cassandra should be able to clean up bad commitlog files and recover
>>>>>>> gracefully.
>>>>>>>
>>>>>>> I've seen various references online to this issue as something that
>>>>>>> will be fixed in the future - so I'm curious if there is any movement or
>>>>>>> thoughts there.
>>>>>>>
>>>>>>> Thanks a bunch,
>>>>>>> Leon
>>>>>>>
>>>>>>
