The dmesg command will usually show information about hardware errors.
An example from a spinning disk:
sd 0:0:10:0: [sdi] Unhandled sense code
sd 0:0:10:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:10:0: [sdi] Sense Key : Medium Error [current]
Info fld=0x6fc72
sd 0:0:10:0: [sdi] Add. Sense: Unrecovered read error
sd 0:0:10:0: [sdi] CDB: Read(10): 28 00 00 06 fc 70 00 00 08 00
Also, you can read the file directly, e.g. "cat
/data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big-Data.db > /dev/null".
If you get an I/O error, it's probably a hardware issue.
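To sweep a whole instance rather than one file, a rough loop like the following
works (just a sketch; adjust the glob to match your own data directories):
for f in /data/ssd*/data/*/*/*-Data.db; do
    cat "$f" > /dev/null || echo "read error: $f"
done
Any file that fails here is worth cross-checking against dmesg.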
- Erik -
________________________________
From: Philip Ó Condúin <[email protected]>
Sent: Thursday, August 8, 2019 09:58
To: [email protected] <[email protected]>
Subject: Re: Datafile Corruption
Hi Jon,
Good question, I'm not sure if we're using NVMe, I don't see /dev/nvme but we
could still be using it.
We're using Cisco UCS C220 M4 SFF, so I'm just going to check the spec.
Our kernel is the following; we're using Red Hat, so I'm told we can't upgrade
the version until the next major release anyway.
root@cass 0 17:32:28 ~ # uname -r
3.10.0-957.5.1.el7.x86_64
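For what it's worth, a quick way to check is something like this (a sketch;
device names will differ):
lsblk -d -o NAME,ROTA,MODEL      # ROTA=0 means SSD/NVMe, ROTA=1 means spinning disk
ls /dev/nvme* 2>/dev/null || echo "no NVMe devices present"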
Cheers,
Phil
On Thu, 8 Aug 2019 at 17:35, Jon Haddad
<[email protected]> wrote:
Any chance you're using NVMe with an older Linux kernel? I've seen a *lot* of
filesystem errors from using older CentOS versions. You'll want to be using a
version > 4.15.
On Thu, Aug 8, 2019 at 9:31 AM Philip Ó Condúin
<[email protected]> wrote:
@Jeff - If it was hardware that would explain it all, but do you think it's
possible to have every server in the cluster with a hardware issue?
The data is sensitive and the customer would lose their mind if I sent it
off-site, which is a pity because I could really do with the help.
The corruption is occurring irregularly on every server and instance and column
family in the cluster. Out of 72 instances, we are getting maybe 10 corrupt
files per day.
We are using vnodes (256) and it is happening in both DC's
@Asad - internode compression is set to ALL on every server. I have checked
the packets for the private interconnect and I can't see any dropped packets.
There are dropped packets for other interfaces, but not for the private ones;
I will get the network team to double-check this.
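For reference, the checks amount to something along these lines (the interface
name is only an example; ours differ):
ip -s link show dev eth1                   # per-interface RX/TX errors and drops
ethtool -S eth1 | grep -iE 'drop|err'      # NIC-level counters, if the driver exposes them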
The corruption is only on the application schema; we are not getting corruption
on any system or cass keyspaces. Corruption is happening in both DC's. We are
getting corruption for the 1 application schema we have, across all tables in
the keyspace; it's not limited to one table.
I'm not sure why the app team decided not to use the default compression; I
must ask them.
I have been checking /var/log/messages today, going back a few weeks, and can
see a serious number of broken pipe errors across all servers and instances.
Here is a snippet from one server, but most of the pipe errors are similar (a
quick way to tally them is sketched after the snippet):
Jul 9 03:00:08 cassandra: INFO 02:00:08 Writing
Memtable-sstable_activity@1126262628(43.631KiB serialized bytes, 18072 ops,
0%/0% of on/off-heap limit)
Jul 9 03:00:13 kernel: fnic_handle_fip_timer: 8 callbacks suppressed
Jul 9 03:00:19 kernel: fnic_handle_fip_timer: 8 callbacks suppressed
Jul 9 03:00:22 cassandra: ERROR 02:00:22 Got an IOException during write!
Jul 9 03:00:22 cassandra: java.io.IOException: Broken pipe
Jul 9 03:00:22 cassandra: at sun.nio.ch.FileDispatcherImpl.write0(Native
Method) ~[na:1.8.0_172]
Jul 9 03:00:22 cassandra: at
sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_172]
Jul 9 03:00:22 cassandra: at
sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_172]
Jul 9 03:00:22 cassandra: at sun.nio.ch.IOUtil.write(IOUtil.java:65)
~[na:1.8.0_172]
Jul 9 03:00:22 cassandra: at
sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_172]
Jul 9 03:00:22 cassandra: at
org.apache.thrift.transport.TNonblockingSocket.write(TNonblockingSocket.java:165)
~[libthrift-0.9.2.jar:0.9.2]
Jul 9 03:00:22 cassandra: at
com.thinkaurelius.thrift.util.mem.Buffer.writeTo(Buffer.java:104)
~[thrift-server-0.3.7.jar:na]
Jul 9 03:00:22 cassandra: at
com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.streamTo(FastMemoryOutputTransport.java:112)
~[thrift-server-0.3.7.jar:na]
Jul 9 03:00:22 cassandra: at
com.thinkaurelius.thrift.Message.write(Message.java:222)
~[thrift-server-0.3.7.jar:na]
Jul 9 03:00:22 cassandra: at
com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.handleWrite(TDisruptorServer.java:598)
[thrift-server-0.3.7.jar:na]
Jul 9 03:00:22 cassandra: at
com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.processKey(TDisruptorServer.java:569)
[thrift-server-0.3.7.jar:na]
Jul 9 03:00:22 cassandra: at
com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.select(TDisruptorServer.java:423)
[thrift-server-0.3.7.jar:na]
Jul 9 03:00:22 cassandra: at
com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.run(TDisruptorServer.java:383)
[thrift-server-0.3.7.jar:na]
Jul 9 03:00:25 kernel: fnic_handle_fip_timer: 8 callbacks suppressed
Jul 9 03:00:30 cassandra: ERROR 02:00:30 Got an IOException during write!
Jul 9 03:00:30 cassandra: java.io.IOException: Broken pipe
Jul 9 03:00:30 cassandra: at sun.nio.ch.FileDispatcherImpl.write0(Native
Method) ~[na:1.8.0_172]
Jul 9 03:00:30 cassandra: at
sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_172]
Jul 9 03:00:30 cassandra: at
sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_172]
Jul 9 03:00:30 cassandra: at sun.nio.ch.IOUtil.write(IOUtil.java:65)
~[na:1.8.0_172]
Jul 9 03:00:30 cassandra: at
sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_172]
Jul 9 03:00:30 cassandra: at
org.apache.thrift.transport.TNonblockingSocket.write(TNonblockingSocket.java:165)
~[libthrift-0.9.2.jar:0.9.2]
Jul 9 03:00:30 cassandra: at
com.thinkaurelius.thrift.util.mem.Buffer.writeTo(Buffer.java:104)
~[thrift-server-0.3.7.jar:na]
Jul 9 03:00:30 cassandra: at
com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.streamTo(FastMemoryOutputTransport.java:112)
~[thrift-server-0.3.7.jar:na]
Jul 9 03:00:30 cassandra: at
com.thinkaurelius.thrift.Message.write(Message.java:222)
~[thrift-server-0.3.7.jar:na]
Jul 9 03:00:30 cassandra: at
com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.handleWrite(TDisruptorServer.java:598)
[thrift-server-0.3.7.jar:na]
Jul 9 03:00:30 cassandra: at
com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.processKey(TDisruptorServer.java:569)
[thrift-server-0.3.7.jar:na]
Jul 9 03:00:30 cassandra: at
com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.select(TDisruptorServer.java:423)
[thrift-server-0.3.7.jar:na]
Jul 9 03:00:30 cassandra: at
com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.run(TDisruptorServer.java:383)
[thrift-server-0.3.7.jar:na]
Jul 9 03:00:31 kernel: fnic_handle_fip_timer: 8 callbacks suppressed
Jul 9 03:00:37 kernel: fnic_handle_fip_timer: 8 callbacks suppressed
Jul 9 03:00:43 kernel: fnic_handle_fip_timer: 8 callbacks suppressed
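For the record, tallying the broken pipe errors is just a grep over the logs,
roughly like this (a sketch; path and timestamp format as in the snippet above):
grep 'java.io.IOException: Broken pipe' /var/log/messages | awk '{print $1, $2}' | sort | uniq -c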
On Thu, 8 Aug 2019 at 15:42, ZAIDI, ASAD A
<[email protected]> wrote:
Did you check if packets are NOT being dropped for the network interfaces the
Cassandra instances are consuming (ifconfig -a)? Internode compression is set
for all endpoints – maybe the network is playing a role here?
Is this corruption limited to certain keyspaces/tables or DCs, or is it
widespread? From the log snippet you shared it looked like only a specific
keyspace/table is affected – is that correct?
When you remove a corrupted sstable of a certain table, I guess you verify all
nodes for corrupted sstables of the same table (maybe with the nodetool scrub
tool) to limit the spread of corruption – right?
Just curious to know – you're not using the lz4/default compressor for all
tables; there must be some reason for it.
From: Philip Ó Condúin
[mailto:[email protected]]
Sent: Thursday, August 08, 2019 6:20 AM
To: [email protected]
Subject: Re: Datafile Corruption
Hi All,
Thank you so much for the replies.
Currently, I have the following list that can potentially cause some sort of
corruption in a Cassandra cluster.
* Sudden Power cut - We have had no power cuts in the datacenters
* Network Issues - no network issues from what I can tell
* Disk full - I don't think this is an issue for us, see disks below.
* An issue in the Cassandra version, like CASSANDRA-13752 - couldn't find any
Jira issues similar to ours.
* Bit Flips - we have compression enabled so I don't think this should be
an issue.
* Repair during upgrade has caused corruption too - we have not upgraded
* Dropping and adding columns with the same name but a different type - I
will need to ask the apps team how they are using the database.
OK, let me try to explain the issue we are having. I am under a lot of
pressure from above to get this fixed and I can't figure it out.
This is a PRE-PROD environment.
* 2 datacenters.
* 9 physical servers in each datacenter
* 4 Cassandra instances on each server
* 72 Cassandra instances across the 2 data centres, 36 in site A, 36 in
site B.
We also have 2 Reaper nodes we use for repair: one Reaper node in each
datacenter, each running with its own Cassandra back end in a cluster together.
OS Details [Red Hat Linux]
cass_a@x 0 10:53:01 ~ $ uname -a
Linux x 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018 x86_64
x86_64 x86_64 GNU/Linux
cass_a@x 0 10:57:31 ~ $ cat /etc/*release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
Storage Layout
cass_a@xx 0 10:46:28 ~ $ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg01-lv_root 20G 2.2G 18G 11% /
devtmpfs 63G 0 63G 0% /dev
tmpfs 63G 0 63G 0% /dev/shm
tmpfs 63G 4.1G 59G 7% /run
tmpfs 63G 0 63G 0% /sys/fs/cgroup
>> 4 cassandra instances
/dev/sdd 1.5T 802G 688G 54% /data/ssd4
/dev/sda 1.5T 798G 692G 54% /data/ssd1
/dev/sdb 1.5T 681G 810G 46% /data/ssd2
/dev/sdc 1.5T 558G 932G 38% /data/ssd3
/data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big-Data.db
Cassandra load is about 200GB and the rest of the space is snapshots
CPU
cass_a@x 127 10:58:47 ~ $ lscpu | grep -E '^Thread|^Core|^Socket|^CPU\('
CPU(s): 64
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 2
Description of problem:
During repair of the cluster, we are seeing multiple corruptions in the log
files on a lot of instances. There seems to be no pattern to the corruption.
It seems that the repair job is finding all the corrupted files for us. The
repair will hang on the node where the corrupted file is found. To fix this we
remove/rename the datafile and bounce the Cassandra instance. Our hardware/OS
team have stated there is no problem on their side. I do not believe it is the
repair causing the corruption.
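For clarity, the per-incident fix amounts to roughly the following (only a
sketch; the sstable generation, quarantine path and service name are examples
based on the case described further down):
# move every component of the corrupted sstable out of the data directory
mv /data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big-* /data/quarantine/
# bounce the affected instance
systemctl restart cassmeta-cass_b.service
# re-sync the data that lived in that sstable from the other replicas
nodetool repair KeyspaceMetadata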
We have maintenance scripts that run every night, running compactions and
creating snapshots. I decided to turn these off, fix any corruptions we
currently had, and run a major compaction on all nodes; once this was done we
had a "clean" cluster and we left it for a few days. After that we noticed one
corruption in the cluster, and this datafile was created after I turned off the
maintenance scripts, so my theory of the scripts causing the issue was wrong.
We then kicked off another repair and started to find more corrupt files
created after the maintenance scripts were turned off.
So let me give you an example of a corrupted file, and maybe someone might be
able to work through it with me.
When this corrupted file was reported in the log, it looks like it was the
repair that found it:
$ journalctl -u cassmeta-cass_b.service --since "2019-08-07 22:25:00" --until
"2019-08-07 22:45:00"
Aug 07 22:30:33 cassandra[34611]: INFO 21:30:33 Writing
Memtable-compactions_in_progress@830377457(0.008KiB
serialized bytes, 1 ops, 0%/0% of on/off-heap limit)
Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Failed creating a merkle tree
for [repair #9587a200-b95a-11e9-8920-9f72868b8375 on KeyspaceMetadata/x,
(-1476350953672479093,-1474461
Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Exception in thread
Thread[ValidationExecutor:825,1,main]
Aug 07 22:30:33 cassandra[34611]: org.apache.cassandra.io.FSReadError:
org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted:
/x/ssd2/data/KeyspaceMetadata/x-1e453cb0
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:365)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:361)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:340)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:382)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:366)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:81)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
~[guava-16.0.jar:na]
Aug 07 22:30:33 cassandra[34611]: at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
~[guava-16.0.jar:na]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:169)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
~[guava-16.0.jar:na]
Aug 07 22:30:33 cassandra[34611]: at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
~[guava-16.0.jar:na]
Aug 07 22:30:33 cassandra[34611]: at
com.google.common.collect.Iterators$7.computeNext(Iterators.java:645)
~[guava-16.0.jar:na]
Aug 07 22:30:33 cassandra[34611]: at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
~[guava-16.0.jar:na]
Aug 07 22:30:33 cassandra[34611]: at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
~[guava-16.0.jar:na]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(ColumnIndex.java:174)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.compaction.LazilyCompactedRow.update(LazilyCompactedRow.java:187)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.repair.Validator.rowHash(Validator.java:201)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.repair.Validator.add(Validator.java:150)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1166)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:76)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.db.compaction.CompactionManager$10.call(CompactionManager.java:736)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_172]
Aug 07 22:30:33 cassandra[34611]: at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
~[na:1.8.0_172]
Aug 07 22:30:33 cassandra[34611]: at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[na:1.8.0_172]
Aug 07 22:30:33 cassandra[34611]: at java.lang.Thread.run(Thread.java:748)
[na:1.8.0_172]
Aug 07 22:30:33 cassandra[34611]: Caused by:
org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted:
/data/ssd2/data/KeyspaceMetadata/x-x/l
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBufferMmap(CompressedRandomAccessReader.java:216)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:226)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.io.compress.CompressedThrottledReader.reBuffer(CompressedThrottledReader.java:42)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:352)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: ... 27 common frames omitted
Aug 07 22:30:33 cassandra[34611]: Caused by:
org.apache.cassandra.io.compress.CorruptBlockException:
(/data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big
Aug 07 22:30:33 cassandra[34611]: at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBufferMmap(CompressedRandomAccessReader.java:185)
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: ... 30 common frames omitted
Aug 07 22:30:33 cassandra[34611]: INFO 21:30:33 Not a global repair, will not
do anticompaction
Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Stopping gossiper
Aug 07 22:30:33 cassandra[34611]: WARN 21:30:33 Stopping gossip by operator
request
Aug 07 22:30:33 cassandra[34611]: INFO 21:30:33 Announcing shutdown
Aug 07 22:30:33 cassandra[34611]: INFO 21:30:33 Node
/10.2.57.37
state jump to shutdown
So I went to the file system to see when this corrupt file was created, and it
was created on July 30th at 15:55:
root@x 0 01:14:03 ~ # ls -l
/data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big-Data.db
-rw-r--r-- 1 cass_b cass_b 3182243670 Jul 30 15:55
/data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big-Data.db
So I checked /var/log/messages for errors around that time.
The only thing that stands out to me is the message "Cannot perform a full
major compaction as repaired and unrepaired sstables cannot be compacted
together"; I'm not sure if this would be an issue, though, or cause corruption.
Jul 30 15:55:06 x systemd: Created slice User Slice of root.
Jul 30 15:55:06 x systemd: Started Session c165280 of user root.
Jul 30 15:55:06 x audispd: node=x. type=USER_START
msg=audit(1564498506.021:457933): pid=17533 uid=0 auid=4294967295
ses=4294967295 msg='op=PAM:session_open
grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_tty_audit,pam_systemd,pam_unix
acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
Jul 30 15:55:06 x systemd: Removed slice User Slice of root.
Jul 30 15:55:14 x tag_audit_log: type=USER_CMD
msg=audit(1564498506.013:457932): pid=17533 uid=509 auid=4294967295
ses=4294967295 msg='cwd="/"
cmd=2F7573722F7362696E2F69706D692D73656E736F7273202D2D71756965742D6361636865202D2D7364722D63616368652D7265637265617465202D2D696E746572707265742D6F656D2D64617461202D2D6F75747075742D73656E736F722D7374617465202D2D69676E6F72652D6E6F742D617661696C61626C652D73656E736F7273202D2D6F75747075742D73656E736F722D7468726573686F6C6473
terminal=? res=success'
Jul 30 15:55:14 x tag_audit_log: type=USER_START
msg=audit(1564498506.021:457933): pid=17533 uid=0 auid=4294967295
ses=4294967295 msg='op=PAM:session_open
grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_tty_audit,pam_systemd,pam_unix
acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
Jul 30 15:55:19 x cassandra: INFO 14:55:19 Writing
Memtable-compactions_in_progress@1462227999(0.008KiB
serialized bytes, 1 ops, 0%/0% of on/off-heap limit)
Jul 30 15:55:19 x cassandra: INFO 14:55:19 Cannot perform a full major
compaction as repaired and unrepaired sstables cannot be compacted together.
These two set of sstables will be compacted separately.
Jul 30 15:55:19 x cassandra: INFO 14:55:19 Writing
Memtable-compactions_in_progress@1198535528(1.002KiB
serialized bytes, 57 ops, 0%/0% of on/off-heap limit)
Jul 30 15:55:20 x cassandra: INFO 14:55:20 Writing
Memtable-compactions_in_progress@2039409834(0.008KiB
serialized bytes, 1 ops, 0%/0% of on/off-heap limit)
Jul 30 15:55:24 x audispd: node=x. type=USER_LOGOUT
msg=audit(1564498524.409:457934): pid=46620 uid=0 auid=464400029 ses=2747
msg='op=login id=464400029 exe="/usr/sbin/sshd" hostname=? addr=?
terminal=/dev/pts/0 res=success'
Jul 30 15:55:24 x audispd: node=x. type=USER_LOGOUT
msg=audit(1564498524.409:457935): pid=4878 uid=0 auid=464400029 ses=2749
msg='op=login id=464400029 exe="/usr/sbin/sshd" hostname=? addr=?
terminal=/dev/pts/1 res=success'
Jul 30 15:55:57 x systemd: Created slice User Slice of root.
Jul 30 15:55:57 x systemd: Started Session c165288 of user root.
Jul 30 15:55:57 x audispd: node=x. type=USER_START
msg=audit(1564498557.294:457958): pid=19687 uid=0 auid=4294967295
ses=4294967295 msg='op=PAM:session_open
grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_tty_audit,pam_systemd,pam_unix
acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
Jul 30 15:55:57 x audispd: node=x. type=USER_START
msg=audit(1564498557.298:457959): pid=19690 uid=0 auid=4294967295
ses=4294967295 msg='op=PAM:session_open
grantors=pam_keyinit,pam_systemd,pam_keyinit,pam_limits,pam_unix acct="cass_b"
exe="/usr/sbin/runuser" hostname=? addr=? terminal=? res=success'
Jul 30 15:55:58 x systemd: Removed slice User Slice of root.
Jul 30 15:56:02 x cassandra: INFO 14:56:02 Writing
Memtable-compactions_in_progress@1532791194(0.008KiB
serialized bytes, 1 ops, 0%/0% of on/off-heap limit)
Jul 30 15:56:02 x cassandra: INFO 14:56:02 Cannot perform a full major
compaction as repaired and unrepaired sstables cannot be compacted together.
These two set of sstables will be compacted separately.
Jul 30 15:56:02 x cassandra: INFO 14:56:02 Writing
Memtable-compactions_in_progress@1455399453(0.281KiB
serialized bytes, 16 ops, 0%/0% of on/off-heap limit)
Jul 30 15:56:04 x tag_audit_log: type=USER_CMD
msg=audit(1564498555.190:457951): pid=19294 uid=509 auid=4294967295
ses=4294967295 msg='cwd="/"
cmd=72756E75736572202D73202F62696E2F62617368202D6C20636173735F62202D632063617373616E6472612D6D6574612F63617373616E6472612F62696E2F6E6F6465746F6F6C2074707374617473
terminal=? res=success'
We have checked a number of other things, like NTP settings, but nothing is
telling us what could cause so many corruptions across the entire cluster.
Things were healthy with this cluster for months; the only thing I can think of
is that we started loading data, going from about 20GB per instance up to the
200GB where it sits now, and maybe this just highlighted the issue.
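(For completeness, the NTP check is nothing more exotic than something along
these lines; chronyc applies if chrony rather than ntpd is in use:)
ntpq -p              # peer list, offsets and jitter under ntpd
chronyc tracking     # equivalent status under chrony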
Compaction and Compression on Keyspace CFs [mixture]
All CF's are using compression.
AND compaction = {'min_threshold': '4', 'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32'}
AND compression = {'sstable_compression':
'org.apache.cassandra.io.compress.SnappyCompressor'}
AND compaction = {'min_threshold': '4', 'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32'}
AND compression = {'sstable_compression':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND compaction = {'class':
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
AND compression = {'sstable_compression':
'org.apache.cassandra.io.compress.SnappyCompressor'}
--We are also using internode network compression:
internode_compression: all
Does anyone have any idea what I should check next?
Our next theory is that there may be an issue with checksums, but I'm not sure
where to go with this.
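One avenue, if I remember right, is that Cassandra 2.2 ships tools that check
sstable checksums directly; something like this per node (keyspace/table names
are placeholders):
nodetool verify KeyspaceMetadata <table>     # online; -e does an extended verify
sstableverify KeyspaceMetadata <table>       # standalone offline tool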
Any help would be very much appreciated before I lose the last bit of hair I
have on my head.
Kind Regards,
Phil
On Wed, 7 Aug 2019 at 20:51, Nitan Kainth
<[email protected]> wrote:
Repair during upgrade has caused corruption too.
Also, dropping and adding columns with the same name but a different type.
Regards,
Nitan
Cell: 510 449 9629
On Aug 7, 2019, at 2:42 PM, Jeff Jirsa
<[email protected]> wrote:
Is compression enabled?
If not, bit flips on disk can corrupt data files and reads + repair may send
that corruption to other hosts in the cluster
On Aug 7, 2019, at 3:46 AM, Philip Ó Condúin
<[email protected]> wrote:
Hi All,
I am currently experiencing multiple datafile corruptions across most nodes in
my cluster; there seems to be no pattern to the corruption. I'm starting to
think it might be a bug; we're using Cassandra 2.2.13.
Without going into detail about the issue I just want to confirm something.
Can someone share with me a list of scenarios that would cause corruption?
1. OS failure
2. Cassandra disturbed during a write
etc., etc.
I need to investigate each scenario and don't want to leave any out.
--
Regards,
Phil