Any chance you're using NVMe with an older Linux kernel?  I've seen a *lot*
of filesystem errors from older CentOS versions.  You'll want to be on a
kernel newer than 4.15.
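
For what it's worth, a quick way to confirm what every node is actually
running (a minimal sketch; the hostnames are placeholders):

# print the running kernel release per node; anything that sorts before
# 4.15 with `sort -V` is on the suspect side
for h in node01 node02 node03; do
  printf '%s: %s\n' "$h" "$(ssh "$h" uname -r)"
done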

On Thu, Aug 8, 2019 at 9:31 AM Philip Ó Condúin <philipocond...@gmail.com>
wrote:

> *@Jeff *- If it were hardware, that would explain it all, but do you think
> it's possible for every server in the cluster to have a hardware issue?
> The data is sensitive and the customer would lose their mind if I sent it
> off-site, which is a pity because I could really do with the help.
> The corruption is occurring irregularly on every server, instance and
> column family in the cluster.  Out of 72 instances, we are getting maybe 10
> corrupt files per day.
> We are using vnodes (256) and it is happening in both DCs.
>
> *@Asad *- internode compression is set to ALL on every server.  I have
> checked the packets for the private interconnect and I can't see any
> dropped packets; there are dropped packets on other interfaces, but not on
> the private ones.  I will get the network team to double-check this.
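>
> For reference, this is roughly how I'm checking the drop counters (a
> minimal sketch; the interface name is a placeholder for our private
> interconnect):
>
> # per-interface RX/TX statistics, including dropped-packet counters
> ip -s link show dev <private_iface>
> # or the summary you mentioned:
> ifconfig <private_iface>
>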
> The corruption is only on the application schema; we are not getting
> corruption on any system or Cassandra keyspaces.  Corruption is happening
> in both DCs.  We are getting corruption for the one application schema we
> have, across all tables in the keyspace; it's not limited to one table.
> I'm not sure why the app team decided not to use the default compression,
> I must ask them.
>
>
>
> I have been checking /var/log/messages today, going back a few weeks,
> and I can see a serious amount of broken-pipe errors across all servers
> and instances.
> Here is a snippet from one server; most of the pipe errors look similar:
>
> Jul  9 03:00:08  cassandra: INFO  02:00:08 Writing
> Memtable-sstable_activity@1126262628(43.631KiB serialized bytes, 18072
> ops, 0%/0% of on/off-heap limit)
> Jul  9 03:00:13  kernel: fnic_handle_fip_timer: 8 callbacks suppressed
> Jul  9 03:00:19  kernel: fnic_handle_fip_timer: 8 callbacks suppressed
> Jul  9 03:00:22  cassandra: ERROR 02:00:22 Got an IOException during write!
> Jul  9 03:00:22  cassandra: java.io.IOException: Broken pipe
> Jul  9 03:00:22  cassandra: at sun.nio.ch.FileDispatcherImpl.write0(Native
> Method) ~[na:1.8.0_172]
> Jul  9 03:00:22  cassandra: at
> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_172]
> Jul  9 03:00:22  cassandra: at
> sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_172]
> Jul  9 03:00:22  cassandra: at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> ~[na:1.8.0_172]
> Jul  9 03:00:22  cassandra: at
> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
> ~[na:1.8.0_172]
> Jul  9 03:00:22  cassandra: at
> org.apache.thrift.transport.TNonblockingSocket.write(TNonblockingSocket.java:165)
> ~[libthrift-0.9.2.jar:0.9.2]
> Jul  9 03:00:22  cassandra: at
> com.thinkaurelius.thrift.util.mem.Buffer.writeTo(Buffer.java:104)
> ~[thrift-server-0.3.7.jar:na]
> Jul  9 03:00:22  cassandra: at
> com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.streamTo(FastMemoryOutputTransport.java:112)
> ~[thrift-server-0.3.7.jar:na]
> Jul  9 03:00:22  cassandra: at
> com.thinkaurelius.thrift.Message.write(Message.java:222)
> ~[thrift-server-0.3.7.jar:na]
> Jul  9 03:00:22  cassandra: at
> com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.handleWrite(TDisruptorServer.java:598)
> [thrift-server-0.3.7.jar:na]
> Jul  9 03:00:22  cassandra: at
> com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.processKey(TDisruptorServer.java:569)
> [thrift-server-0.3.7.jar:na]
> Jul  9 03:00:22  cassandra: at
> com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.select(TDisruptorServer.java:423)
> [thrift-server-0.3.7.jar:na]
> Jul  9 03:00:22  cassandra: at
> com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.run(TDisruptorServer.java:383)
> [thrift-server-0.3.7.jar:na]
> Jul  9 03:00:25  kernel: fnic_handle_fip_timer: 8 callbacks suppressed
> Jul  9 03:00:30  cassandra: ERROR 02:00:30 Got an IOException during write!
> Jul  9 03:00:30  cassandra: java.io.IOException: Broken pipe
> Jul  9 03:00:30  cassandra: at sun.nio.ch.FileDispatcherImpl.write0(Native
> Method) ~[na:1.8.0_172]
> Jul  9 03:00:30  cassandra: at
> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_172]
> Jul  9 03:00:30  cassandra: at
> sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_172]
> Jul  9 03:00:30  cassandra: at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> ~[na:1.8.0_172]
> Jul  9 03:00:30  cassandra: at
> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
> ~[na:1.8.0_172]
> Jul  9 03:00:30  cassandra: at
> org.apache.thrift.transport.TNonblockingSocket.write(TNonblockingSocket.java:165)
> ~[libthrift-0.9.2.jar:0.9.2]
> Jul  9 03:00:30  cassandra: at
> com.thinkaurelius.thrift.util.mem.Buffer.writeTo(Buffer.java:104)
> ~[thrift-server-0.3.7.jar:na]
> Jul  9 03:00:30  cassandra: at
> com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.streamTo(FastMemoryOutputTransport.java:112)
> ~[thrift-server-0.3.7.jar:na]
> Jul  9 03:00:30  cassandra: at
> com.thinkaurelius.thrift.Message.write(Message.java:222)
> ~[thrift-server-0.3.7.jar:na]
> Jul  9 03:00:30  cassandra: at
> com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.handleWrite(TDisruptorServer.java:598)
> [thrift-server-0.3.7.jar:na]
> Jul  9 03:00:30  cassandra: at
> com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.processKey(TDisruptorServer.java:569)
> [thrift-server-0.3.7.jar:na]
> Jul  9 03:00:30  cassandra: at
> com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.select(TDisruptorServer.java:423)
> [thrift-server-0.3.7.jar:na]
> Jul  9 03:00:30  cassandra: at
> com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.run(TDisruptorServer.java:383)
> [thrift-server-0.3.7.jar:na]
> Jul  9 03:00:31  kernel: fnic_handle_fip_timer: 8 callbacks suppressed
> Jul  9 03:00:37  kernel: fnic_handle_fip_timer: 8 callbacks suppressed
> Jul  9 03:00:43  kernel: fnic_handle_fip_timer: 8 callbacks suppressed
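>
> (For what it's worth, this is roughly how I'm counting these across the
> logs; a minimal sketch, with the path and unit name from our setup:)
>
> # broken-pipe write failures per day, from the syslog copy of the logs
> grep 'java.io.IOException: Broken pipe' /var/log/messages | awk '{print $1, $2}' | sort | uniq -c
> # the same per instance via journald
> journalctl -u cassmeta-cass_b.service --since "2019-07-09" | grep -c 'Broken pipe'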
>
>
>
> On Thu, 8 Aug 2019 at 15:42, ZAIDI, ASAD A <az1...@att.com> wrote:
>
>> Did you check that packets are NOT being dropped on the network interfaces
>> the Cassandra instances are using (ifconfig -a)?  Internode compression is
>> set for all endpoints; maybe the network is playing a role here?
>>
>> Is this corruption limited to certain keyspaces/tables or DCs, or is it
>> widespread?  The log snippet you shared made it look like only a specific
>> keyspace/table is affected; is that correct?
>>
>> When you remove the corrupted sstable of a certain table, I guess you
>> verify all nodes for corrupted sstables of the same table (maybe with
>> nodetool scrub) so as to limit the spread of corruption, right?
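>>
>> (A minimal sketch of that per-table check, for illustration; keyspace and
>> table names are placeholders since the real names are redacted here:)
>>
>> # online: rewrite the sstables of one table, skipping rows it cannot read
>> nodetool scrub <keyspace> <table>
>> # offline alternative, run while the instance is stopped
>> sstablescrub <keyspace> <table>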
>>
>> Just curious to know: you're not using the lz4/default compressor for all
>> tables; there must be some reason for that.
>>
>>
>>
>>
>>
>>
>>
>> *From:* Philip Ó Condúin [mailto:philipocond...@gmail.com]
>> *Sent:* Thursday, August 08, 2019 6:20 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Datafile Corruption
>>
>>
>>
>> Hi All,
>>
>> Thank you so much for the replies.
>>
>> Currently, I have the following list that can potentially cause some sort
>> of corruption in a Cassandra cluster.
>>
>>    - Sudden Power cut  -  *We have had no power cuts in the datacenters*
>>    - Network Issues - *no network issues from what I can tell*
>>    - Disk full - *I don't think this is an issue for us, see disks
>>    below.*
>>    - A bug in the Cassandra version, like CASSANDRA-13752 - *couldn't find
>>    any Jira issues similar to ours.*
>>    - Bit flips - *we have compression enabled so I don't think this should
>>    be an issue; see the verification sketch after this list.*
>>    - Repair during upgrade has caused corruption too -* we have not
>>    upgraded*
>>    - Dropping and adding columns with the same name but a different type
>>    - *I will need to ask the apps team how they are using the database.*
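>>
>> (A minimal sketch of how the sstable checksums can be verified proactively,
>> rather than waiting for a repair to trip over a corrupt block; keyspace and
>> table names are placeholders:)
>>
>> # online: verify sstable checksums for one table on this instance (2.2+)
>> nodetool verify <keyspace> <table>
>> # offline equivalent, run while the instance is stopped
>> sstableverify <keyspace> <table>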
>>
>>
>>
>> Ok, let me try to explain the issue we are having.  I am under a lot of
>> pressure from above to get this fixed and I can't figure it out.
>>
>> This is a PRE-PROD environment.
>>
>>    - 2 datacenters.
>>    - 9 physical servers in each datacenter
>>    - 4 Cassandra instances on each server
>>    - 72 Cassandra instances across the 2 data centres, 36 in site A, 36
>>    in site B.
>>
>>
>> We also have 2 Reaper nodes that we use for repair, one in each
>> datacenter, each running with its own Cassandra backend; the two backends
>> form a cluster together.
>>
>> OS Details [Red Hat Linux]
>> cass_a@x 0 10:53:01 ~ $ uname -a
>> Linux x 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> cass_a@x 0 10:57:31 ~ $ cat /etc/*release
>> NAME="Red Hat Enterprise Linux Server"
>> VERSION="7.6 (Maipo)"
>> ID="rhel"
>>
>> Storage Layout
>> cass_a@xx 0 10:46:28 ~ $ df -h
>> Filesystem                         Size  Used Avail Use% Mounted on
>> /dev/mapper/vg01-lv_root            20G  2.2G   18G  11% /
>> devtmpfs                            63G     0   63G   0% /dev
>> tmpfs                               63G     0   63G   0% /dev/shm
>> tmpfs                               63G  4.1G   59G   7% /run
>> tmpfs                               63G     0   63G   0% /sys/fs/cgroup
>> >> 4 cassandra instances
>> /dev/sdd                           1.5T  802G  688G  54% /data/ssd4
>> /dev/sda                           1.5T  798G  692G  54% /data/ssd1
>> /dev/sdb                           1.5T  681G  810G  46% /data/ssd2
>> /dev/sdc                           1.5T  558G  932G  38% /data/ssd3
>>
>> Cassandra load is about 200GB and the rest of the space is snapshots
>>
>> CPU
>> cass_a@x 127 10:58:47 ~ $ lscpu | grep -E '^Thread|^Core|^Socket|^CPU\('
>> CPU(s):                64
>> Thread(s) per core:    2
>> Core(s) per socket:    16
>> Socket(s):             2
>>
>> *Description of problem:*
>> During repair of the cluster, we are seeing multiple corruptions in the
>> log files on a lot of instances, and there seems to be no pattern to the
>> corruption.  It seems that the repair job is finding all the corrupted
>> files for us; the repair will hang on the node where the corrupted file is
>> found.  To fix this we remove/rename the datafile and bounce the Cassandra
>> instance (roughly as sketched below).  Our hardware/OS team have stated
>> there is no problem on their side, and I do not believe it is the repair
>> causing the corruption.
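>>
>> (For reference, the remove-and-bounce step looks roughly like this; a
>> minimal sketch, with the unit name, data path and sstable generation taken
>> from the example further down, and the quarantine directory made up for
>> illustration:)
>>
>> systemctl stop cassmeta-cass_b.service
>> # move every component file of the corrupt sstable generation out of the data dir
>> mkdir -p /data/ssd2/quarantine
>> mv /data/ssd2/data/KeyspaceMetadata/<table_dir>/lb-26203-big-* /data/ssd2/quarantine/
>> systemctl start cassmeta-cass_b.service
>> # then scrub/repair that table so the removed data gets re-replicated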
>>
>> We have maintenance scripts that run every night, running compactions and
>> creating snapshots.  I decided to turn these off, fix any corruptions we
>> currently had, and run a major compaction on all nodes; once this was done
>> we had a "clean" cluster and we left it alone for a few days.  After that
>> we noticed one corruption in the cluster.  This datafile was created after
>> I turned off the maintenance scripts, so my theory that the scripts were
>> causing the issue was wrong.  We then kicked off another repair and started
>> to find more corrupt files created after the maintenance scripts were
>> turned off.
>>
>>
>> So let me give you an example of a corrupted file and maybe someone might
>> be able to work through it with me?
>>
>> When this corrupted file was reported in the log it looks like it was the
>> repair that found it.
>>
>> $ journalctl -u cassmeta-cass_b.service --since "2019-08-07 22:25:00"
>> --until "2019-08-07 22:45:00"
>>
>> Aug 07 22:30:33 cassandra[34611]: INFO  21:30:33 Writing
>> Memtable-compactions_in_progress@830377457(0.008KiB serialized bytes, 1
>> ops, 0%/0% of on/off-heap limit)
>> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Failed creating a merkle
>> tree for [repair #9587a200-b95a-11e9-8920-9f72868b8375 on
>> KeyspaceMetadata/x, (-1476350953672479093,-1474461
>> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Exception in thread
>> Thread[ValidationExecutor:825,1,main]
>> Aug 07 22:30:33 cassandra[34611]: org.apache.cassandra.io.FSReadError:
>> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted:
>> /x/ssd2/data/KeyspaceMetadata/x-1e453cb0
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:365)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:361)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:340)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:382)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:366)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:81)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> ~[guava-16.0.jar:na]
>> Aug 07 22:30:33 cassandra[34611]: at
>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> ~[guava-16.0.jar:na]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:169)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> ~[guava-16.0.jar:na]
>> Aug 07 22:30:33 cassandra[34611]: at
>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> ~[guava-16.0.jar:na]
>> Aug 07 22:30:33 cassandra[34611]: at
>> com.google.common.collect.Iterators$7.computeNext(Iterators.java:645)
>> ~[guava-16.0.jar:na]
>> Aug 07 22:30:33 cassandra[34611]: at
>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> ~[guava-16.0.jar:na]
>> Aug 07 22:30:33 cassandra[34611]: at
>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> ~[guava-16.0.jar:na]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(ColumnIndex.java:174)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.compaction.LazilyCompactedRow.update(LazilyCompactedRow.java:187)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.repair.Validator.rowHash(Validator.java:201)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.repair.Validator.add(Validator.java:150)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1166)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:76)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.db.compaction.CompactionManager$10.call(CompactionManager.java:736)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_172]
>> Aug 07 22:30:33 cassandra[34611]: at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> ~[na:1.8.0_172]
>> Aug 07 22:30:33 cassandra[34611]: at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> [na:1.8.0_172]
>> Aug 07 22:30:33 cassandra[34611]: at
>> java.lang.Thread.run(Thread.java:748) [na:1.8.0_172]
>> Aug 07 22:30:33 cassandra[34611]: Caused by:
>> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted:
>> /data/ssd2/data/KeyspaceMetadata/x-x/l
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBufferMmap(CompressedRandomAccessReader.java:216)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:226)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.io.compress.CompressedThrottledReader.reBuffer(CompressedThrottledReader.java:42)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:352)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: ... 27 common frames omitted
>> Aug 07 22:30:33 cassandra[34611]: Caused by:
>> org.apache.cassandra.io.compress.CorruptBlockException:
>> (/data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big
>> Aug 07 22:30:33 cassandra[34611]: at
>> org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBufferMmap(CompressedRandomAccessReader.java:185)
>> ~[apache-cassandra-2.2.13.jar:2.2.13]
>> Aug 07 22:30:33 cassandra[34611]: ... 30 common frames omitted
>> Aug 07 22:30:33 cassandra[34611]: INFO  21:30:33 Not a global repair,
>> will not do anticompaction
>> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Stopping gossiper
>> Aug 07 22:30:33 cassandra[34611]: WARN  21:30:33 Stopping gossip by
>> operator request
>> Aug 07 22:30:33 cassandra[34611]: INFO  21:30:33 Announcing shutdown
>> Aug 07 22:30:33 cassandra[34611]: INFO  21:30:33 Node /10.2.57.37
>> state jump to shutdown
>>
>>
>> So I went to the file system to see when this corrupt file was created,
>> and it was created on July 30th at 15:55.
>>
>> root@x 0 01:14:03 ~ # ls -l
>> /data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big-Data.db
>> -rw-r--r-- 1 cass_b cass_b 3182243670 Jul 30 15:55
>> /data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big-Data.db
>>
>>
>>
>> So I checked /var/log/messages for errors around that time.
>> The only thing that stands out to me is the message "Cannot perform a
>> full major compaction as repaired and unrepaired sstables cannot be
>> compacted together"; I'm not sure if that would be an issue, though, or
>> whether it could cause corruption.
>>
>> Jul 30 15:55:06 x systemd: Created slice User Slice of root.
>> Jul 30 15:55:06 x systemd: Started Session c165280 of user root.
>> Jul 30 15:55:06 x audispd: node=x. type=USER_START
>> msg=audit(1564498506.021:457933): pid=17533 uid=0 auid=4294967295
>> ses=4294967295 msg='op=PAM:session_open
>> grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_tty_audit,pam_systemd,pam_unix
>> acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
>> Jul 30 15:55:06 x systemd: Removed slice User Slice of root.
>> Jul 30 15:55:14 x tag_audit_log: type=USER_CMD
>> msg=audit(1564498506.013:457932): pid=17533 uid=509 auid=4294967295
>> ses=4294967295 msg='cwd="/"
>> cmd=2F7573722F7362696E2F69706D692D73656E736F7273202D2D71756965742D6361636865202D2D7364722D63616368652D7265637265617465202D2D696E746572707265742D6F656D2D64617461202D2D6F75747075742D73656E736F722D7374617465202D2D69676E6F72652D6E6F742D617661696C61626C652D73656E736F7273202D2D6F75747075742D73656E736F722D7468726573686F6C6473
>> terminal=? res=success'
>> Jul 30 15:55:14 x tag_audit_log: type=USER_START
>> msg=audit(1564498506.021:457933): pid=17533 uid=0 auid=4294967295
>> ses=4294967295 msg='op=PAM:session_open
>> grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_tty_audit,pam_systemd,pam_unix
>> acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
>> Jul 30 15:55:19 x cassandra: INFO  14:55:19 Writing
>> Memtable-compactions_in_progress@1462227999(0.008KiB serialized bytes, 1
>> ops, 0%/0% of on/off-heap limit)
>> Jul 30 15:55:19 x cassandra: INFO  14:55:19 Cannot perform a full major
>> compaction as repaired and unrepaired sstables cannot be compacted
>> together. These two set of sstables will be compacted separately.
>> Jul 30 15:55:19 x cassandra: INFO  14:55:19 Writing
>> Memtable-compactions_in_progress@1198535528(1.002KiB serialized bytes,
>> 57 ops, 0%/0% of on/off-heap limit)
>> Jul 30 15:55:20 x cassandra: INFO  14:55:20 Writing
>> Memtable-compactions_in_progress@2039409834(0.008KiB serialized bytes, 1
>> ops, 0%/0% of on/off-heap limit)
>> Jul 30 15:55:24 x audispd: node=x. type=USER_LOGOUT
>> msg=audit(1564498524.409:457934): pid=46620 uid=0 auid=464400029 ses=2747
>> msg='op=login id=464400029 exe="/usr/sbin/sshd" hostname=? addr=?
>> terminal=/dev/pts/0 res=success'
>> Jul 30 15:55:24 x audispd: node=x. type=USER_LOGOUT
>> msg=audit(1564498524.409:457935): pid=4878 uid=0 auid=464400029 ses=2749
>> msg='op=login id=464400029 exe="/usr/sbin/sshd" hostname=? addr=?
>> terminal=/dev/pts/1 res=success'
>>
>> Jul 30 15:55:57 x systemd: Created slice User Slice of root.
>> Jul 30 15:55:57 x systemd: Started Session c165288 of user root.
>> Jul 30 15:55:57 x audispd: node=x. type=USER_START
>> msg=audit(1564498557.294:457958): pid=19687 uid=0 auid=4294967295
>> ses=4294967295 msg='op=PAM:session_open
>> grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_tty_audit,pam_systemd,pam_unix
>> acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
>> Jul 30 15:55:57 x audispd: node=x. type=USER_START
>> msg=audit(1564498557.298:457959): pid=19690 uid=0 auid=4294967295
>> ses=4294967295 msg='op=PAM:session_open
>> grantors=pam_keyinit,pam_systemd,pam_keyinit,pam_limits,pam_unix
>> acct="cass_b" exe="/usr/sbin/runuser" hostname=? addr=? terminal=?
>> res=success'
>> Jul 30 15:55:58 x systemd: Removed slice User Slice of root.
>> Jul 30 15:56:02 x cassandra: INFO  14:56:02 Writing
>> Memtable-compactions_in_progress@1532791194(0.008KiB serialized bytes, 1
>> ops, 0%/0% of on/off-heap limit)
>> Jul 30 15:56:02 x cassandra: INFO  14:56:02 Cannot perform a full major
>> compaction as repaired and unrepaired sstables cannot be compacted
>> together. These two set of sstables will be compacted separately.
>> Jul 30 15:56:02 x cassandra: INFO  14:56:02 Writing
>> Memtable-compactions_in_progress@1455399453(0.281KiB serialized bytes,
>> 16 ops, 0%/0% of on/off-heap limit)
>> Jul 30 15:56:04 x tag_audit_log: type=USER_CMD
>> msg=audit(1564498555.190:457951): pid=19294 uid=509 auid=4294967295
>> ses=4294967295 msg='cwd="/"
>> cmd=72756E75736572202D73202F62696E2F62617368202D6C20636173735F62202D632063617373616E6472612D6D6574612F63617373616E6472612F62696E2F6E6F6465746F6F6C2074707374617473
>> terminal=? res=success'
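>>
>> (Side note: the cmd= fields in those audit lines are hex-encoded command
>> lines; they can be decoded as below, a minimal sketch (xxd ships with
>> vim-common on RHEL).  The two above appear to decode to an ipmi-sensors
>> invocation and to "nodetool tpstats" run as cass_b, so nothing unusual at
>> that time.)
>>
>> # decode an audit cmd= value (paste the hex string in place of <hex>)
>> echo '<hex>' | xxd -r -p; echo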
>>
>>
>>
>> We have checked a number of other things, like NTP settings, but nothing
>> is telling us what could cause so many corruptions across the entire
>> cluster.
>> Things were healthy with this cluster for months; the only thing I can
>> think of is that we started loading data, going from a load of 20GB per
>> instance up to the 200GB where it sits now, and maybe this just highlighted
>> the issue.
>>
>>
>>
>> Compaction and compression on the keyspace's CFs [a mixture]
>> All CFs are using compression.
>>
>> AND compaction = {'min_threshold': '4', 'class':
>> 'org.apache.cassandra.db.compaction.*SizeTieredCompactionStrategy*',
>> 'max_threshold': '32'}
>> AND compression = {'sstable_compression':
>> 'org.apache.cassandra.io.compress.*SnappyCompressor*'}
>>
>> AND compaction = {'min_threshold': '4', 'class':
>> 'org.apache.cassandra.db.compaction.*SizeTieredCompactionStrategy*',
>> 'max_threshold': '32'}
>> AND compression = {'sstable_compression':
>> 'org.apache.cassandra.io.compress.*LZ4Compressor*'}
>>
>> AND compaction = {'class': 'org.apache.cassandra.db.compaction.
>> *LeveledCompactionStrategy*'}
>> AND compression = {'sstable_compression':
>> 'org.apache.cassandra.io.compress.*SnappyCompressor*'}
>>
>> --We are also using internode network compression:
>> internode_compression: all
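>>
>> (A quick way to confirm which compressor each table actually uses; a
>> minimal sketch, with the keyspace name taken from the logs above and the
>> host left as a placeholder:)
>>
>> # dump the schema and list the compressor configured per table
>> cqlsh <host> -e "DESCRIBE KEYSPACE KeyspaceMetadata" | grep -E 'CREATE TABLE|sstable_compression'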
>>
>>
>>
>> Does anyone have any idea what I should check next?
>> Our next theory is that there may be an issue with checksums, but I'm not
>> sure where to go with this.
>>
>>
>>
>> Any help would be very much appreciated before I lose the last bit of
>> hair I have on my head.
>>
>>
>>
>> Kind Regards,
>>
>> Phil
>>
>>
>>
>> On Wed, 7 Aug 2019 at 20:51, Nitan Kainth <nitankai...@gmail.com> wrote:
>>
>> Repair during upgrade has caused corruption too.
>>
>>
>>
>> Also, dropping and adding columns with the same name but a different type.
>>
>>
>>
>> Regards,
>>
>> Nitan
>>
>> Cell: 510 449 9629
>>
>>
>> On Aug 7, 2019, at 2:42 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>> Is compression enabled?
>>
>>
>>
>> If not, bit flips on disk can corrupt data files and reads + repair may
>> send that corruption to other hosts in the cluster
>>
>>
>> On Aug 7, 2019, at 3:46 AM, Philip Ó Condúin <philipocond...@gmail.com>
>> wrote:
>>
>> Hi All,
>>
>>
>>
>> I am currently experiencing multiple datafile corruptions across most
>> nodes in my cluster; there seems to be no pattern to the corruption.  I'm
>> starting to think it might be a bug; we're using Cassandra 2.2.13.
>>
>>
>>
>> Without going into detail about the issue, I just want to confirm
>> something.
>>
>>
>>
>> Can someone share with me a list of scenarios that would cause corruption?
>>
>>
>>
>> 1. OS failure
>>
>> 2. Cassandra interrupted during a write
>>
>>
>>
>> etc etc.
>>
>>
>>
>> I need to investigate each scenario and don't want to leave any out.
>>
>>
>>
>> --
>>
>> Regards,
>>
>> Phil
>>
>>
>>
>>
>> --
>>
>> Regards,
>>
>> Phil
>>
>
>
> --
> Regards,
> Phil
>
