Create a new ticket for the backport and reference the existing one (11540).

On Mon., 13 Aug. 2018, 22:50 Steinmaurer, Thomas, <
thomas.steinmau...@dynatrace.com> wrote:

> Thanks Kurt.
>
>
>
> What is the proper workflow here to get this accepted? Create a new ticket
> dedicated to the backport, referencing 11540, or re-open 11540?
>
>
>
> Thanks for your help.
>
>
>
> Thomas
>
>
>
> *From:* kurt greaves <k...@instaclustr.com>
> *Sent:* Monday, 13 August 2018 13:24
> *To:* User <user@cassandra.apache.org>
> *Subject:* Re: Data Corruption due to multiple Cassandra 2.1 processes?
>
>
>
> Yeah, that's not ideal and could lead to problems. I think corruption is
> only likely if compactions occur, but data loss seems possible, not to
> mention all sorts of other nasties that could occur when running two C*'s
> at once. It seems to me that 11540 should have gone into 2.1 in the first
> place but just got missed. It's a very simple patch, so I think a backport
> should be accepted.
>
>
>
> On 7 August 2018 at 15:57, Steinmaurer, Thomas <
> thomas.steinmau...@dynatrace.com> wrote:
>
> Hello,
>
>
>
> With 2.1, if a second Cassandra process/instance is started on a host (by
> accident), can this result in some sort of corruption, even though
> Cassandra will eventually exit because it cannot bind TCP ports that are
> already in use?
>
>
>
> What we have seen in this scenario is something like this:
>
>
>
> ERROR [main] 2018-08-05 21:10:24,046 CassandraDaemon.java:120 - Error
> starting local jmx server:
>
> java.rmi.server.ExportException: Port already in use: 7199; nested
> exception is:
>
>                 java.net.BindException: Address already in use (Bind
> failed)
>
> …
>
>
>
> But it then continues with things like opening system and even user tables:
>
>
>
> INFO  [main] 2018-08-05 21:10:24,060 CacheService.java:110 - Initializing
> key cache with capacity of 100 MBs.
>
> INFO  [main] 2018-08-05 21:10:24,067 CacheService.java:132 - Initializing
> row cache with capacity of 0 MBs
>
> INFO  [main] 2018-08-05 21:10:24,073 CacheService.java:149 - Initializing
> counter cache with capacity of 50 MBs
>
> INFO  [main] 2018-08-05 21:10:24,074 CacheService.java:160 - Scheduling
> counter cache save to every 7200 seconds (going to save all keys).
>
> INFO  [main] 2018-08-05 21:10:24,161 ColumnFamilyStore.java:365 -
> Initializing system.sstable_activity
>
> INFO  [SSTableBatchOpen:2] 2018-08-05 21:10:24,692 SSTableReader.java:475
> - Opening
> /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-165
> (2023 bytes)
>
> INFO  [SSTableBatchOpen:3] 2018-08-05 21:10:24,692 SSTableReader.java:475
> - Opening
> /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-167
> (2336 bytes)
>
> INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,692 SSTableReader.java:475
> - Opening
> /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-166
> (2686 bytes)
>
> INFO  [main] 2018-08-05 21:10:24,755 ColumnFamilyStore.java:365 -
> Initializing system.hints
>
> INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,758 SSTableReader.java:475
> - Opening
> /var/opt/xxx-managed/cassandra/system/hints-2666e20573ef38b390fefecf96e8f0c7/system-hints-ka-377
> (46210621 bytes)
>
> INFO  [main] 2018-08-05 21:10:24,766 ColumnFamilyStore.java:365 -
> Initializing system.compaction_history
>
> INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,768 SSTableReader.java:475
> - Opening
> /var/opt/xxx-managed/cassandra/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-129
> (91269 bytes)
>
> …
>
>
>
> Replaying commit logs:
>
>
>
> …
>
> INFO  [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:267 -
> Replaying
> /var/opt/dynatrace-managed/cassandra/commitlog/CommitLog-4-1533133668366.log
>
> INFO  [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:270 -
> Replaying
> /var/opt/dynatrace-managed/cassandra/commitlog/CommitLog-4-1533133668366.log
> (CL version 4, messaging version 8)
>
> …
>
>
>
> It even starts writing memtables (only system tables are pasted below, but
> user tables were affected as well):
>
>
>
> …
>
> INFO  [MemtableFlushWriter:4] 2018-08-05 21:11:52,524 Memtable.java:347 -
> Writing Memtable-size_estimates@1941663179(2.655MiB serialized bytes,
> 325710 ops, 2%/0% of on/off-heap limit)
>
> INFO  [MemtableFlushWriter:3] 2018-08-05 21:11:52,552 Memtable.java:347 -
> Writing Memtable-peer_events@1474667699(0.199KiB serialized bytes, 4 ops,
> 0%/0% of on/off-heap limit)
>
> …
>
>
>
> Until it comes to a point where it can’t bind ports like the storage port
> 7000:
>
>
>
> ERROR [main] 2018-08-05 21:11:54,350 CassandraDaemon.java:395 - Fatal
> configuration error
>
> org.apache.cassandra.exceptions.ConfigurationException: /XXX:7000 is in
> use by another process.  Change listen_address:storage_port in
> cassandra.yaml to values that do not conflict with other services
>
>                 at
> org.apache.cassandra.net.MessagingService.getServerSockets(MessagingService.java:495)
> ~[apache-cassandra-2.1.18.jar:2.1.18]
>
> …
>
>
>
> Until Cassandra stops:
>
>
>
> …
>
> INFO  [StorageServiceShutdownHook] 2018-08-05 21:11:54,361
> Gossiper.java:1454 - Announcing shutdown
>
> …
>
>
>
>
>
> So, for around 2 minutes Cassandra is tampering with existing data,
> although it shouldn’t.
>
>
>
> Sounds like a potential candidate for data corruption, right? E.g. later
> on we see things like this (while the shutdown is still in progress?):
>
>
>
> WARN  [SharedPool-Worker-1] 2018-08-05 21:11:58,181
> AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread
> Thread[SharedPool-Worker-1,5,main]: {}
>
> java.lang.RuntimeException: java.io.FileNotFoundException:
> /var/opt/xxx-managed/cassandra/xxx/xxx-fdc68b70950611e8ad7179f2d5bfa3cf/xxx-xxx-ka-15-Data.db
> (No such file or directory)
>
>                 at
> org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:52)
> ~[apache-cassandra-2.1.18.jar:2.1.18]
>
>                 at
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createPooledReader(CompressedPoolingSegmentedFile.java:95)
> ~[apache-cassandra-2.1.18.jar:2.1.18]
>
>                 at
> org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:62)
> ~[apache-cassandra-2.1.18.jar:2.1.18]
>
> …
>
>
>
>
>
> I found this one here:
> https://issues.apache.org/jira/browse/CASSANDRA-11540
>
>
>
> So, if this all leads to corruption, might this be a candidate for a
> backport for a 2.1 bugfix release?
>
>
>
> Thanks a lot!
>
>
>
> Thomas
>
>
>
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
>
>
>
