[jira] [Comment Edited] (CASSANDRA-12844) nodetool drain causing mutiple nodes crashing with hint file corruption in Cassandra 3.9
    [ https://issues.apache.org/jira/browse/CASSANDRA-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819815#comment-15819815 ]

Harikrishnan edited comment on CASSANDRA-12844 at 1/12/17 1:33 AM:
-------------------------------------------------------------------

Hi,
We reproduced this twice while trying to bring down a node by issuing nodetool drain. One interesting aspect is that there were a lot of mutation drops, and hint replay was happening to most of the nodes while the drain was being issued. Will try to reproduce it again.

was (Author: hari708):
Hi,
We reproduced this twice while trying to bring down a node by issuing nodetool drain. One interesting aspect is that there were a lot of mutation drops, and hint replay was happening to most of the nodes while the drain was being issued.

> nodetool drain causing mutiple nodes crashing with hint file corruption in Cassandra 3.9
>
> Key: CASSANDRA-12844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12844
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Harikrishnan
> Priority: Critical
> Labels: hints
>
> The steps are as follows.
> We have a 4/4 node Cassandra cluster running version 3.9.
> On one node, we made some changes to cassandra.yaml, issued a nodetool drain, killed the Cassandra process, and restarted the node. After some time, nodetool status reported that multiple nodes were down in that DC.
> We checked the system.log files and found the hint corruption occurring (CASSANDRA-12728). nodetool drain causing this corruption and bringing multiple nodes down is a big concern.
> ERROR [HintsDispatcher:2] 2016-10-26 12:17:59,361 HintsDispatchExecutor.java:225 - Failed to dispatch hints file 4d1362f0-053c-4042-80a7-bfc85a26c90f-1477509190999-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:284) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:254) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.9.jar:3.9]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
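
The report above says the corruption was found by checking system.log. A minimal sketch of that check follows, assuming the error text matches the 3.9 excerpt quoted in this message. The class and method names are hypothetical; this is not part of Cassandra or nodetool.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Cassandra class): lists the hints files that a
// log reports as corrupted, based on the message text quoted in this report.
class CorruptHintsScanner {
    // Assumption: the message wording is the one shown in the 3.9 log excerpt.
    private static final Pattern CORRUPT_HINTS = Pattern.compile(
        "Failed to dispatch hints file (\\S+\\.hints): file is corrupted");

    // Returns the hints file name from every line that carries the error.
    static List<String> corruptHintFiles(List<String> logLines) {
        List<String> files = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = CORRUPT_HINTS.matcher(line);
            if (m.find()) {
                files.add(m.group(1));
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CorruptHintsScanner <path-to-system.log>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String file : corruptHintFiles(lines)) {
            System.out.println(file);
        }
    }
}
```

Running it against the system.log of each node gives a quick view of which nodes hold affected hints files.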
[jira] [Commented] (CASSANDRA-12844) nodetool drain causing mutiple nodes crashing with hint file corruption in Cassandra 3.9
    [ https://issues.apache.org/jira/browse/CASSANDRA-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819815#comment-15819815 ]

Harikrishnan commented on CASSANDRA-12844:
------------------------------------------

Hi,
We reproduced this twice while trying to bring down a node by issuing nodetool drain. One interesting aspect is that there were a lot of mutation drops, and hint replay was happening to most of the nodes while the drain was being issued.

> nodetool drain causing mutiple nodes crashing with hint file corruption in Cassandra 3.9
>
> Key: CASSANDRA-12844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12844
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Harikrishnan
> Priority: Critical
> Labels: hints
>
> The steps are as follows.
> We have a 4/4 node Cassandra cluster running version 3.9.
> On one node, we made some changes to cassandra.yaml, issued a nodetool drain, killed the Cassandra process, and restarted the node. After some time, nodetool status reported that multiple nodes were down in that DC.
> We checked the system.log files and found the hint corruption occurring (CASSANDRA-12728). nodetool drain causing this corruption and bringing multiple nodes down is a big concern.
[jira] [Created] (CASSANDRA-12844) nodetool drain causing mutiple nodes crashing with hint file corruption in Cassandra 3.9
Harikrishnan created CASSANDRA-12844:
-------------------------------------

Summary: nodetool drain causing mutiple nodes crashing with hint file corruption in Cassandra 3.9
Key: CASSANDRA-12844
URL: https://issues.apache.org/jira/browse/CASSANDRA-12844
Project: Cassandra
Issue Type: Bug
Components: Tools
Reporter: Harikrishnan

The steps are as follows.
We have a 4/4 node Cassandra cluster running version 3.9.
On one node, we made some changes to cassandra.yaml, issued a nodetool drain, killed the Cassandra process, and restarted the node. After some time, nodetool status reported that multiple nodes were down in that DC.
We checked the system.log files and found the hint corruption occurring (CASSANDRA-12728). nodetool drain causing this corruption and bringing multiple nodes down is a big concern.

ERROR [HintsDispatcher:2] 2016-10-26 12:17:59,361 HintsDispatchExecutor.java:225 - Failed to dispatch hints file 4d1362f0-053c-4042-80a7-bfc85a26c90f-1477509190999-1.hints: file is corrupted ({})
org.apache.cassandra.io.FSReadError: java.io.EOFException
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:284) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:254) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.9.jar:3.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
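
The trace above shows a java.io.EOFException surfacing while the hints reader iterates over the file, which is the kind of failure one would expect when a file ends partway through a record. The sketch below illustrates that general failure mode with a simple length-prefixed record format. It is an illustration only: the record layout, class name, and method names are assumptions and are not Cassandra's hints format or its HintsReader code. Whether a truncated tail should be skipped or treated as fatal is a policy choice, shown here as a flag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a made-up length-prefixed record format, not the
// on-disk format of Cassandra hints files.
class TruncatedRecordDemo {
    // Writes each payload as a 4-byte length followed by the payload bytes.
    static byte[] writeRecords(List<byte[]> payloads) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (byte[] payload : payloads) {
                out.writeInt(payload.length);
                out.write(payload);
            }
        }
        return bytes.toByteArray();
    }

    // Reads records until the input is exhausted. If the input ends inside a
    // record, either rethrows the EOFException or returns the complete
    // records read so far, depending on tolerateTruncatedTail.
    static List<byte[]> readRecords(byte[] data, boolean tolerateTruncatedTail) throws IOException {
        List<byte[]> records = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            while (in.available() > 0) {
                int length = in.readInt();      // EOFException if fewer than 4 bytes remain
                byte[] payload = new byte[length];
                in.readFully(payload);          // EOFException if the payload is cut short
                records.add(payload);
            }
        } catch (EOFException e) {
            if (!tolerateTruncatedTail) {
                throw e;
            }
            // Partially written trailing record: keep what was read completely.
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        byte[] full = writeRecords(Arrays.asList(
            "abc".getBytes(StandardCharsets.UTF_8),
            "defgh".getBytes(StandardCharsets.UTF_8)));
        // Simulate a writer that stopped partway through the second length field.
        byte[] truncated = Arrays.copyOf(full, 9);
        System.out.println("complete records recovered: " + readRecords(truncated, true).size());
        try {
            readRecords(truncated, false);
        } catch (EOFException e) {
            System.out.println("strict read failed with EOFException");
        }
    }
}
```

In the strict mode the reader fails the whole file on a truncated tail, which resembles the behaviour reported here; the tolerant mode is one possible alternative and is not a description of any actual fix.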
[jira] [Commented] (CASSANDRA-12728) Handling partially written hint files
    [ https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604272#comment-15604272 ]

Harikrishnan commented on CASSANDRA-12728:
------------------------------------------

Reproduced the same error in 3.9 as well. Ran nodetool drain on one node, and 3 nodes went down with this error.

ERROR [HintsDispatcher:2] 2016-10-25 05:08:00,157 HintsDispatchExecutor.java:225 - Failed to dispatch hints file 49c3290a-fafd-456c-966e-8bcd1eab9af8-1477371781565-1.hints: file is corrupted ({})
org.apache.cassandra.io.FSReadError: java.io.EOFException
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:284) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:254) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.9.jar:3.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
Caused by: java.io.EOFException: null
    at org.apache.cassandra.io.util.RebufferingInputStream.readByte(RebufferingInputStream.java:146) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.io.util.RebufferingInputStream.readPrimitiveSlowly(RebufferingInputStream.java:108) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.io.util.RebufferingInputStream.readInt(RebufferingInputStream.java:188) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:297) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:280) ~[apache-cassandra-3.9.jar:3.9]
    ... 15 common frames omitted

> Handling partially written hint files
> -------------------------------------
>
> Key: CASSANDRA-12728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sharvanath Pathak
> Assignee: Aleksey Yeschenko
> Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 HintsDispatchExecutor.java:225 - Failed to dispatch hints file d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at