[jira] [Comment Edited] (CASSANDRA-12844) nodetool drain causing multiple nodes crashing with hint file corruption in Cassandra 3.9

2017-01-11 Thread Harikrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819815#comment-15819815
 ] 

Harikrishnan edited comment on CASSANDRA-12844 at 1/12/17 1:33 AM:
---

Hi,
We reproduced this twice while trying to bring down a node by issuing 
nodetool drain. One interesting aspect is that there were a lot of mutation 
drops, and hint replay was in progress to most of the nodes while the drain 
was being issued. We will try to reproduce it again.


was (Author: hari708):
Hi,
We reproduced this twice while trying to bring down a node by issuing 
nodetool drain. One interesting aspect is that there were a lot of mutation 
drops, and hint replay was in progress to most of the nodes while the drain 
was being issued.

> nodetool drain causing multiple nodes crashing with hint file corruption in 
> Cassandra 3.9
> 
>
> Key: CASSANDRA-12844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12844
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Harikrishnan
>Priority: Critical
>  Labels: hints
>
> The steps are as follows.
> We have a 4/4 node Cassandra cluster running version 3.9.
> On one node we made some changes to cassandra.yaml, issued a nodetool drain, 
> killed the Cassandra process, and restarted the node. After some time, nodetool 
> status reported that multiple nodes were down in that DC.
> We checked the system.log on all the nodes and found the hint file corruption 
> occurring (CASSANDRA-12728). nodetool drain causing this corruption and 
> bringing multiple nodes down is a big concern.
> ERROR [HintsDispatcher:2] 2016-10-26 12:17:59,361 HintsDispatchExecutor.java:225 - Failed to dispatch hints file 4d1362f0-053c-4042-80a7-bfc85a26c90f-1477509190999-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:284) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:254) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.9.jar:3.9]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12844) nodetool drain causing multiple nodes crashing with hint file corruption in Cassandra 3.9

2017-01-11 Thread Harikrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819815#comment-15819815
 ] 

Harikrishnan commented on CASSANDRA-12844:
--

Hi,
We reproduced this twice while trying to bring down a node by issuing 
nodetool drain. One interesting aspect is that there were a lot of mutation 
drops, and hint replay was in progress to most of the nodes while the drain 
was being issued.






[jira] [Created] (CASSANDRA-12844) nodetool drain causing multiple nodes crashing with hint file corruption in Cassandra 3.9

2016-10-26 Thread Harikrishnan (JIRA)
Harikrishnan created CASSANDRA-12844:


 Summary: nodetool drain causing multiple nodes crashing with hint 
file corruption in Cassandra 3.9
 Key: CASSANDRA-12844
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12844
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Harikrishnan


The steps are as follows.
We have a 4/4 node Cassandra cluster running version 3.9.
On one node we made some changes to cassandra.yaml, issued a nodetool drain, 
killed the Cassandra process, and restarted the node. After some time, nodetool 
status reported that multiple nodes were down in that DC.

We checked the system.log on all the nodes and found the hint file corruption 
occurring (CASSANDRA-12728). nodetool drain causing this corruption and bringing 
multiple nodes down is a big concern.




ERROR [HintsDispatcher:2] 2016-10-26 12:17:59,361 HintsDispatchExecutor.java:225 - Failed to dispatch hints file 4d1362f0-053c-4042-80a7-bfc85a26c90f-1477509190999-1.hints: file is corrupted ({})
org.apache.cassandra.io.FSReadError: java.io.EOFException
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:284) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:254) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.9.jar:3.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
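
For anyone checking several nodes for the same failure, the lookup in system.log can be scripted. The following is a minimal Python sketch, not part of Cassandra: it scans log text for the dispatch error quoted above and returns the hints file names it mentions. The function name corrupted_hint_files and the regular expression are assumptions based only on the message format shown in this report.

```python
import re

# Pattern for the dispatch failure message quoted in this report.
# Whitespace is matched loosely because the log line may be wrapped
# when it is pasted into an email or an issue description.
HINT_FAILURE = re.compile(
    r"Failed\s+to\s+dispatch\s+hints\s+file\s+"
    r"(?P<file>\S+\.hints):\s+file\s+is\s+corrupted"
)


def corrupted_hint_files(log_text):
    """Return the distinct hints file names reported as corrupted, in order."""
    seen = []
    for match in HINT_FAILURE.finditer(log_text):
        name = match.group("file")
        if name not in seen:
            seen.append(name)
    return seen
```

Reading a node's system.log into a string and passing it to corrupted_hint_files would list the affected files on that node; the location of the log depends on the installation.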





[jira] [Commented] (CASSANDRA-12728) Handling partially written hint files

2016-10-24 Thread Harikrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604272#comment-15604272
 ] 

Harikrishnan commented on CASSANDRA-12728:
--

Reproduced the same error in 3.9 as well. After running nodetool drain on one 
node, 3 nodes went down with this error.

ERROR [HintsDispatcher:2] 2016-10-25 05:08:00,157 HintsDispatchExecutor.java:225 - Failed to dispatch hints file 49c3290a-fafd-456c-966e-8bcd1eab9af8-1477371781565-1.hints: file is corrupted ({})
org.apache.cassandra.io.FSReadError: java.io.EOFException
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:284) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:254) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.9.jar:3.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
Caused by: java.io.EOFException: null
    at org.apache.cassandra.io.util.RebufferingInputStream.readByte(RebufferingInputStream.java:146) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.io.util.RebufferingInputStream.readPrimitiveSlowly(RebufferingInputStream.java:108) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.io.util.RebufferingInputStream.readInt(RebufferingInputStream.java:188) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:297) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:280) ~[apache-cassandra-3.9.jar:3.9]
    ... 15 common frames omitted
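
The Caused by section shows the reader hitting end-of-file while reading an integer in computeNextInternal, which would be consistent with a hints file whose last record was only partially written. As a generic illustration of tolerating such a truncated tail, here is a minimal Python sketch. The record layout (a 4-byte big-endian length followed by the payload) and the name read_records are hypothetical; this is neither Cassandra's hints file format nor the patch attached to this issue.

```python
import io
import struct


def read_records(stream):
    """Read length-prefixed records, tolerating a truncated final record.

    Hypothetical layout, for illustration only: each record is a 4-byte
    big-endian length followed by that many payload bytes.
    Returns (records, truncated), where truncated is True if the stream
    ended in the middle of a record.
    """
    records = []
    while True:
        header = stream.read(4)
        if len(header) == 0:
            return records, False  # clean end of file
        if len(header) < 4:
            return records, True   # length prefix was cut short
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            return records, True   # payload was cut short
        records.append(payload)


# A stream with one complete record and one record cut off mid-payload.
data = struct.pack(">I", 3) + b"abc" + struct.pack(">I", 5) + b"de"
complete, truncated = read_records(io.BytesIO(data))
```

In this sketch the complete records are still returned and the partial tail is reported through the flag rather than raised as an error, so the caller can decide whether to skip or retry the remainder.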

> Handling partially written hint files
> -
>
> Key: CASSANDRA-12728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sharvanath Pathak
>Assignee: Aleksey Yeschenko
> Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 HintsDispatchExecutor.java:225 - Failed to dispatch hints file d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
>