[ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430989#comment-16430989
 ] 

Vineet Ghatge commented on CASSANDRA-13696:
-------------------------------------------

[~jay.zhuang]: I am not sure if this the same issue. I appiled commit you have 
here on 3.11.1 and we still see the following issue on a 14 node cluster

Any ideas on what might be wrong?

ERROR [HintsDispatcher:1] 2018-04-06 16:26:44,423 CassandraDaemon.java:228 - 
Exception in thread Thread[HintsDispatcher:1,1,main]
org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
exception
at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:298)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:263)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:169) 
~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:128)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:113) 
~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:94) 
~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:278)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:260)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:238)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:217)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_141]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_141]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[na:1.8.0_141]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[na:1.8.0_141]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
 [apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_141]
Caused by: java.io.IOException: Digest mismatch exception

To overcome this we do a truncatehints and everything works after that. Issue 
happens once a while we have seen it twice so far on 14 node cluster and 7 node 
cluster

 

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-13696
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jay Zhuang
>            Assignee: Jay Zhuang
>            Priority: Blocker
>             Fix For: 3.0.15, 3.11.1, 4.0
>
>
> {noformat}
> WARN  [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - 
> Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - 
> table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints
> ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 
> HintsDispatchExecutor.java:234 - Failed to dispatch hints file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
> exception
>     at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199)
>  ~[main/:na]
>     at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164)
>  ~[main/:na]
>     at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
>     at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157)
>  ~[main/:na]
>     at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139)
>  ~[main/:na]
>     at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
> ~[main/:na]
>     at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
> ~[main/:na]
>     at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
>  [main/:na]
>     at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
>  [main/:na]
>     at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
>  [main/:na]
>     at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
>  [main/:na]
>     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
>     at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [main/:na]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> Caused by: java.io.IOException: Digest mismatch exception
>     at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216)
>  ~[main/:na]
>     at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190)
>  ~[main/:na]
>     ... 16 common frames omitted
> {noformat}
> It causes multiple cassandra nodes stop [by 
> default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188].
> Here is the reproduce steps on a 3 nodes cluster, RF=3:
> 1. stop node1
> 2. send some data with quorum (or one), it will generate hints file on 
> node2/node3
> 3. drop the table
> 4. start node1
> node2/node3 will report "corrupted hints file" and stop. The impact is very 
> bad for a large cluster, when it happens, almost all the nodes are down at 
> the same time and we have to remove all the hints files (which contain the 
> dropped table) to bring the node back.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to