[ 
https://issues.apache.org/jira/browse/CASSANDRA-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436714#comment-16436714
 ] 

Jay Zhuang commented on CASSANDRA-14375:
----------------------------------------

We saw the same issue in {{3.0.14}} twice in the past week:
{noformat}
ERROR [HintsDispatcher:1] 2018-04-10 23:43:47,930 
HintsDispatchExecutor.java:234 - Failed to dispatch hints file 
d921cf74-c064-465d-82b4-aa964cb3b8f6-1523401451406-1.hints: file is corrupted 
({})
org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
exception
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:296)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:261)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:138)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[na:1.8.0_121]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_121]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_121]
        at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
Caused by: java.io.IOException: Digest mismatch exception
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:313)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:287)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        ... 16 common frames omitted
ERROR [HintsDispatcher:1] 2018-04-10 23:43:47,931 CassandraDaemon.java:207 - 
Exception in thread Thread[HintsDispatcher:1,1,main]
org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
exception
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:296)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:261)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:138)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_121]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_121]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_121]
        at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
Caused by: java.io.IOException: Digest mismatch exception
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:313)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:287)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        ... 16 common frames omitted
ERROR [HintsDispatcher:1] 2018-04-10 23:43:47,932 
JVMStabilityInspector.java:144 - JVM state determined to be unstable.  Exiting 
forcefully due to:
org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
exception
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:296)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:261)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:138)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_121]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_121]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_121]
        at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
Caused by: java.io.IOException: Digest mismatch exception
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:313)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:287)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
        ... 16 common frames omitted
{noformat}

We are still trying to find the root cause, but a corrupted hints file should not 
crash the Cassandra process. It should be handled the same way as corrupted 
sstables, which are logged and ignored by default: 
[DefaultFSErrorHandler.java:40|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/DefaultFSErrorHandler.java#L40].
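
To illustrate the behaviour suggested above, here is a minimal, self-contained sketch of "log and skip" handling for a record whose checksum does not match. It is not Cassandra code: the class {{CorruptHintsSketch}}, the {{StoredHint}} type, the {{CorruptionPolicy}} enum and the {{dispatch}} method are all hypothetical names invented for this example, and the use of CRC32 here is an assumption made only to have a concrete digest to compare.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

public class CorruptHintsSketch {
    /** Hypothetical policy names; these are not Cassandra configuration values. */
    enum CorruptionPolicy { LOG_AND_SKIP, DIE }

    /** Hypothetical stored record: a payload plus the checksum written with it. */
    static final class StoredHint {
        final byte[] payload;
        final long storedCrc;

        StoredHint(byte[] payload, long storedCrc) {
            this.payload = payload;
            this.storedCrc = storedCrc;
        }
    }

    /** Computes the CRC32 of a payload (assumed digest for this sketch). */
    static long crcOf(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    /** Throws if the recomputed checksum differs from the stored one. */
    static void verify(StoredHint hint) throws IOException {
        if (crcOf(hint.payload) != hint.storedCrc)
            throw new IOException("Digest mismatch exception");
    }

    /**
     * Returns the payloads that passed verification. A corrupted record is
     * either logged and skipped, or escalated, depending on the policy.
     */
    static List<String> dispatch(List<StoredHint> hints, CorruptionPolicy policy) {
        List<String> delivered = new ArrayList<>();
        for (StoredHint hint : hints) {
            try {
                verify(hint);
                delivered.add(new String(hint.payload, StandardCharsets.UTF_8));
            } catch (IOException e) {
                if (policy == CorruptionPolicy.DIE)
                    throw new IllegalStateException("Corrupted record, stopping", e);
                // Log and continue instead of taking the whole process down.
                System.err.println("Skipping corrupted record: " + e.getMessage());
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        byte[] good = "hint-1".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "hint-2".getBytes(StandardCharsets.UTF_8);

        List<StoredHint> hints = new ArrayList<>();
        hints.add(new StoredHint(good, crcOf(good)));
        hints.add(new StoredHint(bad, crcOf(bad) + 1)); // deliberately wrong checksum

        System.out.println(dispatch(hints, CorruptionPolicy.LOG_AND_SKIP));
    }
}
```

With {{LOG_AND_SKIP}} the corrupted record is reported on stderr and the remaining record is still delivered; with {{DIE}} the same input aborts the dispatch, which is closer to what the log above shows.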

> Digest mismatch Exception when sending raw hints in cluster
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-14375
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14375
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hints
>         Environment: CentOS 7.3
>            Reporter: Vineet Ghatge
>            Priority: Major
>
> We have a 14-node cluster where we have seen hints files getting corrupted, 
> resulting in the following error:
> ERROR [HintsDispatcher:1] 2018-04-06 16:26:44,423 CassandraDaemon.java:228 - 
> Exception in thread Thread[HintsDispatcher:1,1,main]
>  org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
> exception
>  at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:298)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:263)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:169)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:128)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:113) 
> ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:94) 
> ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:278)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:260)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:238)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:217)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_141]
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_141]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[na:1.8.0_141]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_141]
>  at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>  [apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_141]
>  Caused by: java.io.IOException: Digest mismatch exception
>  at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:315)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:289)
>  ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
>  ... 16 common frames omitted
> Notes on the cluster and the investigation done so far:
>  1. The Cassandra used here is built locally from the 3.11.1 branch along with 
> the following patch from issue CASSANDRA-14080:
>  
> [https://github.com/apache/cassandra/commit/68079e4b2ed4e58dbede70af45414b3d4214e195]
>  2. The bootstrap of the 14 nodes happens in the following way:
>  - Out of the 14 nodes, only 3 are picked as seed nodes.
>  - Only 1 of the 3 seed nodes is started, and the schema is created if it was 
> not created previously.
>  - After this, the rest of the nodes are bootstrapped.
>  - In the failure scenario, only 5 of the 14 nodes successfully formed the 
> Cassandra cluster. The failed nodes include two seed nodes.
>  3. We confirmed that the patch from issue CASSANDRA-13696 has been 
> applied. Jay Zhuang confirmed that this is a different issue from 
> what was previously fixed:
>  "this should be a different issue, as HintsDispatcher.java:128 sends hints 
> with {{buffer}}s, this patch is only to fix the digest mismatch for 
> HintsDispatcher.java:129, which sends hints one by one."
>  4. The application uses the Java driver with a quorum setting for Cassandra.
>  5. We saw this issue on a 7-node cluster too (different from the 14-node 
> cluster).
>  6. We are able to work around it by running nodetool truncatehints on the 
> failed nodes and restarting Cassandra.


