[jira] [Commented] (CASSANDRA-10233) IndexOutOfBoundsException in HintedHandOffManager

Paulo Motta (JIRA) Mon, 05 Oct 2015 14:33:13 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944090#comment-14944090
 ]


Paulo Motta commented on CASSANDRA-10233:
-----------------------------------------

[~eitikimura] I'd prefer not to add the try-catch block to 
{{HintedHandOffManager.scheduleAllDeliveries()}}: in the general case stored 
hints shouldn't be corrupted, and the catch could cause hints to be silently 
dropped, which could lead to more serious issues. Since we already know the 
issue was caused in {{StorageProxy.writeHintForMutation}}, I think it suffices 
to perform the check there. If someone hits this bug before it's fixed, the 
workaround should be to truncate hints and run repair.
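
For context, the trace in the issue description ends in an absolute 
{{ByteBuffer.getLong}} call. Below is a minimal sketch of that failure mode, 
assuming the stored hint key holds fewer than the 16 bytes a UUID needs (the 
thread does not say exactly how the key was malformed). {{uuidFromKey}} is a 
hypothetical stand-in for illustration, not Cassandra's {{UUIDGen.getUUID}}:

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class ShortHintKey {
    // Hypothetical stand-in for decoding a 16-byte hint key into a host ID.
    // Absolute getLong(index) reads 8 bytes starting at index and throws
    // IndexOutOfBoundsException when that range runs past the buffer limit.
    static UUID uuidFromKey(ByteBuffer raw) {
        return new UUID(raw.getLong(raw.position()),
                        raw.getLong(raw.position() + 8));
    }

    // Returns true when the key cannot be decoded because it is too short.
    static boolean failsToDecode(ByteBuffer raw) {
        try {
            uuidFromKey(raw);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

Under that assumption, {{failsToDecode(ByteBuffer.allocate(0))}} returns true, 
while a full 16-byte key decodes normally. A key like that sitting in the hints 
table would be consistent with the same error recurring on every scheduled run.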

I think what you did in {{StorageProxy.writeHintForMutation}} looks awesome, 
but the thrown exception might be silently ignored by the hints executor, so 
it's better to perform an explicit check, log a warning, and throw an 
AssertionError if {{hostId == null}}, so we'll be able to track it in the logs 
if it happens again. Could you please make these changes and re-submit the 
patch? Please check whether your patch applies to the cassandra-2.2 branch, and 
if it doesn't, please also submit a patch for 2.2. It should not be necessary 
to create a patch for 3.0, as the hints engine was rewritten from scratch.
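
Roughly what I have in mind, as a self-contained sketch rather than the actual 
patch. It assumes the failure case is a missing (null) host ID for the target 
endpoint; {{HintTargetCheck}}, {{hintKeyFor}}, and the log message are 
hypothetical names, and {{java.util.logging}} stands in for the project's 
logger:

```java
import java.nio.ByteBuffer;
import java.util.UUID;
import java.util.logging.Logger;

class HintTargetCheck {
    private static final Logger logger =
            Logger.getLogger(HintTargetCheck.class.getName());

    // Builds the 16-byte key a hint would be stored under.
    // hostId is passed in directly here; in the real code it would come
    // from a lookup that may return null for an unknown endpoint.
    static ByteBuffer hintKeyFor(String endpoint, UUID hostId) {
        if (hostId == null) {
            // Log first, so the event is visible even if the executor
            // swallows the error thrown below.
            logger.warning("No host ID for " + endpoint + ", not writing hint");
            throw new AssertionError("Missing host ID for " + endpoint);
        }
        ByteBuffer key = ByteBuffer.allocate(16);
        key.putLong(hostId.getMostSignificantBits());
        key.putLong(hostId.getLeastSignificantBits());
        key.flip(); // make the 16 written bytes readable from position 0
        return key;
    }
}
```

Because the throw is a plain {{if}} rather than an {{assert}} statement, the 
check still runs when the JVM is started without {{-ea}}.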

Thanks for that [~eitikimura]!

[~fhsgoncalves] yep, afaik assertions should be optional in production, but 
they should never fail in the first place. Probably this is being caused by 
some other issue I was not able to track down in the latest changes, but 
[~eitikimura]'s patch should help us troubleshoot if it happens in the future.

[~nutbunnies] [~mambocab] maybe it would be interesting to have a dtest job 
with assertions disabled, since we rely a lot on assertions for pre-condition 
checking, and many people disable them in production.
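
To illustrate why such a job could catch things, here is a small sketch (class 
and method names are hypothetical) contrasting a pre-condition written with the 
{{assert}} keyword against one written as an explicit check. The first is 
skipped entirely when the JVM runs without {{-ea}}:

```java
class AssertionModes {
    // True only when the JVM was started with assertions enabled (-ea):
    // the assignment inside the assert statement is skipped otherwise.
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true;
        return enabled;
    }

    // Pre-condition via the assert keyword: a no-op when assertions are off.
    static boolean assertRejectsNull(Object hostId) {
        try {
            assert hostId != null : "Missing host ID";
            return false;
        } catch (AssertionError e) {
            return true;
        }
    }

    // Pre-condition via an explicit check: runs regardless of JVM flags.
    static boolean explicitRejectsNull(Object hostId) {
        try {
            if (hostId == null) {
                throw new AssertionError("Missing host ID");
            }
            return false;
        } catch (AssertionError e) {
            return true;
        }
    }
}
```

With assertions disabled, {{assertRejectsNull(null)}} returns false and the 
null value flows onward, while {{explicitRejectsNull(null)}} still returns 
true.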

> IndexOutOfBoundsException in HintedHandOffManager
> -------------------------------------------------
>
>                 Key: CASSANDRA-10233
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10233
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.2.0
>            Reporter: Omri Iluz
>            Assignee: Paulo Motta
>         Attachments: cassandra-2.1.8-10233-v2.txt, 
> cassandra-2.1.8-10233-v3.txt
>
>
> After upgrading our cluster to 2.2.0, the following error started showing 
> exactly every 10 minutes on every server in the cluster:
> {noformat}
> INFO  [CompactionExecutor:1381] 2015-08-31 18:31:55,506 
> CompactionTask.java:142 - Compacting (8e7e1520-500e-11e5-b1e3-e95897ba4d20) 
> [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-540-big-Data.db:level=0,
>  ]
> INFO  [CompactionExecutor:1381] 2015-08-31 18:31:55,599 
> CompactionTask.java:224 - Compacted (8e7e1520-500e-11e5-b1e3-e95897ba4d20) 1 
> sstables to 
> [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-541-big,] 
> to level=0.  1,544,495 bytes to 1,544,495 (~100% of original) in 93ms = 
> 15.838121MB/s.  0 total partitions merged to 4.  Partition merge counts were 
> {1:4, }
> ERROR [HintedHandoff:1] 2015-08-31 18:31:55,600 CassandraDaemon.java:182 - 
> Exception in thread Thread[HintedHandoff:1,1,main]
> java.lang.IndexOutOfBoundsException: null
>       at java.nio.Buffer.checkIndex(Buffer.java:538) ~[na:1.7.0_79]
>       at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:410) 
> ~[na:1.7.0_79]
>       at org.apache.cassandra.utils.UUIDGen.getUUID(UUIDGen.java:106) 
> ~[apache-cassandra-2.2.0.jar:2.2.0]
>       at 
> org.apache.cassandra.db.HintedHandOffManager.scheduleAllDeliveries(HintedHandOffManager.java:515)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>       at 
> org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:88)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>       at 
> org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:168)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>       at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> [na:1.7.0_79]
>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) 
> [na:1.7.0_79]
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>  [na:1.7.0_79]
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  [na:1.7.0_79]
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_79]
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_79]
>       at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
