[ 
https://issues.apache.org/jira/browse/CASSANDRA-19427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820096#comment-17820096
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19427 at 2/23/24 3:42 PM:
------------------------------------------------------------------------

Would be great if we covered all branches this occurs in. It seems to me like 
the change will be rather small so merging it from 3.11 (at least) up to trunk 
should not be a lot of work. 


was (Author: smiklosovic):
Would be great if we covered all branches this occurs in. It seems to me like 
the change will rather small so merging it from 3.11 (at least) up to trunk 
should not be a lot of work. 

> Fix concurrent access of ClientWarn causing AIOBE for SELECT WHERE IN queries 
> with multiple coordinator-local partitions
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19427
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19427
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination, Legacy/Local Write-Read Paths
>            Reporter: Abe Ratnofsky
>            Assignee: Abe Ratnofsky
>            Priority: Normal
>             Fix For: 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> On one of our clusters, we noticed rare but periodic 
> ArrayIndexOutOfBoundsExceptions:
>  
> {code:java}
> message="Uncaught exception on thread Thread[ReadStage-3,5,main]"
> exception="java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException
> at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2579)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.ArrayIndexOutOfBoundsException"{code}
>  
>  
> The error was in a Runnable, so the stacktrace didn't directly indicate where 
> the error was coming from. We enabled JFR to log the underlying exception 
> that was thrown:
>  
> {code:java}
> message="Uncaught exception on thread Thread[ReadStage-2,5,main]" 
> exception="java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 0
> at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2579)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 0
> at java.base/java.util.ArrayList.add(ArrayList.java:487)
> at java.base/java.util.ArrayList.add(ArrayList.java:499)
> at org.apache.cassandra.service.ClientWarn$State.add(ClientWarn.java:84)
> at org.apache.cassandra.service.ClientWarn$State.access$000(ClientWarn.java:77)
> at org.apache.cassandra.service.ClientWarn.warn(ClientWarn.java:51)
> at org.apache.cassandra.db.ReadCommand$1MetricRecording.onClose(ReadCommand.java:596)
> at org.apache.cassandra.db.transform.BasePartitions.runOnClose(BasePartitions.java:70)
> at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:95)
> at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:2260)
> at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2575)
> ... 6 more"{code}
>  
>  
> An AIOBE on ArrayList.add(E) should only be possible when multiple threads 
> attempt to call the method at the same time.
>  
> This was seen while executing a SELECT WHERE IN query with multiple partition 
> keys. This exception could happen when multiple local reads are dispatched by 
> the coordinator in 
> org.apache.cassandra.service.reads.AbstractReadExecutor#makeRequests. In this 
> case, multiple local reads exceed the tombstone warning threshold, so 
> multiple tombstone warnings are added to the same ClientWarn.State reference. 
> Currently, org.apache.cassandra.service.ClientWarn.State#warnings is an 
> ArrayList, which isn't safe for concurrent modification, causing the AIOBE to 
> be thrown.
>  
> I have a patch available for this, and I'm preparing it now. The patch is 
> simple - it just changes 
> org.apache.cassandra.service.ClientWarn.State#warnings to a thread-safe 
> CopyOnWriteArrayList. I also have a jvm-dtest that demonstrates the issue but 
> doesn't need to be merged - it shows how a SELECT WHERE IN query with local 
> reads that add client warnings can add to the same ClientWarn.State from 
> different threads. I'll push that in a separate branch just for demonstration 
> purposes.
>  
> This appears to have been an issue since at least 3.11, which was the 
> earliest release I checked.
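
To illustrate the fix described in the issue, here is a minimal, standalone
sketch of several threads appending warnings to one shared state object backed
by a CopyOnWriteArrayList. It is not the Cassandra code or the actual patch:
the class WarningStateSketch, its nested State class, and the method
addConcurrently are hypothetical names made up for this example, and the
thread pool only stands in for the ReadStage workers running local reads.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class WarningStateSketch {

    // Hypothetical stand-in for ClientWarn.State; not the real class.
    static final class State {
        // CopyOnWriteArrayList makes each add() safe under concurrent callers.
        // A plain ArrayList here can lose elements or throw
        // ArrayIndexOutOfBoundsException when add() calls race.
        private final List<String> warnings = new CopyOnWriteArrayList<>();

        void add(String warning) {
            warnings.add(warning);
        }

        int size() {
            return warnings.size();
        }
    }

    // Runs `threads` workers that each add `perThread` warnings to one shared
    // State, then returns how many warnings were recorded.
    static int addConcurrently(int threads, int perThread) throws InterruptedException {
        State state = new State();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);

        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                try {
                    // Hold all workers until released so their adds overlap.
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < perThread; i++) {
                    state.add("tombstone warning " + id + "/" + i);
                }
            });
        }

        start.countDown();
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return state.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(addConcurrently(4, 1000));
    }
}
```

CopyOnWriteArrayList copies its backing array on every write, so it suits
lists that stay small and are written rarely. That seems a reasonable fit for
per-request client warnings, though the sketch does not measure it.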


