[ https://issues.apache.org/jira/browse/CASSANDRA-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765287#comment-15765287 ]

Randy Fradin commented on CASSANDRA-12965:
------------------------------------------

Understood on not fixing this in 2.1 - it will still be nice to see it fixed by 
the time we upgrade. Here's the info you asked for:

- This happened more than once. We had a data center's worth of nodes down for 
a long period of time (longer than the hinted handoff window) before this 
happened, so I am assuming that caused more ranges than usual to be out of sync 
before this repair run. The tables were not particularly big (a few GB total at 
most), so it could not have been a large volume of data that needed to be 
synced, but it nevertheless resulted in thousands of SSTables being created on 
the nodes that had been down, for a set of tables that normally have ~20 
SSTables. After killing the repair, running it again would yield the same 
result. We avoided running repair on those particular tables until we could 
figure out what to do. The large number of SSTables caused its own problems, 
which we worked around, but separate from that we had this CPU problem 
resulting from all the streaming sessions that created the SSTables.
- We run full (non-incremental) repair with the -pr and -par options; each run 
is always for a specific table (example invocation after this list).
- We have around 400 tables in this cluster with varying RFs, but the RF for 
the tables that were causing the issue is 3 per data center across 4 data 
centers. There are 24 nodes total in the cluster and each node has 256 vnodes.
- Yes, we have our own repair coordinator that is currently configured to run 
up to 8 repairs at the same time across the cluster.
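
For reference, a repair invocation of the kind described above looks roughly 
like this (the keyspace and table names are placeholders, not our actual schema):

{noformat}
# Full, parallel, primary-range repair of a single table (2.1-era nodetool syntax)
nodetool repair -pr -par my_keyspace my_table
{noformat}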

> StreamReceiveTask causing high CPU utilization during repair
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-12965
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12965
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Randy Fradin
>
> During a full repair run, I observed one node in my cluster using 100% CPU 
> (100% of all cores on a 48-core machine). When I took a stack trace, I found 
> exactly 48 running StreamReceiveTask threads, each in the same block of code 
> in StreamReceiveTask.OnCompletionRunnable:
> {noformat}
> "StreamReceiveTask:8077" #1511134 daemon prio=5 os_prio=0 
> tid=0x00007f01520a8800 nid=0x6e77 runnable [0x00007f020dfae000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.ComparableTimSort.binarySort(ComparableTimSort.java:258)
>         at java.util.ComparableTimSort.sort(ComparableTimSort.java:203)
>         at java.util.Arrays.sort(Arrays.java:1312)
>         at java.util.Arrays.sort(Arrays.java:1506)
>         at java.util.ArrayList.sort(ArrayList.java:1454)
>         at java.util.Collections.sort(Collections.java:141)
>         at 
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:257)
>         at 
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>         at 
> org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
>         at 
> org.apache.cassandra.db.DataTracker$SSTableIntervalTree.<init>(DataTracker.java:590)
>         at 
> org.apache.cassandra.db.DataTracker$SSTableIntervalTree.<init>(DataTracker.java:584)
>         at 
> org.apache.cassandra.db.DataTracker.buildIntervalTree(DataTracker.java:565)
>         at 
> org.apache.cassandra.db.DataTracker$View.replace(DataTracker.java:761)
>         at 
> org.apache.cassandra.db.DataTracker.addSSTablesToTracker(DataTracker.java:428)
>         at 
> org.apache.cassandra.db.DataTracker.addSSTables(DataTracker.java:283)
>         at 
> org.apache.cassandra.db.ColumnFamilyStore.addSSTables(ColumnFamilyStore.java:1422)
>         at 
> org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:148)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> All 48 threads were in ColumnFamilyStore.addSSTables(), and specifically in 
> the IntervalNode constructor called from the IntervalTree constructor.
> It stayed this way for maybe an hour before we restarted the node. The repair 
> was also generating thousands (20,000+) of tiny SSTables in a table that 
> previously had just 20.
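> A toy model of why this gets expensive (not Cassandra code; the batch sizes 
> are made-up numbers): every completed stream task triggers a full rebuild of 
> the SSTable interval tree, i.e. a re-sort over all live SSTables, so the work 
> grows superlinearly as the SSTable count climbs into the thousands.
> {noformat}
> import java.util.ArrayList;
> import java.util.Collections;
> import java.util.List;
> 
> public class RebuildCostSketch
> {
>     public static void main(String[] args)
>     {
>         int batches = 1000;        // hypothetical number of completed stream tasks
>         int tablesPerBatch = 20;   // hypothetical SSTables added per task
>         List<Integer> live = new ArrayList<>();
>         long comparisons = 0;      // rough proxy: n*log2(n) per rebuild
> 
>         for (int b = 0; b < batches; b++)
>         {
>             for (int t = 0; t < tablesPerBatch; t++)
>                 live.add(b * tablesPerBatch + t);
>             Collections.sort(live); // stands in for the interval tree rebuild
>             comparisons += (long) (live.size() * (Math.log(live.size()) / Math.log(2)));
>         }
>         System.out.printf("%d SSTables, ~%d comparisons across %d rebuilds%n",
>                           live.size(), comparisons, batches);
>     }
> }
> {noformat}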
> I don't know enough about SSTables and ColumnFamilyStore to know whether all 
> this CPU work is necessary or a bug, but I did notice that these tasks are run 
> on a thread pool constructed in StreamReceiveTask.java, so perhaps this pool 
> should have a maximum thread count lower than the number of processors on the 
> machine, at least for machines with many processors. Any reason not to do 
> that? Any ideas for a reasonable number or formula to cap the thread count?
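> A rough sketch of the kind of cap being suggested (the class name, constant, 
> and divisor here are hypothetical, not the actual pool construction in 
> StreamReceiveTask.java):
> {noformat}
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> 
> public class BoundedStreamReceivePool
> {
>     // Hypothetical cap: a fraction of the available cores, never more than 8.
>     private static final int MAX_THREADS =
>         Math.max(1, Math.min(Runtime.getRuntime().availableProcessors() / 4, 8));
> 
>     // Stand-in for the completion-task pool; the real code may use a different
>     // executor type, name, and sizing policy.
>     public static final ExecutorService EXECUTOR =
>         Executors.newFixedThreadPool(MAX_THREADS);
> }
> {noformat}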
> Some additional info: We have never run incremental repair on this cluster, 
> so that is not a factor. All our tables use LCS. Unfortunately I don't have 
> the log files from the period saved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
