Of course QA decided to start a test batch (still relatively low traffic), so I hope it doesn't throw the tpstats off too much.
Node 1:

Pool Name                Active  Pending  Completed  Blocked  All time blocked
MutationStage                 0        0   13804928        0                 0
ReadStage                     0        0      10975        0                 0
RequestResponseStage          0        0    7725378        0                 0
ReadRepairStage               0        0       1247        0                 0
ReplicateOnWriteStage         0        0          0        0                 0
MiscStage                     0        0          0        0                 0
HintedHandoff                 1        1         50        0                 0
FlushWriter                   0        0        306        0                31
MemoryMeter                   0        0        719        0                 0
GossipStage                   0        0     286505        0                 0
CacheCleanupExecutor          0        0          0        0                 0
InternalResponseStage         0        0          0        0                 0
CompactionExecutor            4       14        159        0                 0
ValidationExecutor            0        0          0        0                 0
MigrationStage                0        0          0        0                 0
commitlog_archiver            0        0          0        0                 0
AntiEntropyStage              0        0          0        0                 0
PendingRangeCalculator        0        0         11        0                 0
MemtablePostFlusher           0        0       1781        0                 0

Message type      Dropped
READ                    0
RANGE_SLICE             0
_TRACE                  0
MUTATION           391041
COUNTER_MUTATION        0
BINARY                  0
REQUEST_RESPONSE        0
PAGED_RANGE             0
READ_REPAIR             0

Node 2:

Pool Name                Active  Pending  Completed  Blocked  All time blocked
MutationStage                 0        0     997042        0                 0
ReadStage                     0        0       2623        0                 0
RequestResponseStage          0        0     706650        0                 0
ReadRepairStage               0        0        275        0                 0
ReplicateOnWriteStage         0        0          0        0                 0
MiscStage                     0        0          0        0                 0
HintedHandoff                 2        2         12        0                 0
FlushWriter                   0        0         37        0                 4
MemoryMeter                   0        0         70        0                 0
GossipStage                   0        0      14927        0                 0
CacheCleanupExecutor          0        0          0        0                 0
InternalResponseStage         0        0          0        0                 0
CompactionExecutor            4        7         94        0                 0
ValidationExecutor            0        0          0        0                 0
MigrationStage                0        0          0        0                 0
commitlog_archiver            0        0          0        0                 0
AntiEntropyStage              0        0          0        0                 0
PendingRangeCalculator        0        0          3        0                 0
MemtablePostFlusher           0        0        114        0                 0

Message type      Dropped
READ                    0
RANGE_SLICE             0
_TRACE                  0
MUTATION                0
COUNTER_MUTATION        0
BINARY                  0
REQUEST_RESPONSE        0
PAGED_RANGE             0
READ_REPAIR             0

Node 3:

Pool Name                Active  Pending  Completed  Blocked  All time blocked
MutationStage                 0        0    1539324        0                 0
ReadStage                     0        0       2571        0                 0
RequestResponseStage          0        0     373300        0                 0
ReadRepairStage               0        0        325        0                 0
ReplicateOnWriteStage         0        0          0        0                 0
MiscStage                     0        0          0        0                 0
HintedHandoff                 1        1         21        0                 0
FlushWriter                   0        0         38        0                 5
MemoryMeter                   0        0         59        0                 0
GossipStage                   0        0      21491        0                 0
CacheCleanupExecutor          0        0          0        0                 0
InternalResponseStage         0        0          0        0                 0
CompactionExecutor            4        9         85        0                 0
ValidationExecutor            0        0          0        0                 0
MigrationStage                0        0          0        0                 0
commitlog_archiver            0        0          0        0                 0
AntiEntropyStage              0        0          0        0                 0
PendingRangeCalculator        0        0          6        0                 0
MemtablePostFlusher           0        0        164        0                 0

Message type      Dropped
READ                    0
RANGE_SLICE             0
_TRACE                  0
MUTATION           205259
COUNTER_MUTATION        0
BINARY                  0
REQUEST_RESPONSE        0
PAGED_RANGE             0
READ_REPAIR            18

Compaction seems like the only thing consistently active and pending.

On Tue, Dec 16, 2014 at 2:18 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
> Ok, based on those numbers I have a theory.
>
> Can you show me nodetool tpstats for all 3 nodes?
>
> On Tue, Dec 16, 2014 at 4:04 PM, Arne Claassen <a...@emotient.com> wrote:
>> No problem with the follow-up questions. I'm on a crash course here trying
>> to understand what makes C* tick, so I appreciate all feedback.
>>
>> We reprocessed all media (1200 partition keys) last night, where partition
>> keys had somewhere between 4k and 200k "rows". After that completed, no
>> traffic went to the cluster at all for ~8 hours, and throughout today we
>> may get a couple (fewer than 10) queries per second and maybe 3-4 write
>> batches per hour.
>>
>> I assume the last value in the partition size histogram is the largest row:
>>
>> 20924300 bytes: 79
>> 25109160 bytes: 57
>>
>> The majority seems clustered around 200000 bytes.
>>
>> I will look at switching my inserts to unlogged batches since they are
>> always for one partition key.
>>
>> On Tue, Dec 16, 2014 at 1:47 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>> Can you define what "virtually no traffic" means? Sorry to be repetitive
>>> about that, but I've worked on a lot of clusters in the past year and
>>> people have wildly different ideas of what that means.
>>>
>>> Unlogged batches on the same partition key are definitely a performance
>>> optimization. Typically async is much faster and easier on the cluster
>>> when you're using multi-partition-key batches.
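Since the interesting signal here is the Dropped section repeated across three nodes, a small parser can make the comparison mechanical instead of eyeballing the full tpstats dump. This is a hypothetical helper (not part of nodetool), a minimal sketch assuming the 2.0-era `nodetool tpstats` output format:

```python
import re

def dropped_counts(tpstats_output):
    """Parse the 'Message type / Dropped' section of `nodetool tpstats`
    output and return {message type: dropped count}."""
    counts = {}
    in_dropped = False
    for line in tpstats_output.splitlines():
        # Everything after the 'Message type' header is the dropped table.
        if line.startswith("Message type"):
            in_dropped = True
            continue
        if in_dropped:
            m = re.match(r"(\w+)\s+(\d+)$", line.strip())
            if m:
                counts[m.group(1)] = int(m.group(2))
    return counts

sample = """Pool Name  Active  Pending  Completed
MutationStage  0  0  13804928
Message type      Dropped
READ                    0
MUTATION           391041
"""
print(dropped_counts(sample))  # {'READ': 0, 'MUTATION': 391041}
```

Run against the three outputs above, it would surface the asymmetry directly: nodes 1 and 3 dropped mutations (391041 and 205259), node 2 dropped none.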
>>>
>>> nodetool cfhistograms <keyspace> <tablename>
>>>
>>> On Tue, Dec 16, 2014 at 3:42 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> Actually, not sure why the machine was originally configured at 6GB,
>>>> since we even started it on an r3.large with 15GB.
>>>>
>>>> Re: Batches
>>>>
>>>> Not using batches. I actually have that as a separate question on the
>>>> list. Currently I fan out async single inserts, and I'm wondering if
>>>> batches are better since my data is inherently inserted in blocks of
>>>> ordered rows for a single partition key.
>>>>
>>>> Re: Traffic
>>>>
>>>> There isn't all that much traffic. Inserts come in as blocks per
>>>> partition key, but there can be 5k-200k rows for that partition key.
>>>> Each of these rows is less than 100k. It's small, lots of ordered rows:
>>>> frame and sub-frame information for media, and the rows for one piece
>>>> of media (the partition key) are inserted at once.
>>>>
>>>> For the last 12 hours, where the load on all these machines has been
>>>> stuck, there's been virtually no traffic at all. The nodes are
>>>> basically sitting idle, except that they each have a load of 4.
>>>>
>>>> BTW, how do you determine the widest row, or for that matter the number
>>>> of tombstones in a row?
>>>>
>>>> thanks,
>>>> arne
>>>>
>>>> On Tue, Dec 16, 2014 at 1:24 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>>> So 1024 is still a good 2.5 times what I'm suggesting, and 6GB is
>>>>> hardly enough to run Cassandra well, especially if you're going full
>>>>> bore on loads. However, you may just flat out be CPU bound on your
>>>>> write throughput. How many TPS and what size writes do you have? Also,
>>>>> what is your widest row?
>>>>>
>>>>> Final question: what is your compaction throughput set at?
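Ryan's caveat, that unlogged batches are an optimization only when every statement targets the same partition key, can be sketched independently of any driver. The helper below is hypothetical (the names and the 100-row cap are my own, not from the thread); the actual writes would go through your driver's unlogged batch type:

```python
from itertools import groupby

def partition_batches(rows, key=lambda r: r[0], max_batch=100):
    """Group consecutive rows by partition key, capping batch size so a
    5k-200k row partition doesn't become one enormous batch statement."""
    for pk, group in groupby(rows, key=key):
        group = list(group)
        for i in range(0, len(group), max_batch):
            yield pk, group[i:i + max_batch]

rows = [("media1", 1), ("media1", 2), ("media1", 3), ("media2", 1)]
for pk, batch in partition_batches(rows, max_batch=2):
    # Each `batch` here would become one unlogged batch bound to
    # partition key `pk`; fan those out async as before.
    print(pk, batch)
```

The point of the cap is that "all rows for one media at once" and "one batch" need not be the same thing: many small single-partition unlogged batches, sent asynchronously, keep the coordinator from buffering hundreds of thousands of mutations at a time.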
>>>>>
>>>>> On Tue, Dec 16, 2014 at 3:20 PM, Arne Claassen <a...@emotient.com> wrote:
>>>>>> The starting configuration I had, which is still running on two of
>>>>>> the nodes, was a 6GB heap with 1024MB ParNew, which is close to what
>>>>>> you are suggesting, and those have been pegged at load 4 for over 12
>>>>>> hours with hardly any read or write traffic. I will set one to
>>>>>> 8GB/400MB and see if its load changes.
>>>>>>
>>>>>> On Tue, Dec 16, 2014 at 1:12 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>>>>> A heap of that size without some tuning will create a number of
>>>>>>> problems (high CPU usage one of them). I suggest either an 8GB heap
>>>>>>> and 400MB ParNew (which I'd only set that low for that low a CPU
>>>>>>> count), or attempt the tunings indicated in
>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-8150
>>>>>>>
>>>>>>> On Tue, Dec 16, 2014 at 3:06 PM, Arne Claassen <a...@emotient.com> wrote:
>>>>>>>> Changed the 15GB node to a 25GB heap and the nice CPU is down to
>>>>>>>> ~20% now. Checked my dev cluster to see if the ParNew log entries
>>>>>>>> are just par for the course, but I'm not seeing them there.
>>>>>>>> However, both have the following every 30 seconds:
>>>>>>>>
>>>>>>>> DEBUG [BatchlogTasks:1] 2014-12-16 21:00:44,898 BatchlogManager.java (line 165) Started replayAllFailedBatches
>>>>>>>> DEBUG [MemtablePostFlusher:1] 2014-12-16 21:00:44,899 ColumnFamilyStore.java (line 866) forceFlush requested but everything is clean in batchlog
>>>>>>>> DEBUG [BatchlogTasks:1] 2014-12-16 21:00:44,899 BatchlogManager.java (line 200) Finished replayAllFailedBatches
>>>>>>>>
>>>>>>>> Is that just routine scheduled housekeeping or a sign of something
>>>>>>>> else?
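For reference, the heap numbers Ryan suggests are normally set in conf/cassandra-env.sh. This is just his suggested values written out as a config fragment, not a verified tuning for this workload:

```shell
# conf/cassandra-env.sh -- uncomment/override the auto-calculated sizes.
# Values are the 8GB/400MB combination suggested in this thread for a
# low-core-count node; adjust for your hardware.
MAX_HEAP_SIZE="8G"      # total JVM heap
HEAP_NEWSIZE="400M"     # ParNew young generation
```

Both variables must be set together; Cassandra's startup script errors out if only one is overridden.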
>>>>>>>>
>>>>>>>> On Tue, Dec 16, 2014 at 12:52 PM, Arne Claassen <a...@emotient.com> wrote:
>>>>>>>>> Sorry, I meant a 15GB heap on the one machine that has less nice
>>>>>>>>> CPU% now. The others are 6GB.
>>>>>>>>>
>>>>>>>>> On Tue, Dec 16, 2014 at 12:50 PM, Arne Claassen <a...@emotient.com> wrote:
>>>>>>>>>> AWS r3.xlarge, 30GB, but only using a heap of 10GB, new gen 2GB,
>>>>>>>>>> because we might go c3.2xlarge instead if CPU is more important
>>>>>>>>>> than RAM. Storage is EBS-optimized SSD (but iostat shows no real
>>>>>>>>>> I/O going on). Each node only has about 10GB of data, with
>>>>>>>>>> ownership of 67%, 64.7% & 68.3%.
>>>>>>>>>>
>>>>>>>>>> On the node where I set the heap to 10GB from 6GB, utilization
>>>>>>>>>> has dropped to 46% nice now, but the ParNew log messages still
>>>>>>>>>> continue at the same pace. I'm gonna up the heap to 20GB for a
>>>>>>>>>> bit and see if that brings the nice CPU further down.
>>>>>>>>>>
>>>>>>>>>> No TombstoneOverflowingExceptions.
>>>>>>>>>>
>>>>>>>>>> On Tue, Dec 16, 2014 at 11:50 AM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>>>>>>>>> What's CPU, RAM, storage layer, and data density per node? Exact
>>>>>>>>>>> heap settings would be nice. In the logs, look for
>>>>>>>>>>> TombstoneOverflowingException.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Dec 16, 2014 at 1:36 PM, Arne Claassen <a...@emotient.com> wrote:
>>>>>>>>>>>> I'm running 2.0.10.
>>>>>>>>>>>>
>>>>>>>>>>>> The data is all time-series data, and as we change our pipeline
>>>>>>>>>>>> we've been periodically reprocessing the data sources, which
>>>>>>>>>>>> causes each time series to be overwritten, i.e. every row per
>>>>>>>>>>>> partition key is deleted and re-written, so I assume I've been
>>>>>>>>>>>> collecting a bunch of tombstones.
>>>>>>>>>>>>
>>>>>>>>>>>> Also, the ever-present and never-completing compaction tasks I
>>>>>>>>>>>> assumed were an artifact of tombstoning, but I fully admit
>>>>>>>>>>>> that's conjecture based on the ~20 blog posts and Stack
>>>>>>>>>>>> Overflow questions I've surveyed.
>>>>>>>>>>>>
>>>>>>>>>>>> I doubled the heap on one node and it changed nothing regarding
>>>>>>>>>>>> the load or the ParNew log statements. New generation usage is
>>>>>>>>>>>> 50%; Eden itself is 56%.
>>>>>>>>>>>>
>>>>>>>>>>>> Anything else I should look at and report, let me know.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Dec 16, 2014 at 11:14 AM, Jonathan Lacefield <jlacefi...@datastax.com> wrote:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> What version of Cassandra are you running?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If it's 2.0, we recently experienced something similar with
>>>>>>>>>>>>> 8447 [1], which 8485 [2] should hopefully resolve.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please note that 8447 is not related to tombstones. Tombstone
>>>>>>>>>>>>> processing can put a lot of pressure on the heap as well. Why
>>>>>>>>>>>>> do you think you have a lot of tombstones in that one
>>>>>>>>>>>>> particular table?
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/CASSANDRA-8447
>>>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/CASSANDRA-8485
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jonathan
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jonathan Lacefield
>>>>>>>>>>>>> Solution Architect | (404) 822 3487 | jlacefi...@datastax.com
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Dec 16, 2014 at 2:04 PM, Arne Claassen <a...@emotient.com> wrote:
>>>>>>>>>>>>>> I have a three-node cluster that has been sitting at a load
>>>>>>>>>>>>>> of 4 (on each node) and 100% CPU utilization (although 92%
>>>>>>>>>>>>>> nice) for the last 12 hours, ever since some significant
>>>>>>>>>>>>>> writes finished. I'm trying to determine what tuning I should
>>>>>>>>>>>>>> be doing to get it out of this state.
>>>>>>>>>>>>>> The debug log is just an endless series of:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> DEBUG [ScheduledTasks:1] 2014-12-16 19:03:35,042 GCInspector.java (line 118) GC for ParNew: 166 ms for 10 collections, 4400928736 used; max is 8000634880
>>>>>>>>>>>>>> DEBUG [ScheduledTasks:1] 2014-12-16 19:03:36,043 GCInspector.java (line 118) GC for ParNew: 165 ms for 10 collections, 4440011176 used; max is 8000634880
>>>>>>>>>>>>>> DEBUG [ScheduledTasks:1] 2014-12-16 19:03:37,043 GCInspector.java (line 118) GC for ParNew: 135 ms for 8 collections, 4402220568 used; max is 8000634880
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> iostat shows virtually no I/O.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Compaction may enter into this, but I don't really know what
>>>>>>>>>>>>>> to make of the compaction stats, since they never change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [root@cassandra-37919c3a ~]# nodetool compactionstats
>>>>>>>>>>>>>> pending tasks: 10
>>>>>>>>>>>>>>   compaction type   keyspace   table              completed    total        unit   progress
>>>>>>>>>>>>>>   Compaction        media      media_tracks_raw   271651482    563615497    bytes  48.20%
>>>>>>>>>>>>>>   Compaction        media      media_tracks_raw   30308910     21676695677  bytes  0.14%
>>>>>>>>>>>>>>   Compaction        media      media_tracks_raw   1198384080   1815603161   bytes  66.00%
>>>>>>>>>>>>>> Active compaction remaining time: 0h22m24s
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 5 minutes later:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [root@cassandra-37919c3a ~]# nodetool compactionstats
>>>>>>>>>>>>>> pending tasks: 9
>>>>>>>>>>>>>>   compaction type   keyspace   table              completed    total        unit   progress
>>>>>>>>>>>>>>   Compaction        media      media_tracks_raw   271651482    563615497    bytes  48.20%
>>>>>>>>>>>>>>   Compaction        media      media_tracks_raw   30308910     21676695677  bytes  0.14%
>>>>>>>>>>>>>>   Compaction        media      media_tracks_raw   1198384080   1815603161   bytes  66.00%
>>>>>>>>>>>>>> Active compaction remaining time: 0h22m24s
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sure, the pending tasks went down by one, but the rest is
>>>>>>>>>>>>>> identical. media_tracks_raw likely has a bunch of tombstones
>>>>>>>>>>>>>> (can't figure out how to get stats on that).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is this behavior something that indicates that I need more
>>>>>>>>>>>>>> heap, or a larger new generation? Should I be manually
>>>>>>>>>>>>>> running compaction on tables with lots of tombstones?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any suggestions or places to educate myself better on
>>>>>>>>>>>>>> performance tuning would be appreciated.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arne
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Ryan Svihla
>>>>>>>>>>> Solution Architect
>>>>>>>>>>>
>>>>>>>>>>> DataStax is the fastest, most scalable distributed database
>>>>>>>>>>> technology, delivering Apache Cassandra to the world's most
>>>>>>>>>>> innovative enterprises. DataStax is built to be agile,
>>>>>>>>>>> always-on, and predictably scalable to any size. With more than
>>>>>>>>>>> 500 customers in 45 countries, DataStax is the database
>>>>>>>>>>> technology and transactional backbone of choice for the world's
>>>>>>>>>>> most innovative companies such as Netflix, Adobe, Intuit, and
>>>>>>>>>>> eBay.
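As a quick sanity check on those GCInspector lines, the pause figures can be turned into a GC-overhead fraction. This parser is a hypothetical sketch matching the log format quoted above, and it assumes GCInspector reports roughly once per second (as the timestamps in the thread suggest):

```python
import re

LINE = re.compile(
    r"GC for ParNew: (\d+) ms for (\d+) collections, (\d+) used; max is (\d+)")

def gc_overhead(log_lines, interval_ms=1000):
    """Sum ParNew pause time across GCInspector lines and return the
    fraction of wall time spent paused, assuming one line per interval_ms."""
    total_pause = 0
    samples = 0
    for line in log_lines:
        m = LINE.search(line)
        if m:
            total_pause += int(m.group(1))  # pause milliseconds
            samples += 1
    return total_pause / (samples * interval_ms) if samples else 0.0

logs = [
    "GC for ParNew: 166 ms for 10 collections, 4400928736 used; max is 8000634880",
    "GC for ParNew: 165 ms for 10 collections, 4440011176 used; max is 8000634880",
    "GC for ParNew: 135 ms for 8 collections, 4402220568 used; max is 8000634880",
]
print(round(gc_overhead(logs), 3))  # 0.155
```

On the three sample lines this works out to roughly 15% of wall time in ParNew pauses, which is high enough to be consistent with the elevated nice CPU discussed above, though it doesn't by itself explain the sustained load of 4.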