Hi, I spot-checked a couple of the files that were ~200MB and they mostly had "Repaired at: 0", so maybe that's not it?
-B

On Tue, Aug 7, 2018 at 8:16 PM <brian.spind...@gmail.com> wrote:

> Everything is ttl’d
>
> I suppose I could use sstablemeta to see the repaired bit, could I just
> set that to unrepaired somehow and that would fix?
>
> Thanks!
>
> On Aug 7, 2018, at 8:12 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>
> May be worth seeing if any of the sstables got promoted to repaired - if
> so they’re not eligible for compaction with unrepaired sstables and that
> could explain some higher counts
>
> Do you actually do deletes or is everything ttl’d?
>
>
> --
> Jeff Jirsa
>
>
> On Aug 7, 2018, at 5:09 PM, Brian Spindler <brian.spind...@gmail.com>
> wrote:
>
> Hi Jeff, mostly lots of little files, like there will be 4-5 that are
> 1-1.5gb or so and then many at 5-50MB and many at 40-50MB each.
>
> Re incremental repair; Yes one of my engineers started an incremental
> repair on this column family that we had to abort. In fact, the node that
> the repair was initiated on ran out of disk space and we ended replacing
> that node like a dead node.
>
> Oddly the new node is experiencing this issue as well.
>
> -B
>
>
> On Tue, Aug 7, 2018 at 8:04 PM Jeff Jirsa <jji...@gmail.com> wrote:
>
>> You could toggle off the tombstone compaction to see if that helps, but
>> that should be lower priority than normal compactions
>>
>> Are the lots-of-little-files from memtable flushes or
>> repair/anticompaction?
>>
>> Do you do normal deletes? Did you try to run Incremental repair?
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Aug 7, 2018, at 5:00 PM, Brian Spindler <brian.spind...@gmail.com>
>> wrote:
>>
>> Hi Jonathan, both I believe.
>>
>> The window size is 1 day, full settings:
>> AND compaction = {'timestamp_resolution': 'MILLISECONDS',
>> 'unchecked_tombstone_compaction': 'true', 'compaction_window_size': '1',
>> 'compaction_window_unit': 'DAYS', 'tombstone_compaction_interval': '86400',
>> 'tombstone_threshold': '0.2', 'class':
>> 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy'}
>>
>>
>> nodetool tpstats
>>
>> Pool Name                   Active   Pending      Completed   Blocked  All time blocked
>> MutationStage                    0         0    68582241832         0                 0
>> ReadStage                        0         0      209566303         0                 0
>> RequestResponseStage             0         0    44680860850         0                 0
>> ReadRepairStage                  0         0       24562722         0                 0
>> CounterMutationStage             0         0              0         0                 0
>> MiscStage                        0         0              0         0                 0
>> HintedHandoff                    1         1            203         0                 0
>> GossipStage                      0         0        8471784         0                 0
>> CacheCleanupExecutor             0         0            122         0                 0
>> InternalResponseStage            0         0         552125         0                 0
>> CommitLogArchiver                0         0              0         0                 0
>> CompactionExecutor               8        42        1433715         0                 0
>> ValidationExecutor               0         0           2521         0                 0
>> MigrationStage                   0         0         527549         0                 0
>> AntiEntropyStage                 0         0           7697         0                 0
>> PendingRangeCalculator           0         0             17         0                 0
>> Sampler                          0         0              0         0                 0
>> MemtableFlushWriter              0         0         116966         0                 0
>> MemtablePostFlush                0         0         209103         0                 0
>> MemtableReclaimMemory            0         0         116966         0                 0
>> Native-Transport-Requests        1         0     1715937778         0            176262
>>
>> Message type           Dropped
>> READ                         2
>> RANGE_SLICE                  0
>> _TRACE                       0
>> MUTATION                  4390
>> COUNTER_MUTATION             0
>> BINARY                       0
>> REQUEST_RESPONSE          1882
>> PAGED_RANGE                  0
>> READ_REPAIR                  0
>>
>>
>> On Tue, Aug 7, 2018 at 7:57 PM Jonathan Haddad <j...@jonhaddad.com> wrote:
>>
>>> What's your window size?
>>>
>>> When you say backed up, how are you measuring that? Are there pending
>>> tasks or do you just see more files than you expect?
>>>
>>> On Tue, Aug 7, 2018 at 4:38 PM Brian Spindler <brian.spind...@gmail.com>
>>> wrote:
>>>
>>>> Hey guys, quick question:
>>>>
>>>> I've got a v2.1 cassandra cluster, 12 nodes on aws i3.2xl, commit log
>>>> on one drive, data on nvme. That was working very well, it's a ts db and
>>>> has been accumulating data for about 4weeks.
>>>>
>>>> The nodes have increased in load and compaction seems to be falling
>>>> behind. I used to get about 1 file per day for this column family, about
>>>> ~30GB Data.db file per day. I am now getting hundreds per day at 1mb -
>>>> 50mb.
>>>>
>>>> How to recover from this?
>>>>
>>>> I can scale out to give some breathing room but will it go back and
>>>> compact the old days into nicely packed files for the day?
>>>>
>>>> I tried setting compaction throughput to 1000 from 256 and it seemed to
>>>> make things worse for the CPU, it's configured on i3.2xl with 8 compaction
>>>> threads.
>>>>
>>>> -B
>>>>
>>>> Lastly, I have mixed TTLs in this CF and need to run a repair (I think)
>>>> to get rid of old tombstones, however running repairs in 2.1 on TWCS column
>>>> families causes a very large spike in sstable counts due to anti-compaction
>>>> which causes a lot of disruption, is there any other way?
>>>>
>>>>
>>>>
>>>
>>> --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade
>>>
>>
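
For anyone who wants to check every sstable rather than spot-checking a few, here is a minimal sketch in Python. It assumes each file's sstablemetadata output has already been captured as text and that it contains a line of the form "Repaired at: <number>", as quoted at the top of this thread. The function names and the example file paths are hypothetical, not part of Cassandra.

```python
import re

# Matches the "Repaired at: <number>" line seen in sstablemetadata output.
REPAIRED_AT = re.compile(r"^Repaired at:\s*(\d+)\s*$", re.MULTILINE)


def repaired_at(metadata_text):
    """Return the 'Repaired at' value as an int, or None if the line is absent."""
    match = REPAIRED_AT.search(metadata_text)
    return int(match.group(1)) if match else None


def split_by_repair_state(outputs):
    """Group sstables by repair state.

    outputs maps an sstable path to the text printed for it by sstablemetadata.
    Returns three sorted lists of paths: (repaired, unrepaired, unknown).
    """
    repaired, unrepaired, unknown = [], [], []
    for path, text in sorted(outputs.items()):
        value = repaired_at(text)
        if value is None:
            unknown.append(path)      # no "Repaired at" line found
        elif value == 0:
            unrepaired.append(path)   # 0 means never marked repaired
        else:
            repaired.append(path)     # non-zero timestamp: marked repaired
    return repaired, unrepaired, unknown


if __name__ == "__main__":
    # Hypothetical sample data standing in for real sstablemetadata output.
    sample = {
        "ks-cf-ka-101-Data.db": "SSTable Level: 0\nRepaired at: 0\n",
        "ks-cf-ka-102-Data.db": "SSTable Level: 0\nRepaired at: 1533686400000\n",
    }
    repaired, unrepaired, unknown = split_by_repair_state(sample)
    print("repaired:", repaired)
    print("unrepaired:", unrepaired)
    print("unknown:", unknown)
```

On a real node the text for each file would come from running sstablemetadata against that file's Data.db path and capturing its standard output. A mix of repaired and unrepaired files within the same time window would be consistent with Jeff's point that the two sets are not compacted together.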