What I’ve seen happen a number of times is that you get into a
self-reinforcing feedback loop: not enough capacity to keep up with
compactions (often triggered by a repair or by a compaction hitting a large
partition) -> more sstables -> more expensive reads -> even less capacity to
keep up with compactions -> repeat

The way we deal with this at Instaclustr is typically to take the node
offline to let it catch up with compactions. We take it offline by running
nodetool disablegossip + disablethrift + disablebinary, unthrottle
compactions (nodetool setcompactionthroughput 0), and then leave it to chug
through compactions. Once pending compactions get close to zero, we reverse
the settings or restart C* to set things back to normal. This typically
resolves the issue. If you see it happening regularly, your cluster probably
needs more processing capacity (or other tuning).
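
For reference, a rough sketch of that sequence as nodetool commands (the
throughput value restored at the end is an assumption -- use whatever your
cassandra.yaml normally sets):

    # stop serving clients and drop out of gossip so the node can focus on compactions
    nodetool disablebinary      # native protocol (CQL) clients
    nodetool disablethrift      # Thrift clients
    nodetool disablegossip      # other nodes will mark this node down

    # remove the compaction throughput cap (0 = unthrottled)
    nodetool setcompactionthroughput 0

    # check progress periodically until pending compactions are close to zero
    nodetool compactionstats

    # then reverse the settings (or simply restart Cassandra)
    nodetool setcompactionthroughput 16     # assumed default of 16 MB/s
    nodetool enablegossip
    nodetool enablethrift
    nodetool enablebinary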

Cheers
Ben

On Tue, 8 Nov 2016 at 02:38 Eiti Kimura <eiti.kim...@movile.com> wrote:

> Hey guys,
>
> Do we have any conclusions about this case? Ezra, did you solve your
> problem?
> We are facing a very similar problem here: LeveledCompaction with vnodes,
> and it looks like a node went into a weird state and started to consume a
> lot of CPU. The compaction process seems to be stuck and the number of
> SSTables has increased significantly.
>
> Do you have any clue about it?
>
> Thanks,
> Eiti
>
>
>
> J.P. Eiti Kimura
> Plataformas
>
> +55 19 3518 5500
> +55 19 98232 2792
> skype: eitikimura
>
> 2016-09-11 18:20 GMT-03:00 Jens Rantil <jens.ran...@tink.se>:
>
> I just want to chime in and say that we also had issues keeping up with
> compaction once (with vnodes/SSD disks). I also want to recommend keeping
> track of your open file limit, which might bite you.
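>
> A quick way to check both the limit and current usage (a sketch; assumes
> Linux and that pgrep finds exactly one Cassandra process):
>
>     # limit in effect for the running Cassandra process
>     cat /proc/$(pgrep -f CassandraDaemon)/limits | grep 'open files'
>     # rough count of files it currently has open
>     lsof -p $(pgrep -f CassandraDaemon) | wc -l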
>
> Cheers,
> Jens
>
>
> On Friday, August 19, 2016, Mark Rose <markr...@markrose.ca> wrote:
>
> Hi Ezra,
>
> Are you making frequent changes to your rows (including TTL'ed
> values), or mostly inserting new ones? If you're only inserting new
> data, it's probable that size-tiered compaction would work better for
> you. If you are TTL'ing whole rows, consider date-tiered.
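>
> For illustration, switching strategy is a single ALTER statement (a sketch;
> the keyspace/table names are taken from the tablestats output further down
> in the thread):
>
>     # size-tiered for insert-mostly workloads
>     cqlsh -e "ALTER TABLE mykeyspace.mytable WITH compaction = {'class': 'SizeTieredCompactionStrategy'};"
>
>     # date-tiered if whole rows are written once and expire via TTL
>     cqlsh -e "ALTER TABLE mykeyspace.mytable WITH compaction = {'class': 'DateTieredCompactionStrategy'};"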
>
> If leveled compaction is still the best strategy, one way to catch up
> with compactions is to have less data per node -- in other words,
> use more machines. Leveled compaction is CPU expensive. You are CPU
> bottlenecked currently, or from the other perspective, you have too
> much data per node for leveled compaction.
>
> At this point, compaction is so far behind that you'll likely be
> getting high latency if you're reading old rows (since dozens to
> hundreds of uncompacted sstables will likely need to be checked for
> matching rows). You may be better off with size tiered compaction,
> even if it will mean always reading several sstables per read (higher
> latency than when leveled can keep up).
>
> How much data do you have per node? Do you update/insert to/delete
> rows? Do you TTL?
>
> Cheers,
> Mark
>
> On Wed, Aug 17, 2016 at 2:39 PM, Ezra Stuetzel <ezra.stuet...@riskiq.net>
> wrote:
> > I have one node in my 2.2.7 cluster (just upgraded from 2.2.6 hoping to
> > fix the issue) which seems to be stuck in a weird state -- with a large
> > number of pending compactions and sstables. The node is compacting about
> > 500 GB/day, and the number of pending compactions is going up by about
> > 50/day. It is at about 2300 pending compactions now. I have tried
> > increasing the number of compaction threads and the compaction
> > throughput, which doesn't seem to help eliminate the many pending
> > compactions.
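> >
> > The knobs involved, roughly (values here are only illustrative):
> >
> >     # throughput cap in MB/s, adjustable on a live node
> >     nodetool setcompactionthroughput 64
> >     nodetool getcompactionthroughput
> >     # the number of compaction threads is the concurrent_compactors
> >     # setting in cassandra.yaml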
> >
> > I have tried running 'nodetool cleanup' and 'nodetool compact'. The
> > latter has fixed the issue in the past, but most recently I was getting
> > OOM errors, probably due to the large number of sstables. I upgraded to
> > 2.2.7 and am no longer getting OOM errors, but it also does not resolve
> > the issue. I do see this message in the logs:
> >
> >> INFO  [RMI TCP Connection(611)-10.9.2.218] 2016-08-17 01:50:01,985
> >> CompactionManager.java:610 - Cannot perform a full major compaction as
> >> repaired and unrepaired sstables cannot be compacted together. These two
> >> set of sstables will be compacted separately.
> >
> > Below is the 'nodetool tablestats' output comparing a normal node and the
> > problematic node. You can see the problematic node has many, many more
> > sstables, and they are nearly all in level 0. What is the best way to fix
> > this? Can I just delete those sstables somehow and then run a repair?
> >>
> >> Normal node
> >>>
> >>> keyspace: mykeyspace
> >>>
> >>>     Read Count: 0
> >>>
> >>>     Read Latency: NaN ms.
> >>>
> >>>     Write Count: 31905656
> >>>
> >>>     Write Latency: 0.051713177939359714 ms.
> >>>
> >>>     Pending Flushes: 0
> >>>
> >>>         Table: mytable
> >>>
> >>>         SSTable count: 1908
> >>>
> >>>         SSTables in each level: [11/4, 20/10, 213/100, 1356/1000, 306, 0, 0, 0, 0]
> >>>
> >>>         Space used (live): 301894591442
> >>>
> >>>         Space used (total): 301894591442
> >>>
> >>>
> >>>
> >>> Problematic node
> >>>
> >>> Keyspace: mykeyspace
> >>>
> >>>     Read Count: 0
> >>>
> >>>     Read Latency: NaN ms.
> >>>
> >>>     Write Count: 30520190
> >>>
> >>>     Write Latency: 0.05171286705620116 ms.
> >>>
> >>>     Pending Flushes: 0
> >>>
> >>>         Table: mytable
> >>>
> >>>         SSTable count: 14105
> >>>
> >>>         SSTables in each level: [13039/4, 21/10, 206/100, 831, 0, 0, 0, 0, 0]
> >>>
> >>>         Space used (live): 561143255289
> >>>
> >>>         Space used (total): 561143255289
> >
> > Thanks,
> >
> > Ezra
>
>
>
> --
> Jens Rantil
> Backend engineer
> Tink AB
>
> Email: jens.ran...@tink.se
> Phone: +46 708 84 18 32
> Web: www.tink.se
>
>
>
>
