Odd. Your "post-GC" heap level looks a lot lower than your max, which implies that you should be OK with ~10GB. I'm guessing that either you're genuinely getting a huge surge in needed heap and running out, or the collector is falling behind and garbage is building up. If it's the latter, there might be some tweaking you can do. It's probably worth turning on GC logging and digging through exactly what's happening.
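
For reference, GC logging on Java 8 can be switched on with flags along these lines. This is only a sketch: the flags are standard HotSpot options for Java 8, but the log path, file count, and file size are assumptions to adjust for your install, and I'm assuming you'd put them in conf/jvm.options.

```
# GC logging for Java 8 (HotSpot). Path and rotation sizes are assumed values.
-Xloggc:/var/log/cassandra/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M
```

With that in place, the heap occupancy right after each collection shows whether the floor really sits well below the max, and the stopped-time lines show whether pauses are growing before the OOM.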
CMS is kind of hard to tune and can have problems with heap fragmentation since it doesn't compact, but if it's working for you I'd say stick with it.

On Thu, Jun 28, 2018 at 3:14 PM, Randy Lynn <rl...@getavail.com> wrote:

> Thanks for the feedback..
>
> Getting tons of OOM lately..
>
> You mentioned overprovisioned heap size... well...
> tried 8GB = OOM
> tried 12GB = OOM
> tried 20GB w/ G1 = OOM (and long GC pauses usually over 2 secs)
> tried 20GB w/ CMS = running
>
> we're java 8 update 151.
> 3.11.1.
>
> We've got one table that's got a 400MB partition.. that's the max.. the
> 99th is < 100MB, and 95th < 30MB..
> So I'm not sure that I'm overprovisioned, I'm just not quite yet to the
> heap size based on our partition sizes.
> All queries use cluster key, so I'm not accidentally reading a whole
> partition.
> The last place I'm looking - which maybe should be the first - is
> tombstones.
>
> sorry for the afternoon rant! thanks for your eyes!
>
> On Thu, Jun 28, 2018 at 5:54 PM, Elliott Sims <elli...@backblaze.com>
> wrote:
>
>> It depends a bit on which collector you're using, but fairly normal.
>> Heap grows for a while, then the JVM decides via a variety of metrics that
>> it's time to run a collection. G1GC is usually a bit steadier and less
>> sawtooth than the Parallel Mark Sweep, but if your heap's a lot bigger
>> than needed I could see it producing that pattern.
>>
>> On Thu, Jun 28, 2018 at 9:23 AM, Randy Lynn <rl...@getavail.com> wrote:
>>
>>> I have datadog monitoring JVM heap.
>>>
>>> Running 3.11.1.
>>> 20GB heap
>>> G1 for GC.. all the G1GC settings are out-of-the-box
>>>
>>> Does this look normal?
>>>
>>> https://drive.google.com/file/d/1hLMbG53DWv5zNKSY88BmI3Wd0ic_KQ07/view?usp=sharing
>>>
>>> I'm a C# .NET guy, so I have no idea if this is normal Java behavior.
>>>
>>> --
>>> Randy Lynn
>>> rl...@getavail.com
>>>
>>> office:
>>> 859.963.1616 ext 202
>>> 163 East Main Street - Lexington, KY 40507 - USA
>>>
>>> getavail.com <https://www.getavail.com/>