I didn't see the docker files in the repo that could build the docker
image, and when I tried cloning the git repo and doing a docker build I
encountered errors that I think were related to the web proxy on my work
network. I was able to grab the release tarball and the bitnami docker
file, do a little surgery to work around my proxy issue, and build a 1.6.17
docker image though.
I ran my application against the new version and it ran for ~2hr without
any errors (it previously wouldn't run more than 30s or so before
encountering blocks of "OOM during read" errors). I also made a little
test loop that just hammered the instance with similarly sized writes (1-2MB)
as fast as it could and let it run for a few hours, and it didn't have a single
blip. That encompassed a couple million evictions. I'm pretty comfortable
saying the issue is fixed, at least for the kind of use I had in mind.
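
For anyone who wants to repeat this kind of check, here is a minimal sketch of a write loop along those lines. It is not the exact script from the test above. The function names, the "stress:" key prefix, and the host/port are illustrative assumptions, and it speaks the memcached text protocol over a plain socket rather than using a client library. It assumes the instance was started with an item size limit above 2MB (for example via -I).

```python
import os
import random
import socket


def build_set_command(key: str, payload: bytes, exptime: int = 0) -> bytes:
    """Encode one memcached text-protocol `set` request (flags fixed at 0)."""
    header = f"set {key} 0 {exptime} {len(payload)}\r\n".encode("ascii")
    return header + payload + b"\r\n"


def hammer(host: str, port: int, iterations: int,
           min_size: int = 1 << 20, max_size: int = 2 << 20) -> dict:
    """Send `iterations` large sets back to back and tally the replies.

    Any reply other than STORED (for example a SERVER_ERROR line) is
    counted under "other" and printed, so error bursts are easy to spot.
    """
    counts = {"stored": 0, "other": 0}
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("rb")
        for i in range(iterations):
            # Random payload between 1MB and 2MB, matching the sizes above.
            size = random.randint(min_size, max_size)
            sock.sendall(build_set_command(f"stress:{i}", os.urandom(size)))
            reply = reader.readline().strip()
            if reply == b"STORED":
                counts["stored"] += 1
            else:
                counts["other"] += 1
                print(f"iteration {i}: unexpected reply {reply!r}")
    return counts
```

Calling hammer("127.0.0.1", 11211, iterations=100000) against a small-memory instance should force plenty of evictions; the evictions counter in "stats" shows how many actually happened.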
I added a comment to the issue on GitHub to the same effect.
I'm impressed by the quick turnaround, BTW. ;-)
H
On Friday, August 26, 2022 at 5:54:26 PM UTC-7 Dormando wrote:
> So I tested this a bit more and released it in 1.6.17; I think bitnami
> should pick it up soonish. If not, I'll try to figure out docker this
> weekend if you still need it.
>
> I'm not 100% sure it'll fix your use case but it does fix some things I
> can test and it didn't seem like a regression. Would be nice to validate
> still.
>
> On Fri, 26 Aug 2022, dormando wrote:
>
> > You can't build docker images or compile binaries? There's a
> > docker-compose.yml in the repo already if that helps.
> >
> > If not I can try but I don't spend a lot of time with docker directly.
> >
> > On Fri, 26 Aug 2022, Hayden wrote:
> >
> > > > I'd be happy to help validate the fix, but I can't do it until the
> > > > weekend, and I don't have a ready way to build an updated image. Any
> > > > chance you could create a docker image with the fix that I could grab
> > > > from somewhere?
> > >
> > > On Friday, August 26, 2022 at 10:38:54 AM UTC-7 Dormando wrote:
> > > > I have an opportunity to put this fix into a release today if anyone
> > > > wants to help validate :)
> > >
> > > On Thu, 25 Aug 2022, dormando wrote:
> > >
> > > > Took another quick look...
> > > >
> > > > Think there's an easy patch that might work:
> > > > https://github.com/memcached/memcached/pull/924
> > > >
> > > > > If you wouldn't mind helping validate? An external validator would
> > > > > help me get it in time for the next release :)
> > > >
> > > > Thanks,
> > > > -Dormando
> > > >
> > > > On Wed, 24 Aug 2022, dormando wrote:
> > > >
> > > > > Hey,
> > > > >
> > > > > > Thanks for the info. Yes; this generally confirms the issue. I see
> > > > > > some of your higher slab classes with "free_chunks 0", so if you're
> > > > > > setting data that requires these chunks it could error out. The
> > > > > > "stats items" confirms this since there are no actual items in those
> > > > > > lower slab classes.
> > > > >
> > > > > > You're certainly right a workaround of making your items < 512k
> > > > > > would also work; but in general if I have features it'd be nice if
> > > > > > they worked well :) Please open an issue so we can improve things!
> > > > >
> > > > > > I intended to lower the slab_chunk_max default from 512k to much
> > > > > > lower, as that actually raises the memory efficiency by a bit (less
> > > > > > gap at the higher classes). That may help here. The system should
> > > > > > also try ejecting items from the highest LRU... I need to double
> > > > > > check that it wasn't already intending to do that and failing.
> > > > >
> > > > > > Might also be able to adjust the page mover but not sure. The page
> > > > > > mover can probably be adjusted to attempt to keep one page in
> > > > > > reserve, but I think the algorithm isn't expecting slabs with no
> > > > > > items in it so I'd have to audit that too.
> > > > >
> > > > > > If you're up for experiments it'd be interesting to know if setting
> > > > > > "-o slab_chunk_max=32768" or 16k (probably not more than 64) makes
> > > > > > things better or worse.
> > > > >
> > > > > > Also, crud.. it's documented as kilobytes but that's not working
> > > > > > somehow? aaahahah. I guess the big EXPERIMENTAL tag scared people
> > > > > > off since that never got reported.
> > > > >
> > > > > > I'm guessing most people have a mix of small to large items, but
> > > > > > you only have large items and a relatively low memory limit, so this
> > > > > > is why you're seeing it so easily. I think most people setting large
> > > > > > items have like 30G+ of memory so you end up with more spread around.
> > > > >
> > > > > Thanks,
> > > > > -Dormando
> > > > >
> > > > > On Wed, 24 Aug 2022, Hayden wrote:
> > > > >
> > > > > > > What you're saying makes sense, and I'm pretty sure it won't be
> > > > > > > too hard to add some functionality to my writing code to break my
> > > > > > > large items up into smaller parts that can each fit into a single
> > > > > > > chunk. That has the