Re: [PATCH] fs/mbcache: make count_objects more robust.

Jan Kara Mon, 08 Jan 2018 01:22:24 -0800

On Fri 05-01-18 08:54:56, [email protected] wrote:
> > On Mon 27-11-17 11:30:19, Jiang Biao wrote:
> > > When running ltp stress test for 7*24 hours, the vmscan occasionally
> > > complains the following warning continuously,
> > >
> > > mb_cache_scan+0x0/0x3f0 negative objects to delete
> > > nr=-9232265467809300450
> > > ...
> > >
> > > The tracing result shows the freeable(mb_cache_count returns)
> > > is -1, which causes the continuous accumulation and overflow of
> > > total_scan.
> > >
> > > This patch make sure the mb_cache_count not return negative value,
> > > which make the mbcache shrinker more robust.
> > >
> > > Signed-off-by: Jiang Biao <[email protected]>
> > 
> > Going through some old email...
> > a) c_entry_count is unsigned so your patch is a nop as Coverity properly
> > noticed.
> Indeed, would the following casting be good?
> +    if (unlikely((int)(cache->c_entry_count) < 0))
> +        return 0;


That check would at least have a chance of triggering, but it would still
just be hiding the real problem.
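
To illustrate the point, here is a minimal user-space sketch, not kernel
code. The struct demo_cache type and both helpers are hypothetical, and the
counter is assumed to be an unsigned long. It shows how an unguarded
decrement from 0 wraps around to a huge positive value, and why only the
cast-based check can ever see that value as negative:

```c
#include <limits.h>

/* Hypothetical stand-in for the cache; only the counter is modelled. */
struct demo_cache {
	unsigned long c_entry_count;	/* assumed type */
};

/* Decrement without any guard: 0 wraps around to ULONG_MAX. */
static void demo_delete_entry(struct demo_cache *cache)
{
	cache->c_entry_count--;
}

/*
 * The cast-based check quoted above. Converting an out-of-range value
 * to int is implementation-defined; GCC and Clang truncate, so
 * ULONG_MAX becomes -1 there. A plain "c_entry_count < 0" test on the
 * unsigned field could never be true.
 */
static int demo_count_looks_negative(const struct demo_cache *cache)
{
	return (int)cache->c_entry_count < 0;
}
```

Note that the cast only inspects the low bits of the counter, so it masks
the symptom in the shrinker without explaining how the counter got there.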

> > b) c_entry_count being outside 0..2*cache->c_max_entries is a plain bug. I
> > went through the logic and cannot find out how that could happen though.
> Is there any possibility that decreasing c_entry_count from 0 to -1 
> in mb_cache_entry_delete?

If we think we have -1 entries in a list, we have a larger problem than
just the wrong behavior of the shrinker. This is just a plain counter of
entries protected by a spinlock so there isn't space for accounting errors
or anything like that. If you can reproduce the problem on some reasonably
recent kernel, I'd be interested in debugging this.
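
For anyone trying to reproduce this, one possible debugging aid is to check
the invariant from b) directly. The following is only a user-space sketch
under stated assumptions: struct checked_cache and both helpers are
hypothetical names, the field types are assumed, and the locking that
protects the real counter is left out.

```c
#include <stdbool.h>

/* Hypothetical model of the two fields involved in the invariant. */
struct checked_cache {
	unsigned long c_max_entries;
	unsigned long c_entry_count;
};

/* Invariant from b): the count must stay within 0..2*c_max_entries. */
static bool demo_count_in_range(const struct checked_cache *cache)
{
	return cache->c_entry_count <= 2 * cache->c_max_entries;
}

/*
 * Checked decrement: refuses to go below zero and reports the attempt
 * instead of silently wrapping. The caller is assumed to hold whatever
 * lock protects the counter.
 */
static bool demo_dec_checked(struct checked_cache *cache)
{
	if (cache->c_entry_count == 0)
		return false;
	cache->c_entry_count--;
	return true;
}
```

A failed demo_dec_checked() call would point at the path that deletes an
entry which was never counted, which is the actual bug to find.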

                                                                Honza

-- 
Jan Kara <[email protected]>
SUSE Labs, CR
