On Fri 05-01-18 08:54:56, jiang.bi...@zte.com.cn wrote:
> > On Mon 27-11-17 11:30:19, Jiang Biao wrote:
> > > When running ltp stress test for 7*24 hours, the vmscan occasionally
> > > complains the following warning continuously,
> > >
> > >   mb_cache_scan+0x0/0x3f0 negative objects to delete
> > >   nr=-9232265467809300450
> > >   ...
> > >
> > > The tracing result shows that freeable (the value mb_cache_count
> > > returns) is -1, which causes the continuous accumulation and overflow
> > > of total_scan.
> > >
> > > This patch makes sure mb_cache_count does not return a negative value,
> > > which makes the mbcache shrinker more robust.
> > >
> > > Signed-off-by: Jiang Biao <jiang.bi...@zte.com.cn>
> > 
> > Going through some old email...
> > a) c_entry_count is unsigned so your patch is a nop as Coverity properly
> > noticed.
> Indeed, would the following casting be good?
> +    if (unlikely((int)(cache->c_entry_count) < 0))
> +        return 0;

That check would at least have a chance of hitting but still it is just
hiding the real problem.
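
To illustrate why the cast has a chance of hitting while the original check
cannot, here is a minimal userspace sketch. It assumes the counter is an
unsigned long and that the caller stores the returned count in a signed long
(which the -1 in the trace suggests); the function names are made up for the
example and are not mbcache code.

```c
#include <limits.h>

/* Hypothetical stand-in for an unbalanced decrement of the entry counter. */
static unsigned long decrement(unsigned long count)
{
    /* Unsigned arithmetic wraps: 0 - 1 becomes ULONG_MAX, never "-1". */
    return count - 1;
}

/* What a caller sees if it stores the unsigned count in a signed long. */
static long as_freeable(unsigned long count)
{
    /* Implementation-defined conversion; GCC and Clang wrap modulo 2^N. */
    return (long)count;
}

/* The cast check proposed above: it fires once the value has wrapped. */
static int cast_check_fires(unsigned long count)
{
    return (int)count < 0;
}
```

A plain `count < 0` on the unsigned value is always false, which is why the
original patch was a nop. The cast version does fire after a wrap, but only
after the counter is already wrong.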

> > b) c_entry_count being outside 0..2*cache->c_max_entries is a plain bug. I
> > went through the logic and cannot find out how that could happen though.
> Is there any possibility of c_entry_count being decremented from 0 to -1
> in mb_cache_entry_delete?

If we think we have -1 entries in a list, we have a larger problem than
just the wrong behavior of the shrinker. This is just a plain counter of
entries protected by a spinlock so there isn't space for accounting errors
or anything like that. If you can reproduce the problem on some reasonably
recent kernel, I'd be interested in debugging this.
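
As a rough idea of the kind of debugging aid that could catch such an
accounting error at its source, here is a userspace sketch of a checked
decrement. Everything in it (struct toy_cache, toy_cache_dec, the underflows
field) is hypothetical and only models the counter; in the kernel the caller
would hold the spinlock protecting the counter, which is left out here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the cache; only the counter is modelled. */
struct toy_cache {
    unsigned long entry_count;
    unsigned long underflows;   /* number of bad decrements caught */
};

/*
 * Decrement the counter, but report instead of wrapping when it is
 * already zero. Returns true if the decrement was performed.
 */
static bool toy_cache_dec(struct toy_cache *cache, const char *where)
{
    if (cache->entry_count == 0) {
        cache->underflows++;
        fprintf(stderr, "entry_count underflow in %s\n", where);
        return false;
    }
    cache->entry_count--;
    return true;
}
```

Reporting at the decrement site points at the caller that unbalanced the
counter, rather than at the shrinker that later trips over the result.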

                                                                Honza

-- 
Jan Kara <j...@suse.com>
SUSE Labs, CR
