Re: UMA cache back pressure

Jeff Roberson Mon, 18 Nov 2013 11:15:44 -0800

On Mon, 18 Nov 2013, Alexander Motin wrote:

Hi.
I've created a patch, based on earlier work of avg@, to add back pressure to UMA allocation caches. The problem of physical memory or KVA exhaustion has existed there for many years, and it is quite critical now for improving system performance while keeping stability. Changes made to memory allocation in recent years improved the situation, but haven't fixed it completely. My patch solves the remaining problems from two sides: a) reducing bucket sizes every time the system detects a low memory condition; and b) as a last-resort mechanism for a very low memory condition, cycling over all CPUs to purge their per-CPU UMA caches. The benefit of this approach is the absence of any additional hard-coded limits on cache sizes -- they are self-tuned, based on load and memory pressure.
With this change I believe it should be safe enough to enable UMA allocation caches in ZFS via the vfs.zfs.zio.use_uma tunable (at least for amd64). I did many tests on a machine with 24 logical cores (and as a result strong allocation cache effects), and can say that with 40GB RAM, using UMA caches, allowed by this change, doubles the results of the SPEC NFS benchmark on a ZFS pool of several SSDs. To test system stability I ran the same test with physical memory limited to just 2GB; the system successfully survived that, and even showed results 1.5 times better than with just the last-resort measures of b). In both cases tools/umastat no longer shows unbounded UMA cache growth, which makes me believe in the viability of this approach for longer runs.
I would like to hear some comments about that:
http://people.freebsd.org/~mav/uma_pressure.patch
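
To illustrate side a) above, here is a minimal userland sketch of self-tuning bucket sizes. It is not taken from the patch: struct zone_sim, zone_grow(), zone_shrink(), the BUCKET_MIN/BUCKET_MAX bounds and the halving policy are all assumptions made for the example. The only idea carried over from the text is that the bucket size grows with load and is cut back each time low memory is detected, with no per-zone hard-coded limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounds for the sketch; not values from the patch. */
#define BUCKET_MIN 1
#define BUCKET_MAX 128

/* Hypothetical stand-in for the per-zone bucket sizing state. */
struct zone_sim {
	int bucket_size;	/* items cached per bucket */
};

/* Under allocation load the bucket size is allowed to grow by one. */
void
zone_grow(struct zone_sim *z)
{
	assert(z != NULL);
	if (z->bucket_size < BUCKET_MAX)
		z->bucket_size++;
}

/*
 * On each low-memory notification the bucket size is reduced.
 * Halving is an assumption of this sketch; the text only says "reducing".
 */
void
zone_shrink(struct zone_sim *z)
{
	assert(z != NULL);
	z->bucket_size /= 2;
	if (z->bucket_size < BUCKET_MIN)
		z->bucket_size = BUCKET_MIN;
}
```

Calling zone_shrink() repeatedly on a zone drives its bucket size down to BUCKET_MIN and no further, while zone_grow() lets it climb back once the load returns.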


Hey Mav,

This is a great start and great results. I think it could probably even go in as-is, but I have a few suggestions.

First, let's test this with something that is really super allocator heavy and doesn't benefit much from bucket sizing. For example, a network forwarding test. Or maybe you could get someone like Netflix that is using it to push a lot of bits with less filesystem cost than zfs and spec.

Second, the cpu binding is a very costly and very high-latency operation. It would make sense to do CPU_FOREACH and then ZONE_FOREACH. You're also biasing the first zones in the list. The low memory condition will more often clear after you check these first zones. So you might just check it once and equally penalize all zones. I'm concerned that doing CPU_FOREACH in every zone will slow the pagedaemon more. We also have been working towards per-domain pagedaemons so perhaps we should have a uma-reclaim taskqueue that we wake up to do the work?
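
To make the loop-ordering point concrete, here is a small userland simulation, not kernel code. Everything in it (struct sim_state, sim_bind(), sim_drain(), NCPU, NZONE) is a hypothetical stand-in; sim_bind() only counts how often a real implementation would have to migrate the thread to another CPU. Both orderings free the same number of items, but the zone-major loop binds NZONE * NCPU times while the CPU-major loop binds NCPU times.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU	4	/* hypothetical CPU count for the sketch */
#define NZONE	3	/* hypothetical zone count for the sketch */

/* Simulated per-CPU caches: cached[zone][cpu] holds the item count. */
struct sim_state {
	int cached[NZONE][NCPU];
	int current_cpu;	/* CPU the draining thread is "bound" to */
	int binds;		/* number of simulated CPU migrations */
};

/* Stand-in for binding the thread to a CPU; only counts the call. */
static void
sim_bind(struct sim_state *s, int cpu)
{
	assert(s != NULL && cpu >= 0 && cpu < NCPU);
	s->current_cpu = cpu;
	s->binds++;
}

/* Drain one zone's cache on the currently bound CPU. */
static int
sim_drain(struct sim_state *s, int zone)
{
	int freed = s->cached[zone][s->current_cpu];

	s->cached[zone][s->current_cpu] = 0;
	return (freed);
}

/* For each zone, visit every CPU: one bind per (zone, cpu) pair. */
int
drain_zone_major(struct sim_state *s)
{
	int freed = 0;

	for (int z = 0; z < NZONE; z++)
		for (int c = 0; c < NCPU; c++) {
			sim_bind(s, c);
			freed += sim_drain(s, z);
		}
	return (freed);
}

/* For each CPU, bind once and drain every zone while there. */
int
drain_cpu_major(struct sim_state *s)
{
	int freed = 0;

	for (int c = 0; c < NCPU; c++) {
		sim_bind(s, c);
		for (int z = 0; z < NZONE; z++)
			freed += sim_drain(s, z);
	}
	return (freed);
}
```

The CPU-major form also treats all zones equally on each pass, since no zone is fully drained before the others are looked at.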

Third, using vm_page_count_min() will only trigger when the pageout daemon can't keep up with the free target. Typically this should only happen with a lot of dirty mmap'd pages or incredibly high system load coupled with frequent allocations. So there may be many cases where reclaiming the extra UMA memory is helpful but the pagedaemon can still keep up while pushing out file pages that we'd prefer to keep.

I think the perfect heuristic would have some idea of how likely the UMA pages are to be re-used immediately so we can more effectively trade off between file pages and kernel memory cache. As it is now we limit the uma_reclaim() calls to every 10 seconds when there is memory pressure. Perhaps we could keep a timestamp for when the last slab was allocated to a zone and do the more expensive reclaim on zones whose timestamps exceed some threshold? Then have a lower threshold for reclaiming at all? Again, it doesn't need to be perfect, but I believe we can catch a wider set of cases by carefully scheduling this.
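
A rough userland sketch of the two-threshold idea, under one reading of it: a zone is judged by how long ago its last slab was allocated. The names (struct zone_info, choose_reclaim(), enum reclaim_action) and both threshold values are made up for illustration; nothing here is existing UMA API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thresholds in seconds; the values are placeholders. */
#define IDLE_CHEAP_SECS	10	/* lower bar: reclaim at all */
#define IDLE_FULL_SECS	60	/* higher bar: expensive per-CPU reclaim */

enum reclaim_action {
	RECLAIM_NONE,	/* zone is actively growing, leave it alone */
	RECLAIM_CHEAP,	/* e.g. trim buckets only */
	RECLAIM_FULL	/* e.g. also purge per-CPU caches */
};

/* Hypothetical per-zone bookkeeping for the sketch. */
struct zone_info {
	long last_slab_alloc;	/* time of the last slab allocation, seconds */
};

/* Pick a reclaim level from how long the zone has gone without a new slab. */
enum reclaim_action
choose_reclaim(const struct zone_info *z, long now)
{
	long idle;

	assert(z != NULL);
	idle = now - z->last_slab_alloc;
	if (idle >= IDLE_FULL_SECS)
		return (RECLAIM_FULL);
	if (idle >= IDLE_CHEAP_SECS)
		return (RECLAIM_CHEAP);
	return (RECLAIM_NONE);
}
```

A zone that allocated a slab a few seconds ago is skipped entirely, one that has been quiet for a while gets the cheap pass, and only long-idle zones pay for the expensive one.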


Thanks,
Jeff


Thank you.

--
Alexander Motin

_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
