> On 5 Nov 2014, at 10:27, David Gwynne <da...@gwynne.id.au> wrote:
> 
>> 
>> On 5 Nov 2014, at 10:12, Mike Belopuhov <m...@belopuhov.com> wrote:
>> 
>> On 5 November 2014 00:38, David Gwynne <da...@gwynne.id.au> wrote:
>>> 
>>>> On 30 Oct 2014, at 07:52, Ted Unangst <t...@tedunangst.com> wrote:
>>>> 
>>>> On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:
>>>> 
>>>> 
>>>>> i dunno. i'm fine with either removing colouring altogether or
>>>>> setting it from something else completely. i just want a decision
>>>>> to be made because right now ph_color isn't set, which is a bug.
>>>> 
>>>> there. i fixed it.
>>> 
>>> looks like we were both ignorant and wrong. mikeb@ points out this from the 
>>> original slab paper:
>>> 
>>> 4.1. Impact of Buffer Address Distribution on Cache Utilization
>>> 
>>> The address distribution of mid-size buffers can affect the system's
>>> overall cache utilization. In particular, power-of-two allocators -
>>> where all buffers are 2^n bytes and are 2^n-byte aligned - are
>>> pessimal. Suppose, for example, that every inode (~300 bytes) is
>>> assigned a 512-byte buffer, 512-byte aligned, and that only the
>>> first dozen fields of an inode (48 bytes) are frequently referenced.
>>> Then the majority of inode-related memory traffic will be at
>>> addresses between 0 and 47 modulo 512. Thus the cache lines near
>>> 512-byte boundaries will be heavily loaded while the rest lie
>>> fallow. In effect only 9% (48/512) of the cache will be usable by
>>> inodes. Fully-associative caches would not suffer this problem, but
>>> current hardware trends are toward simpler rather than more complex
>>> caches.
>>> 
>>> 4.3. Slab Coloring
>>> 
>>> The slab allocator incorporates a simple coloring scheme that
>>> distributes buffers evenly throughout the cache, resulting in
>>> excellent cache utilization and bus balance. The concept is simple:
>>> each time a new slab is created, the buffer addresses start at a
>>> slightly different offset (color) from the slab base (which is
>>> always page-aligned). For example, for a cache of 200-byte objects
>>> with 8-byte alignment, the first slab's buffers would be at
>>> addresses 0, 200, 400, ... relative to the slab base. The next
>>> slab's buffers would be at offsets 8, 208, 408, ... and so on. The
>>> maximum slab color is determined by the amount of unused space in
>>> the slab.
>>> 
>>> 
>>> we run on enough different machines that i think we should consider this.
>>> 
>> 
>> well, first of all, right now this is a rather theoretical gain. we
>> need to test it to see whether it actually helps. to look at cache
>> statistics we can use performance counters, however the current pctr
>> code might be a bit out of date.
> 
> pctr is x86-specific though. how would you measure on all the other archs?

i would argue that page colouring was in the code before, so it should 
be there now unless it can be proven useless. the cost of putting it 
back in terms of code is minimal; the only question has been: how do we 
pick the colour without holding the pool's mutex?
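
to make the question concrete, here is one lock-free option as a
userland C11 sketch. it is not a diff: the struct and field names
(pr_align, pr_maxcolor, pr_colors) are made up, and the kernel would
use its own atomic ops rather than stdatomic.h.

#include <stdatomic.h>
#include <stddef.h>

struct pool_sketch {
	size_t pr_align;	/* item alignment, eg 8 */
	size_t pr_maxcolor;	/* unused space left in a pool page */
	atomic_uint pr_colors;	/* running colour counter */
};

/*
 * pick the next colour without taking the pool mutex: bump a
 * per-pool counter atomically and map it onto the offsets
 * 0, align, 2*align, ..., up to the unused space in the page.
 */
static size_t
pool_next_color(struct pool_sketch *pp)
{
	unsigned int n, ncolors;

	ncolors = pp->pr_maxcolor / pp->pr_align + 1;
	n = atomic_fetch_add(&pp->pr_colors, 1);
	return ((size_t)(n % ncolors) * pp->pr_align);
}

this cycles the way the paper describes: 200-byte items with 8-byte
alignment in a 4096-byte page (ignoring the page header) leave 96
bytes unused, so successive pages get offsets 0, 8, 16, ..., 96 and
then wrap back to 0.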

> 
>> 
>>> so the question is: if we do bring colouring back, how do we calculate it?
>>> arc4random? mask bits off ph_magic? atomic_inc something in the pool?
>>> read a counter from the pool? shift bits off the page address?
>> 
>> the way i read it is that you have a per-pool running value pr_color
>> that you increment, for every page you get from uvm, by the item
>> alignment or the native cache line size, modulo the space available.
>> however, i can see that it might cause a problem for locating a page
>> header (or was it the page boundary? don't have the code at hand)
>> with simple math.
> 
> the stuff that finds a page header for a page doesn't care about the 
> address of individual items within a page, and colouring doesn't 
> change the fact that an item is wholly contained within a page. i've 
> run with arc4random_uniform coloured addresses for a couple of weeks 
> now without problems of that nature.
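
for anyone reading along, the shape of that approach is presumably
something like the sketch below. the helper name is made up;
arc4random_uniform() itself exists in both libc and the kernel:

#include <stdint.h>
#include <stdlib.h>	/* arc4random_uniform() in userland */

/*
 * pick a random colour for each new page: no shared pool state
 * is touched, so nothing needs the pool mutex. maxcolor is the
 * unused space in the page, align is the item alignment.
 */
static size_t
pool_random_color(size_t maxcolor, size_t align)
{
	uint32_t ncolors = maxcolor / align + 1;

	return ((size_t)arc4random_uniform(ncolors) * align);
}

the trade-off against a running counter is that random colours do not
guarantee an even spread over a small number of pages, but they need
no shared pool state at all.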

