Re: pool page colouring

2014-11-04 Thread David Gwynne

 On 30 Oct 2014, at 07:52, Ted Unangst t...@tedunangst.com wrote:
 
 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:
 
 
 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.
 
 there. i fixed it.

looks like we were both ignorant and wrong. mikeb@ points out this from the 
original slab paper:

4.1. Impact of Buffer Address Distribution on Cache Utilization

The address distribution of mid-size buffers can affect the system's
overall cache utilization. In particular, power-of-two allocators -
where all buffers are 2^n bytes and are 2^n-byte aligned - are
pessimal. Suppose, for example, that every inode (∼300 bytes) is
assigned a 512-byte buffer, 512-byte aligned, and that only the first
dozen fields of an inode (48 bytes) are frequently referenced. Then the
majority of inode-related memory traffic will be at addresses between 0
and 47 modulo 512. Thus the cache lines near 512-byte boundaries will
be heavily loaded while the rest lie fallow. In effect only 9% (48/512)
of the cache will be usable by inodes. Fully-associative caches would
not suffer this problem, but current hardware trends are toward simpler
rather than more complex caches.

4.3. Slab Coloring

The slab allocator incorporates a simple coloring scheme that
distributes buffers evenly throughout the cache, resulting in excellent
cache utilization and bus balance. The concept is simple: each time a
new slab is created, the buffer addresses start at a slightly different
offset (color) from the slab base (which is always page-aligned). For
example, for a cache of 200-byte objects with 8-byte alignment, the
first slab's buffers would be at addresses 0, 200, 400, ... relative to
the slab base. The next slab's buffers would be at offsets 8, 208,
408, ... and so on. The maximum slab color is determined by the amount
of unused space in the slab.


we run on enough different machines that i think we should consider this.

so the question is if we do bring colouring back, how do we calculate it? 
arc4random? mask bits off ph_magic? atomic_inc something in the pool? read a 
counter from the pool? shift bits off the page address?
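
as an illustrative sketch only (not committed code), the arc4random option
could look like this, assuming pr_maxcolors counts the available offsets
and pr_align is the item alignment:

	/*
	 * sketch: pick a random colour for a freshly allocated page.
	 * no shared pool state is read or written, so nothing here
	 * needs the pool mutex.
	 */
	u_int color = arc4random_uniform(pp->pr_maxcolors);

	ph->ph_colored = addr + color * pp->pr_align;

this matches the arc4random_uniform() call in the diff quoted further down
in the archive.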



Re: pool page colouring

2014-11-04 Thread Mike Belopuhov
On 5 November 2014 01:12, Mike Belopuhov m...@belopuhov.com wrote:

 well, first of all, right now this is a rather theoretical gain.  we need
 to test it to understand if it makes things easier.

err. i meant to say go faster not easier.



Re: pool page colouring

2014-11-04 Thread Mike Belopuhov
On 5 November 2014 00:38, David Gwynne da...@gwynne.id.au wrote:

 On 30 Oct 2014, at 07:52, Ted Unangst t...@tedunangst.com wrote:

 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:


 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.

 there. i fixed it.

 looks like we were both ignorant and wrong. mikeb@ points out this from the 
 original slab paper:

 [slab paper sections 4.1 and 4.3 elided; quoted in full at the top of
 the thread]


 we run on enough different machines that i think we should consider this.


well, first of all, right now this is a rather theoretical gain.  we need
to test it to understand if it makes things easier.  to see cache
statistics we can use performance counters, however current pctr code
might be a bit out of date.

 so the question is if we do bring colouring back, how do we calculate it?
 arc4random? mask bits off ph_magic? atomic_inc something in the pool?
 read a counter from the pool? shift bits off the page address?

the way i read it is that you have a per-pool running value pr_color that you
increment by the item alignment or the native cache line size, modulo the
space available, for every page you are getting from uvm.  however i can see
that it might be a problem for the code that locates a page header (or was it
the page boundary? don't have the code at hand) using simple math.
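
as a sketch, that reading could look like the following, with a hypothetical
pr_color field advanced under the pool lock:

	/*
	 * sketch: per-pool running colour, in bytes.  each new page
	 * starts its items at the current offset, then the offset is
	 * advanced by the alignment and wrapped at the slack space.
	 */
	ph->ph_colored = addr + pp->pr_color;
	pp->pr_color += pp->pr_align;		/* or the cacheline size */
	if (pp->pr_color > pp->pr_maxcolor)
		pp->pr_color = 0;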



Re: pool page colouring

2014-11-04 Thread David Gwynne

 On 5 Nov 2014, at 10:12, Mike Belopuhov m...@belopuhov.com wrote:
 
 On 5 November 2014 00:38, David Gwynne da...@gwynne.id.au wrote:
 
 On 30 Oct 2014, at 07:52, Ted Unangst t...@tedunangst.com wrote:
 
 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:
 
 
 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.
 
 there. i fixed it.
 
 looks like we were both ignorant and wrong. mikeb@ points out this from the 
 original slab paper:
 
 [slab paper sections 4.1 and 4.3 elided; quoted in full at the top of
 the thread]
 
 
 we run on enough different machines that i think we should consider this.
 
 
 well, first of all, right now this is a rather theoretical gain.  we need
 to test it to understand if it makes things easier.  to see cache
 statistics we can use performance counters, however current pctr code
 might be a bit out of date.

pctr is x86 specific though. how would you measure on all the other archs?

 
 so the question is if we do bring colouring back, how do we calculate it?
 arc4random? mask bits off ph_magic? atomic_inc something in the pool?
 read a counter from the pool? shift bits off the page address?
 
 the way i read it is that you have a per-pool running value pr_color that you
 increment by the item alignment or the native cache line size, modulo the
 space available, for every page you are getting from uvm.  however i can see
 that it might be a problem for the code that locates a page header (or was it
 the page boundary? don't have the code at hand) using simple math.

the stuff that finds a page header for a page doesnt care about the address of 
individual items within a page, and colouring doesnt change the fact that an 
item is wholly contained within a page. ive run with arc4random_uniform 
coloured addresses for a couple of weeks now without problems of that nature.
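
the invariant relied on here can be written as a sketch of an assertion (for
pools with an in-page header the bound would be pr_phoffset rather than
pr_pgsize):

	/* sketch: the coloured run of items still fits in the page */
	KASSERT(ph->ph_colored + pp->pr_itemsperpage * pp->pr_size <=
	    ph->ph_page + pp->pr_pgsize);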





Re: pool page colouring

2014-11-04 Thread David Gwynne

 On 5 Nov 2014, at 10:27, David Gwynne da...@gwynne.id.au wrote:
 
 
 On 5 Nov 2014, at 10:12, Mike Belopuhov m...@belopuhov.com wrote:
 
 On 5 November 2014 00:38, David Gwynne da...@gwynne.id.au wrote:
 
 On 30 Oct 2014, at 07:52, Ted Unangst t...@tedunangst.com wrote:
 
 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:
 
 
 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.
 
 there. i fixed it.
 
 looks like we were both ignorant and wrong. mikeb@ points out this from the 
 original slab paper:
 
 [slab paper sections 4.1 and 4.3 elided; quoted in full at the top of
 the thread]
 
 
 we run on enough different machines that i think we should consider this.
 
 
 well, first of all, right now this is a rather theoretical gain.  we need
 to test it to understand if it makes things easier.  to see cache
 statistics we can use performance counters, however current pctr code
 might be a bit out of date.
 
 pctr is x86 specific though. how would you measure on all the other archs?

i would argue that page colouring was in the code before, so it should be in 
now unless it can be proven useless. the cost of putting it back in terms of 
code is minimal; the only question has been how do we pick the colour without 
holding the pools mutex?
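
a lock-free sketch, assuming an atomic increment primitive along the lines
of atomic_inc_int_nv() and a hypothetical pr_color counter:

	/*
	 * sketch: derive the colour from a per-pool counter without
	 * taking the pool mutex.
	 */
	u_int color = atomic_inc_int_nv(&pp->pr_color) % pp->pr_maxcolors;

	ph->ph_colored = addr + color * pp->pr_align;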

 
 
 so the question is if we do bring colouring back, how do we calculate it?
 arc4random? mask bits off ph_magic? atomic_inc something in the pool?
 read a counter from the pool? shift bits off the page address?
 
 the way i read it is that you have a per-pool running value pr_color that you
 increment by the item alignment or the native cache line size, modulo the
 space available, for every page you are getting from uvm.  however i can see
 that it might be a problem for the code that locates a page header (or was it
 the page boundary? don't have the code at hand) using simple math.
 
 the stuff that finds a page header for a page doesnt care about the address 
 of individual items within a page, and colouring doesnt change the fact that 
 an item is wholly contained within a page. ive run with arc4random_uniform 
 coloured addresses for a couple of weeks now without problems of that nature.




Re: pool page colouring

2014-11-04 Thread Ted Unangst
On Wed, Nov 05, 2014 at 09:38, David Gwynne wrote:
 
 On 30 Oct 2014, at 07:52, Ted Unangst t...@tedunangst.com wrote:

 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:


 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.

 there. i fixed it.
 
 looks like we were both ignorant and wrong. mikeb@ points out this from
 the original slab paper:
 
 4.1. Impact of Buffer Address Distribution on Cache Utilization
 
 [...] Suppose, for example, that every inode (∼300 bytes) is assigned a
 512-byte buffer, 512-byte aligned, and that only the first dozen fields
 of an inode (48 bytes) are frequently referenced. [...] In effect only
 9% (48/512) of the cache will be usable by inodes. [...]

except pool won't align a 300 byte inode on 512 byte boundaries.



Re: pool page colouring

2014-11-04 Thread David Gwynne

 On 5 Nov 2014, at 10:58, Ted Unangst t...@tedunangst.com wrote:
 
 On Wed, Nov 05, 2014 at 09:38, David Gwynne wrote:
 
 On 30 Oct 2014, at 07:52, Ted Unangst t...@tedunangst.com wrote:
 
 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:
 
 
 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.
 
 there. i fixed it.
 
 looks like we were both ignorant and wrong. mikeb@ points out this from
 the original slab paper:
 
 [slab paper section 4.1 elided; quoted in full earlier in the thread]
 
 except pool won't align a 300 byte inode on 512 byte boundaries.

im not sure thats the point of the example.



Re: pool page colouring

2014-11-04 Thread Ted Unangst
On Tue, Oct 28, 2014 at 16:49, David Gwynne wrote:
 when i shuffled the locking in pools around, page colouring was
 left behind.
 
 page colouring is where you offset items within a page if you have
 enough slack space. the previous implementation simply incremented
 the colour so each new page got the next offset. i didnt do this
 because the page and its items are now initted outside the lock,
 so maintaining that curcolour iterator wasnt as easy.
 
 this sidesteps the curcolor maintenance by just having each page
 randomly pick a colour when it's set up.
 
 tests? ok?

So after all that we're back to this (but updated to apply since I
broke it)? ok, why not? I was trying to save us the trouble, but maybe
that was a bad idea.

ok with me.


 
 Index: kern/subr_pool.c
 ===================================================================
 RCS file: /cvs/src/sys/kern/subr_pool.c,v
 retrieving revision 1.163
 diff -u -p -r1.163 subr_pool.c
 --- kern/subr_pool.c	13 Oct 2014 00:12:51 -0000	1.163
 +++ kern/subr_pool.c	28 Oct 2014 03:05:50 -0000
 @@ -299,8 +299,7 @@ pool_init(struct pool *pp, size_t size,
  	 */
  	space = POOL_INPGHDR(pp) ? pp->pr_phoffset : pp->pr_pgsize;
  	space -= pp->pr_itemsperpage * pp->pr_size;
 -	pp->pr_maxcolor = (space / align) * align;
 -	pp->pr_curcolor = 0;
 +	pp->pr_maxcolors = (space / align) + 1;
 
  	pp->pr_nget = 0;
  	pp->pr_nfail = 0;
 @@ -750,6 +749,8 @@ pool_p_alloc(struct pool *pp, int flags)
 
  	XSIMPLEQ_INIT(&ph->ph_itemlist);
  	ph->ph_page = addr;
 +	ph->ph_colored = addr +
 +	    arc4random_uniform(pp->pr_maxcolors) * pp->pr_align;
  	ph->ph_nmissing = 0;
  	arc4random_buf(&ph->ph_magic, sizeof(ph->ph_magic));
  #ifdef DIAGNOSTIC
 @@ -760,6 +761,7 @@ pool_p_alloc(struct pool *pp, int flags)
  	CLR(ph->ph_magic, POOL_MAGICBIT);
  #endif /* DIAGNOSTIC */
 
 +	addr = ph->ph_colored;
  	n = pp->pr_itemsperpage;
  	while (n--) {
  		pi = (struct pool_item *)addr;
 @@ -996,8 +998,8 @@ pool_print_pagelist(struct pool_pagelist
  	struct pool_item *pi;
 
  	LIST_FOREACH(ph, pl, ph_pagelist) {
 -		(*pr)("\t\tpage %p, nmissing %d\n",
 -		    ph->ph_page, ph->ph_nmissing);
 +		(*pr)("\t\tpage %p, color %p, nmissing %d\n",
 +		    ph->ph_page, ph->ph_colored, ph->ph_nmissing);
  		XSIMPLEQ_FOREACH(pi, &ph->ph_itemlist, pi_list) {
  			if (pi->pi_magic != POOL_IMAGIC(ph, pi)) {
  				(*pr)("\t\t\titem %p, magic 0x%lx\n",
 @@ -1021,8 +1023,8 @@ pool_print1(struct pool *pp, const char
  		modif++;
  	}
 
 -	(*pr)("POOL %s: size %u, align %u, roflags 0x%08x\n",
 -	    pp->pr_wchan, pp->pr_size, pp->pr_align,
 +	(*pr)("POOL %s: size %u, align %u, maxcolors %u, roflags 0x%08x\n",
 +	    pp->pr_wchan, pp->pr_size, pp->pr_align, pp->pr_maxcolors,
  	    pp->pr_roflags);
  	(*pr)("\talloc %p\n", pp->pr_alloc);
  	(*pr)("\tminitems %u, minpages %u, maxpages %u, npages %u\n",
 Index: sys/pool.h
 ===================================================================
 RCS file: /cvs/src/sys/sys/pool.h,v
 retrieving revision 1.53
 diff -u -p -r1.53 pool.h
 --- sys/pool.h	22 Sep 2014 01:04:58 -0000	1.53
 +++ sys/pool.h	28 Oct 2014 03:05:50 -0000
 @@ -128,8 +128,7 @@ struct pool {
  	RB_HEAD(phtree, pool_item_header)
  			pr_phtree;
 
 -	int		pr_maxcolor;	/* Cache colouring */
 -	int		pr_curcolor;
 +	u_int		pr_maxcolors;	/* Cache colouring */
  	int		pr_phoffset;	/* Offset in page of page header */
 
  	/*



Re: pool page colouring

2014-10-29 Thread Ted Unangst
On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:

 if you want it to go fast, it would make more sense to set the item
 alignment in pool_init to the size of the cacheline. colouring would then
 become irrelevant from a speed perspective.

There's some sense to this. Like round everything to nearest 64,
except things less than 64 (round to 16 or 32).
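
A sketch of that rounding rule (the helper name and thresholds are
illustrative, not a concrete proposal):

	/* sketch: round item sizes toward cacheline multiples */
	size_t
	pool_roundsize(size_t size)
	{
		if (size <= 16)
			return (16);
		if (size <= 32)
			return (32);
		return (roundup(size, 64));	/* roundup() from sys/param.h */
	}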



Re: pool page colouring

2014-10-29 Thread Ted Unangst
On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:


 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.

there. i fixed it.

Index: kern/subr_pool.c
===================================================================
RCS file: /cvs/src/sys/kern/subr_pool.c,v
retrieving revision 1.163
diff -u -p -r1.163 subr_pool.c
--- kern/subr_pool.c	13 Oct 2014 00:12:51 -0000	1.163
+++ kern/subr_pool.c	29 Oct 2014 21:49:38 -0000
@@ -82,7 +82,6 @@ struct pool_item_header {
 			ph_node;	/* Off-page page headers */
 	int		ph_nmissing;	/* # of chunks in use */
 	caddr_t		ph_page;	/* this page's address */
-	caddr_t		ph_colored;	/* page's colored address */
 	u_long		ph_magic;
 };
 #define POOL_MAGICBIT (1 << 3) /* keep away from perturbed low bits */
@@ -217,7 +216,7 @@ void
 pool_init(struct pool *pp, size_t size, u_int align, u_int ioff, int flags,
     const char *wchan, struct pool_allocator *palloc)
 {
-	int off = 0, space;
+	int off = 0;
 	unsigned int pgsize = PAGE_SIZE, items;
 #ifdef DIAGNOSTIC
 	struct pool *iter;
@@ -293,15 +292,6 @@ pool_init(struct pool *pp, size_t size, 
 	pp->pr_hardlimit_warning_last.tv_usec = 0;
 	RB_INIT(&pp->pr_phtree);
 
-	/*
-	 * Use the space between the chunks and the page header
-	 * for cache coloring.
-	 */
-	space = POOL_INPGHDR(pp) ? pp->pr_phoffset : pp->pr_pgsize;
-	space -= pp->pr_itemsperpage * pp->pr_size;
-	pp->pr_maxcolor = (space / align) * align;
-	pp->pr_curcolor = 0;
-
 	pp->pr_nget = 0;
 	pp->pr_nfail = 0;
 	pp->pr_nput = 0;
@@ -1232,7 +1222,7 @@ pool_walk(struct pool *pp, int full,
 	int n;
 
 	LIST_FOREACH(ph, &pp->pr_fullpages, ph_pagelist) {
-		cp = ph->ph_colored;
+		cp = ph->ph_page;
 		n = ph->ph_nmissing;
 
 		while (n--) {
@@ -1242,7 +1232,7 @@ pool_walk(struct pool *pp, int full,
 	}
 
 	LIST_FOREACH(ph, &pp->pr_partpages, ph_pagelist) {
-		cp = ph->ph_colored;
+		cp = ph->ph_page;
 		n = ph->ph_nmissing;
 
 		do {
Index: sys/pool.h
===================================================================
RCS file: /cvs/src/sys/sys/pool.h,v
retrieving revision 1.53
diff -u -p -r1.53 pool.h
--- sys/pool.h	22 Sep 2014 01:04:58 -0000	1.53
+++ sys/pool.h	29 Oct 2014 21:49:43 -0000
@@ -128,8 +128,6 @@ struct pool {
 	RB_HEAD(phtree, pool_item_header)
 			pr_phtree;
 
-	int		pr_maxcolor;	/* Cache colouring */
-	int		pr_curcolor;
 	int		pr_phoffset;	/* Offset in page of page header */
 
 	/*



Re: pool page colouring

2014-10-29 Thread Mike Belopuhov
On 29 October 2014 22:52, Ted Unangst t...@tedunangst.com wrote:
 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:


 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.

 there. i fixed it.


so is there any performance difference?



Re: pool page colouring

2014-10-28 Thread Ted Unangst
On Tue, Oct 28, 2014 at 16:49, David Gwynne wrote:
 when i shuffled the locking in pools around, page colouring was
 left behind.
 
 page colouring is where you offset items within a page if you have
 enough slack space. the previous implementation simply incremented
 the colour so each new page got the next offset. i didnt do this
 because the page and its items are now initted outside the lock,
 so maintaining that curcolour iterator wasnt as easy.
 
 this sidesteps the curcolor maintenance by just having each page
 randomly pick a colour when it's set up.

Would it make more sense to use the page address to pick the color?
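
A sketch of what that might look like (hypothetical fragment; pr_maxcolors
and pr_align stand in for whatever the pool would record):

	/*
	 * sketch: fold bits of the page address into a colour.  the
	 * colour is free to compute and stable for a given page.
	 */
	u_int color = ((u_long)addr >> PAGE_SHIFT) % pp->pr_maxcolors;

	ph->ph_colored = addr + color * pp->pr_align;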



Re: pool page colouring

2014-10-28 Thread Mike Belopuhov
On 28 October 2014 17:02, Ted Unangst t...@tedunangst.com wrote:
 On Tue, Oct 28, 2014 at 16:49, David Gwynne wrote:
 [colouring proposal elided; quoted in full above]

 Would it make more sense to use the page address to pick the color?


Does it actually still make sense to keep page coloring?  Is there still
benefit on modern hardware?



Re: pool page colouring

2014-10-28 Thread David Gwynne

 On 29 Oct 2014, at 2:44 am, Mike Belopuhov m...@belopuhov.com wrote:
 
 On 28 October 2014 17:02, Ted Unangst t...@tedunangst.com wrote:
 On Tue, Oct 28, 2014 at 16:49, David Gwynne wrote:
 [colouring proposal elided; quoted in full above]
 
 Would it make more sense to use the page address to pick the color?

yeah. or we could derive it from a counter in the pool, like the item or page 
get counters.

 Does it actually still make sense to keep page coloring?  Is there still
 benefit on modern hardware?

if you want it to go fast, it would make more sense to set the item alignment 
in pool_init to the size of the cacheline. colouring would then become 
irrelevant from a speed perspective.
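
for example, a pool_init() call along these lines (the pool and its name are
made up; CACHELINESIZE is assumed to be the machine's cacheline size):

	/* sketch: cacheline-align items at pool creation time */
	pool_init(&somepool, sizeof(struct something), CACHELINESIZE, 0,
	    0, "somepl", NULL);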

however, if colouring is more about perturbing item addresses then it may still 
be worth it. eg, if you only fit one item on a page, without colouring your 
item addresses will always be on a page boundary. moving it around might flush 
out assumptions about low bits in addresses.

i dunno. im fine with either removing colouring altogether or setting it from 
something else completely. i just want a decision to be made cos right now 
ph_color isnt set, which is a bug.