* Lai Jiangshan ([email protected]) wrote:
> On 11/22/2011 06:20 PM, Mathieu Desnoyers wrote:
> > * Lai Jiangshan ([email protected]) wrote:
> >> The old memory management is named LFHT_MEMORY_ORDER.
> >>
> >> Signed-off-by: Lai Jiangshan <[email protected]>
> >> ---
> >>  rculfhash.c |  116 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> >>  1 files changed, 111 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/rculfhash.c b/rculfhash.c
> >> index b7be893..ae1177f 100644
> >> --- a/rculfhash.c
> >> +++ b/rculfhash.c
> >> @@ -172,6 +172,10 @@
> >>  #define dbg_printf(fmt, args...)
> >>  #endif
> >>
> >> +#if !defined(LFHT_MEMORY_ORDER) && !defined(LFHT_MEMORY_CHUNK)
> >> +#define LFHT_MEMORY_ORDER
> >> +#endif
> >> +
> >>  /*
> >>   * Split-counters lazily update the global counter each 1024
> >>   * addition/removal. It automatically keeps track of resize required.
> >> @@ -195,6 +199,8 @@
> >>  #define MAX_TABLE_ORDER	64
> >>  #endif
> >>
> >> +#define MAX_CHUNK_TABLE	(1UL << 10)
> >> +
> >>  /*
> >>   * Minimum number of bucket nodes to touch per thread to parallelize grow/shrink.
> >>   */
> >> @@ -247,6 +253,7 @@ struct rcu_table {
> >>  	unsigned long resize_target;
> >>  	int resize_initiated;
> >>
> >> +#ifdef LFHT_MEMORY_ORDER
> >>  	/*
> >>  	 * Contains the per order-index-level bucket node table. The size
> >>  	 * of each bucket node table is half the number of hashes contained
> >> @@ -255,6 +262,7 @@ struct rcu_table {
> >>  	 * levels to improve cache locality for small index orders.
> >>  	 */
> >>  	struct cds_lfht_node *tbl[MAX_TABLE_ORDER];
> >> +#endif
> >>  };
> >>
> >> @@ -289,6 +297,16 @@ struct cds_lfht {
> >>  	pthread_attr_t *resize_attr;	/* Resize threads attributes */
> >>  	long count;			/* global approximate item count */
> >>  	struct ht_items_count *split_count;	/* split item count */
> >> +
> >> +#ifdef LFHT_MEMORY_CHUNK
> >> +	/*
> >> +	 * Contains the bucket node chunks. The size of each bucket node
> >> +	 * chunk is ->min_alloc_size(avoid to alloc chunks with different
> >> +	 * size). chunks improve cache locality for small index orders
> >> +	 * and improve lookup_bucket() for the archs without hardware fls().
> >> +	 */
> >> +	struct cds_lfht_node *tbl[0];
> >> +#endif
> >>  };
> >>
> >> @@ -753,6 +771,93 @@ unsigned long _uatomic_xchg_monotonic_increase(unsigned long *ptr,
> >>  	return old2;
> >>  }
> >>
> >> +#ifdef LFHT_MEMORY_CHUNK
> >> +static
> >> +struct cds_lfht *_cds_lfht_alloc(unsigned long min_alloc_size,
> >> +		unsigned long max_size)
> >> +{
> >> +	struct cds_lfht *ht;
> >> +
> >> +	min_alloc_size = max(min_alloc_size, max_size / MAX_CHUNK_TABLE);
> >> +	ht = calloc(1, sizeof(struct cds_lfht) +
> >> +			sizeof(ht->tbl[0]) * (max_size / min_alloc_size));
> >
> > Hrm, this first level of indirection quickly becomes much larger than
> > the order-based table. Have you measured gains with this approach?
> >
> > Can you detail the use-case for this approach?

> I guess sometimes max_bucket_size/normal_bucket_size is not very high
> (I also guess this is quite common in practice), so users can use a
> bigger min_alloc_size (bigger chunk size) and a smaller chunk pointer
> table.
>
> (see "History" of memory management below)

> > Have you measured the overhead of software fls() compared to the
> > overhead of the increased cache footprint of the larger table for the
> > chunk-based approach?

> Sorry, I also guessed. So I just said this memory management is simpler,
> and better when there is no hardware fls(); not "better performance".
> Let the users make the decision.
>
> It seems it is very hard to measure.
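To make the fls() trade-off under discussion concrete, here is a minimal
sketch of the two bucket-lookup schemes. The simplified node layout, the
software fls_ulong() fallback, and the function names are illustrative
assumptions, not the actual rculfhash code:

	#include <stdio.h>

	struct cds_lfht_node {			/* simplified for this sketch */
		struct cds_lfht_node *next;
		unsigned long reverse_hash;
	};

	/* Software fallback for archs without a hardware fls() instruction. */
	static unsigned int fls_ulong(unsigned long x)
	{
		unsigned int r = 0;

		while (x) {
			x >>= 1;
			r++;
		}
		return r;
	}

	/* Order-based: each lookup needs an fls() to pick the level. */
	static struct cds_lfht_node *
	order_bucket_at(struct cds_lfht_node **tbl, unsigned long index)
	{
		unsigned long order;

		if (!index)
			return &tbl[0][0];
		order = fls_ulong(index);   /* index is in [2^(order-1), 2^order) */
		return &tbl[order][index & ((1UL << (order - 1)) - 1)];
	}

	/* Chunk-based: a plain divide/modulo, no fls() needed; both reduce
	 * to shifts when chunk_size is a power of two. */
	static struct cds_lfht_node *
	chunk_bucket_at(struct cds_lfht_node **tbl, unsigned long chunk_size,
			unsigned long index)
	{
		return &tbl[index / chunk_size][index % chunk_size];
	}

	int main(void)
	{
		struct cds_lfht_node o0[1], o1[1], o2[2];  /* order levels 0..2 */
		struct cds_lfht_node c0[2], c1[2];         /* two chunks of size 2 */
		struct cds_lfht_node *order_tbl[3] = { o0, o1, o2 };
		struct cds_lfht_node *chunk_tbl[2] = { c0, c1 };

		/* Bucket index 3 resolves to the last slot in both schemes. */
		printf("%d %d\n", order_bucket_at(order_tbl, 3) == &o2[1],
		       chunk_bucket_at(chunk_tbl, 2, 3) == &c1[1]);
		return 0;
	}

The order-based scheme pays an fls() per lookup but keeps the first-level
table tiny (one pointer per order); the chunk-based scheme trades that for
a larger flat pointer table, which is the cache-footprint concern raised
above.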
not "better performance", let the users > make the decision. > > It seems it is very hard to measure. > > "History" of memory management: > > I was thinking what is not good when this rculfhash is merged into linux > kernel. > the bigger order memory allocation will be an obstacle. we can not allocate > big memory in kernel, we have to allocate small pieces(pages), and we have two > choices: > 1) (v)map the pages as continuous order memory. > 2) use discontinuous pages as chunks. > > 2) ===========> I got this LFHT_MEMORY_CHUNK. > > 1) ===========> since we need to do mapping, why not map them all continuous, > and map them when demanded, I got LFHT_MEMORY_RESERVED. > (it introduces a new obstacle, we require big memory space, > but 64BITS can provide it easier.)
Ah, I see. So chunk on 32-bit (with page allocation), and reserved on
64-bit. Yep, this makes sense. So having these as plugins would be really
interesting.

Thanks,

Mathieu

> The paper of Ori Shalev and Nir Shavit is actually a special
> LFHT_MEMORY_RESERVED.

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
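On the plugin idea: one natural shape is a small ops structure that each
memory-management scheme fills in, with the hash-table core calling through
it. This is only a sketch of what such an interface could look like; the
struct and field names are hypothetical, not a published API:

	struct cds_lfht;	/* hash table, defined by the core */
	struct cds_lfht_node;

	struct lfht_mm_ops {
		/* Allocate the struct cds_lfht plus whatever first-level
		 * table the scheme needs (order array, chunk pointers, or
		 * a reserved mapping). */
		struct cds_lfht *(*alloc_cds_lfht)(unsigned long min_alloc_size,
						   unsigned long max_size);
		/* Grow/shrink hooks, called per index order as the table
		 * is resized. */
		void (*alloc_bucket_table)(struct cds_lfht *ht,
					   unsigned long order);
		void (*free_bucket_table)(struct cds_lfht *ht,
					  unsigned long order);
		/* The lookup primitive the three schemes differ on. */
		struct cds_lfht_node *(*bucket_at)(struct cds_lfht *ht,
						   unsigned long index);
	};

	/* A caller could then pick, e.g., a chunk backend on 32-bit or a
	 * reserved/mmap backend on 64-bit at table-creation time. */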
