Nice!  Valmon's legacy lives on.

Barret Rhoden <[email protected]> wrote on Wed, Dec 13, 2017 at 14:38:
> Hi -
>
> File this under "better late than never."
>
> I merged this patch and added a few more patches to make the changes I
> mentioned in the code review and to fix a couple other bugs.
>
> For instance, the topology stuff wasn't identifying NUMA nodes properly
> and was turning off - at least for all of my machines.  I fixed that up
> and a few other topology-related things.
>
> The most user-visible change is that 'prov' grants cores based on a list
> (like perf) instead of single items, and it can spawn processes.  That
> required moving some perf code to parlib.
>
> Here's an example of prov with the new 'packed' scheduler:
>
> bash-4.3$ prov -tc -v39-46 pthread_test 1000 1000000 10 &
>
> bash-4.3$ prov -s
> ...
> Core 37, prov: 0(0x0000000000000000) alloc: 0(0x0000000000000000)
> Core 38, prov: 0(0x0000000000000000) alloc: 42(0xffff8000150d2100)
> Core 39, prov: 42(0xffff8000150d2100) alloc: 42(0xffff8000150d2100)
> Core 40, prov: 42(0xffff8000150d2100) alloc: 42(0xffff8000150d2100)
> Core 41, prov: 42(0xffff8000150d2100) alloc: 42(0xffff8000150d2100)
> Core 42, prov: 42(0xffff8000150d2100) alloc: 42(0xffff8000150d2100)
> Core 43, prov: 42(0xffff8000150d2100) alloc: 42(0xffff8000150d2100)
> Core 44, prov: 42(0xffff8000150d2100) alloc: 42(0xffff8000150d2100)
> Core 45, prov: 42(0xffff8000150d2100) alloc: 42(0xffff8000150d2100)
> Core 46, prov: 42(0xffff8000150d2100) alloc: 42(0xffff8000150d2100)
> Core 47, prov: 0(0x0000000000000000) alloc: 42(0xffff8000150d2100)
> Core 48, prov: 0(0x0000000000000000) alloc: 0(0x0000000000000000)
> ...
>
> Note the scheduler gave out the provisioned cores first (as usual).
> When it looked for more cores, it gave out cores that are hyperthreaded
> with existing cores (38 is a sibling of 39, and 47 is a sibling of 46).
> If it gave out any more cores, they would come from the same socket as
> the existing cores.
>
> I haven't tested it much, so we'll see.
> Nothing requires a toolchain rebuild, but some of the parlib header
> changes actually affect it.  So rebuild if you're paranoid.
>
>
> The following changes since commit ef39a1bbb55afb3e89f311e474cd8cb906cb49ca:
>
>   vmm: Provide a fast-path for IPIs in the kernel (2017-12-12 14:51:57 -0500)
>
> are available in the Git repository at:
>
>   [email protected]:brho/akaros.git topo
>
> for you to fetch changes up to 3d407708b8bc4bb5f202aad7582f1a40034e37a3:
>
>   sched: Slightly fix up tests/prov (2017-12-13 17:34:46 -0500)
>
> ----------------------------------------------------------------
> View this online at:
> https://github.com/brho/akaros/compare/ef39a1bbb55a...3d407708b8bc
>
> ----------------------------------------------------------------
> Barret Rhoden (12):
>       vmx: Squelch per-core startup messages
>       x86: Fix topology detection
>       sched: Remove the idle core interface
>       sched: Fix packed initialization
>       sched: When disabling SMT, turn off the odd cores
>       sched: Touch up the packed scheduler
>       parlib: Make bitmask.h more compilable
>       parlib: Move perf's xlib to parlib
>       perf: Rename the ros_ core_set code
>       parlib: Move core_sets to parlib
>       parlib: Move the provisioning of cores to a PID
>       sched: Slightly fix up tests/prov
>
> Kevin Klues (1):
>       Add implementation of packed core alloc strategy
>
>  Kconfig                                     |   8 +
>  kern/arch/x86/topology.c                    |  11 +-
>  kern/arch/x86/vmm/intel/vmx.c               |   1 -
>  kern/include/corealloc_packed.h             |  46 ++
>  kern/include/corerequest.h                  |  11 +-
>  kern/include/schedule.h                     |  10 -
>  kern/src/Kbuild                             |   1 +
>  kern/src/corealloc_fcfs.c                   |  82 +---
>  kern/src/corealloc_packed.c                 | 468 +++++++++++++++++++++
>  kern/src/schedule.c                         |  27 +-
>  tests/prov.c                                | 122 ++----
>  tools/dev-util/perf/Makefile                |   2 +-
>  tools/dev-util/perf/akaros.h                |  50 ---
>  tools/dev-util/perf/perf.c                  |  27 +-
>  tools/dev-util/perf/perf_core.c             |   2 +-
>  tools/dev-util/perf/perf_core.h             |   2 +-
>  tools/dev-util/perf/xlib.c                  | 149 -------
>  tools/dev-util/perf/xlib.h                  |  14 +-
>  .../perf/akaros.c => user/parlib/core_set.c |  47 ++-
>  user/parlib/include/parlib/bitmask.h        |  21 +-
>  user/parlib/include/parlib/core_set.h       |  43 ++
>  user/parlib/include/parlib/parlib.h         |   2 +
>  user/parlib/include/parlib/x86/bitmask.h    |  20 +-
>  user/parlib/include/parlib/xlib.h           |  26 ++
>  user/parlib/parlib.c                        |  19 +
>  user/parlib/xlib.c                          | 163 +++++
>  26 files changed, 894 insertions(+), 480 deletions(-)
>  create mode 100644 kern/include/corealloc_packed.h
>  create mode 100644 kern/src/corealloc_packed.c
>  delete mode 100644 tools/dev-util/perf/akaros.h
>  rename tools/dev-util/perf/akaros.c => user/parlib/core_set.c (68%)
>  create mode 100644 user/parlib/include/parlib/core_set.h
>  create mode 100644 user/parlib/include/parlib/xlib.h
>  create mode 100644 user/parlib/xlib.c
>
> Thanks,
> Barret
>
>
> On 2015-12-03 at 15:34 Barret Rhoden <[email protected]> wrote:
> > Hi -
> >
> > Couple minor things, listed below.
> >
> >
> > On 2015-11-25 at 06:42 Kevin Klues <[email protected]> wrote:
> > > This change introduces the packed core allocation strategy that Valmon
> > > and I worked on over the summer.  The idea is to keep cores allocated
> > > to the same process as tightly packed as possible, while keeping cores
> > > allocated to different processes as far away from each other as
> > > possible.
> > >
> > > The existing FCFS core allocation strategy is still the default, but
> > > the new packed strategy can be selected at compile time via a config
> > > variable (CONFIG_COREALLOC_PACKED).
> > >
> > > The plan is to keep the existing strategy as the default for now, but
> > > migrate to the new strategy once we're able to concretely measure the
> > > benefits of it.
> > >
> > > The changes in this request can be viewed online at:
> > >
> > > https://github.com/brho/akaros/compare/abe793d...0780d03
> > >
> > > The following changes since commit abe793d7c7507cbd6732ed2d794916ae295523ad:
> > >
> > >   Added test for devarch MSR file (2015-11-24 15:39:05 -0500)
> > >
> > > are available in the git repository at:
> > >
> > >   [email protected]:klueska/akaros corealloc-packed
> >
> > > From eb630798aa2ac398870687ab291d1fbe04034457 Mon Sep 17 00:00:00 2001
> > > From: Kevin Klues <[email protected]>
> > > Date: Mon, 5 Oct 2015 19:42:22 -0700
> > > Subject: Add implementation of packed core alloc strategy
> >
> > > --- /dev/null
> > > +++ b/kern/src/corealloc_packed.c
> > > @@ -0,0 +1,508 @@
> >
> > > +enum pnode_type { CORE, CPU, SOCKET, NUMA, MACHINE, NUM_NODE_TYPES };
> > > +static char pnode_label[5][8] = { "CORE", "CPU", "SOCKET", "NUMA",
> > > +                                  "MACHINE" };
> >
> > The 5 and 8 seem a little janky.  I guess 5 is NUM_NODE_TYPES and 8 is
> > the max of the strlen + 1?
> >
> > > +/* Create a node and initialize it.  This code assumes that child
> > > + * nodes are created before parent nodes. */
> > > +static void init_nodes(int type, int num, int nchildren)
> > > +{
> > > +	/* Initialize the lookup table for this node type. */
> > > +	num_nodes[type] = num;
> > > +	node_lookup[type] = all_nodes;
> > > +	for (int i = CORE; i < type; i++)
> > > +		node_lookup[type] += num_nodes[i];
> >
> > This is a little mystical at first.  From this, it looks like all_nodes
> > is laid out such that all of the nodes are: CORES, then the CPUS, then
> > the SOCKETS, then NUMA, then one for MACHINE.  The nodes at any given
> > layer are later broken up into chunks of n and assigned to their
> > parents (down below).  Can yinz put something in here that explains
> > this layout?
> >
> > > +/* Allocate a table of distances from one core to another.
> > > Cores on the same
> > > + * cpu have a distance of 1; same socket have a distance of 2; same
> > > + * numa -> 3; same machine -> 4. */
> > > +static void init_core_distances(void)
> > > +{
> > > +	core_distance = kzmalloc(num_cores * sizeof(int*), 0);
> > > +	if (core_distance == NULL)
> > > +		panic("Out of memory!\n");
> >
> > Can use KMALLOC_WAIT in lieu of panicking.
> >
> > > +	for (int i = 0; i < num_cores; i++) {
> > > +		core_distance[i] = kzmalloc(num_cores * sizeof(int), 0);
> > > +		if (core_distance[i] == NULL)
> > > +			panic("Out of memory!\n");
> >
> > Can use KMALLOC_WAIT in lieu of panicking.
> >
> > > +	}
> > > +	for (int i = 0; i < num_cores; i++) {
> > > +		for (int j = 0; j < num_cores; j++) {
> > > +			for (int k = CPU; k <= MACHINE; k++) {
> > > +				if (i/num_descendants[k][CORE] ==
> > > +				    j/num_descendants[k][CORE]) {
> > > +					core_distance[i][j] = k;
> >
> > This took me a little bit to sort out.
> >
> > Two cores have the same level id (and thus are at distance k) if their
> > IDs divided by the number of cores per level are the same, due to how
> > we number our cores.  i.e., with some suitably documented helper:
> >
> > int get_level_id(int coreid, int level)
> > {
> > 	return coreid / num_descendants[level][CORE];
> > }
> >
> > if (get_level_id(i, k) == get_level_id(j, k))
> > 	core_distance[i][j] = k;
> >
> > Can yinz either use a helper or add a line or two to explain that?
> >
> > > +/* Initialize any data associated with doing core allocation. */
> > > +void corealloc_init(void)
> > > +{
> > > +	void *nodes_and_cores;
> > > +
> > > +	/* Allocate a flat array of nodes. */
> > > +	total_nodes = num_cores + num_cpus + num_sockets + num_numa + 1;
> > > +	nodes_and_cores = kmalloc(total_nodes * sizeof(struct sched_pnode) +
> > > +				  num_cores * sizeof(struct sched_pcore), 0);
> >
> > That kmalloc should take either KMALLOC_WAIT or check the return value.
> > >
> > > +/* Returns the sum of the distances from one core to all cores in
> > > + * a list. */
> > > +static int cumulative_core_distance(struct sched_pcore *c,
> > > +                                    struct sched_pcore_tailq cl)
> > > +{
> > > +	int d = 0;
> > > +	struct sched_pcore *temp = NULL;
> > > +
> > > +	TAILQ_FOREACH(temp, &cl, alloc_next) {
> >
> > This assumes that the cl list is a list of allocated cores, not any
> > generic list.  Yinz only use it on allocated lists, so that's fine.
> > We probably need a comment about that though.
> >
> > > +/* Track the pcore properly when it is allocated to p.  This code
> > > + * assumes that the scheduler that uses it holds a lock for the
> > > + * duration of the call. */
> > > +void __track_core_alloc(struct proc *p, uint32_t pcoreid)
> > > +{
> > > +	struct sched_pcore *spc;
> > > +
> > > +	assert(pcoreid < num_cores);	/* catch bugs */
> > > +	spc = pcoreid2spc(pcoreid);
> > > +	assert(spc->alloc_proc != p);	/* corruption or double-alloc */
> > > +	spc->alloc_proc = p;
> > > +	/* if the pcore is prov to them and now allocated, move lists */
> > > +	if (spc->prov_proc == p) {
> > > +		TAILQ_REMOVE(&p->ksched_data.crd.prov_not_alloc_me, spc,
> > > +			     prov_next);
> > > +		TAILQ_INSERT_TAIL(&p->ksched_data.crd.prov_alloc_me, spc,
> > > +				  prov_next);
> > > +	}
> > > +	/* Actually allocate the core, removing it from the idle core
> > > +	 * list. */
> > > +	TAILQ_INSERT_TAIL(&p->ksched_data.crd.alloc_me, spc, alloc_next);
> > > +	TAILQ_REMOVE(&idlecores, spc, alloc_next);
> >
> > The order of these two might be backwards.  You probably want to
> > remove it before adding it to another list, esp since alloc_next is
> > used for both lists.
> >
> > > +/* Get an idle core from our pcore list and return its core_id.
> > > + * Don't consider the chosen core in the future when handing out
> > > + * cores to a process.  This code assumes that the scheduler that
> > > + * uses it holds a lock for the duration of the call.  This will
> > > + * not give out provisioned cores.
> > > + */
> > > +int __get_any_idle_core(void)
> > > +{
> > > +	struct sched_pcore *spc;
> > > +	int ret = -1;
> > > +
> > > +	for (int i = 0; i < num_cores; i++) {
> > > +		spc = &all_pcores[i];
> > > +
> > > +		if (spc->alloc_proc == NULL) {
> > > +			spc->alloc_proc = UNNAMED_PROC;
> > > +			ret = spc->core_info->core_id;
> >
> > Does this code also need to do something with the refcnt, like yinz
> > did when initializing the scheduler and removing cores?
> >
> > The steps for removing a core from consideration for allocation could
> > use a helper, which we can then use here.  And we'd need one for
> > returning it (for put_idle_core).
> >
> > Also, do these __get_ routines need to remove the core from the idle
> > core list?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Akaros" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> For more options, visit https://groups.google.com/d/optout.

--
~Kevin
