Hi -

Couple minor things, listed below.


On 2015-11-25 at 06:42 Kevin Klues <[email protected]> wrote:
> This change introduces the packed core allocation strategy that Valmon
> and I worked on over the summer. The idea is to keep cores allocated
> to the same process as tightly packed as possible, while keeping cores
> allocated to different processes as far away from each other as
> possible.
> 
> The existing FCFS core allocation strategy is still the default, but
> the new packed strategy can be selected at compile time via a config
> variable (CONFIG_COREALLOC_PACKED).
> 
> The plan is to keep the existing strategy as the default for now, but
> migrate to the new strategy once we're able to concretely measure the
> benefits of it.
> 
> The changes in this request can be viewed online at:
> 
>     https://github.com/brho/akaros/compare/abe793d...0780d03
> 
> The following changes since commit
> abe793d7c7507cbd6732ed2d794916ae295523ad:
> 
>   Added test for devarch MSR file (2015-11-24 15:39:05 -0500)
> 
> are available in the git repository at:
> 
>   [email protected]:klueska/akaros corealloc-packed

> From eb630798aa2ac398870687ab291d1fbe04034457 Mon Sep 17 00:00:00 2001
> From: Kevin Klues <[email protected]>
> Date: Mon, 5 Oct 2015 19:42:22 -0700
> Subject: Add implementation of packed core alloc strategy

> --- /dev/null
> +++ b/kern/src/corealloc_packed.c
> @@ -0,0 +1,508 @@

> +enum pnode_type { CORE, CPU, SOCKET, NUMA, MACHINE, NUM_NODE_TYPES };
> +static char pnode_label[5][8] = { "CORE", "CPU", "SOCKET", "NUMA", "MACHINE" };

The 5 and 8 seem a little janky.  I guess 5 is NUM_NODE_TYPES and 8 is the max
of the strlen + 1?
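If so, something like this (untested, just a sketch) would keep the sizes from
being magic - the first dimension tied to the enum, the second still the
longest label plus the NUL:

```c
#include <assert.h>
#include <string.h>

enum pnode_type { CORE, CPU, SOCKET, NUMA, MACHINE, NUM_NODE_TYPES };

/* Size the first dimension from the enum so the table can't drift out of
 * sync with it; 8 is still strlen("MACHINE") + 1 for the terminator. */
static char pnode_label[NUM_NODE_TYPES][8] = {
	[CORE]    = "CORE",
	[CPU]     = "CPU",
	[SOCKET]  = "SOCKET",
	[NUMA]    = "NUMA",
	[MACHINE] = "MACHINE",
};
```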

> +/* Create a node and initialize it. This code assumes that child are created
> + * before parent nodes. */
> +static void init_nodes(int type, int num, int nchildren)
> +{
> +     /* Initialize the lookup table for this node type. */
> +     num_nodes[type] = num;
> +     node_lookup[type] = all_nodes;
> +     for (int i = CORE; i < type; i++)
> +             node_lookup[type] += num_nodes[i];

This is a little mystical at first.  From this, it looks like all_nodes is laid
out such that all of the nodes are: CORES, then the CPUS, then the SOCKETS, then
NUMA, then one for MACHINE.  The nodes at any given layer are later broken up
into chunks of n and assigned to their parents (down below).  Can yinz put
something in here that explains this layout?
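To be concrete about what I mean (toy topology, helper name made up - this is
just the layout I reverse-engineered from the loop):

```c
#include <assert.h>

enum pnode_type { CORE, CPU, SOCKET, NUMA, MACHINE, NUM_NODE_TYPES };

/* Toy topology: 8 cores, 4 cpus, 2 sockets, 1 numa domain, 1 machine.
 * all_nodes is one flat array with each type's nodes laid out in order:
 *
 *   index:  0..7    8..11   12..13   14     15
 *   type:   CORE    CPU     SOCKET   NUMA   MACHINE
 *
 * so the lookup table for a type is just all_nodes offset by the total
 * count of all lower-type nodes. */
static int num_nodes[NUM_NODE_TYPES] = { 8, 4, 2, 1, 1 };

/* Offset of the first node of 'type' within the flat array. */
static int node_offset(int type)
{
	int off = 0;

	for (int i = CORE; i < type; i++)
		off += num_nodes[i];
	return off;
}
```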

> +/* Allocate a table of distances from one core to an other. Cores on the same
> + * cpu have a distance of 1; same socket have a distance of 2; same numa -> 3;
> + * same machine -> 4; */
> +static void init_core_distances(void)
> +{
> +     core_distance = kzmalloc(num_cores * sizeof(int*), 0);
> +     if (core_distance == NULL)
> +             panic("Out of memory!\n");

Can use KMALLOC_WAIT in lieu of panicking.

> +     for (int i = 0; i < num_cores; i++) {
> +             core_distance[i] = kzmalloc(num_cores * sizeof(int), 0);
> +             if (core_distance[i] == NULL)
> +                     panic("Out of memory!\n");

Can use KMALLOC_WAIT in lieu of panicking.

> +     }
> +     for (int i = 0; i < num_cores; i++) {
> +             for (int j = 0; j < num_cores; j++) {
> +                     for (int k = CPU; k <= MACHINE; k++) {
> +                             if (i/num_descendants[k][CORE] ==
> +                                     j/num_descendants[k][CORE]) {
> +                                     core_distance[i][j] = k;

This took me a little bit to sort out.

Two cores have the same level id (and thus are at distance k) if their IDs
divided by the number of cores per node at that level are the same, due to how
we number our cores.  i.e. with some suitably documented helper:

        int get_level_id(int coreid, int level)
        {
                return coreid / num_descendants[level][CORE];
        }

        if (get_level_id(i, k) == get_level_id(j, k))
                core_distance[i][j] = k;

Can yinz either use a helper or add a line or two to explain that?
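Fleshing that helper out with toy num_descendants values (2 cores per cpu, 4
per socket, 8 per numa - made up, just to show the behavior I'm describing):

```c
#include <assert.h>

enum pnode_type { CORE, CPU, SOCKET, NUMA, MACHINE, NUM_NODE_TYPES };

/* Toy topology: 2 cores per cpu, 4 per socket, 8 per numa and machine. */
static int num_descendants[NUM_NODE_TYPES][NUM_NODE_TYPES] = {
	[CPU]     = { [CORE] = 2 },
	[SOCKET]  = { [CORE] = 4 },
	[NUMA]    = { [CORE] = 8 },
	[MACHINE] = { [CORE] = 8 },
};

/* ID of the level-'level' node containing coreid; relies on cores being
 * numbered consecutively within each node of the topology. */
static int get_level_id(int coreid, int level)
{
	return coreid / num_descendants[level][CORE];
}
```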


> +/* Initialize any data assocaited with doing core allocation. */
> +void corealloc_init(void)
> +{
> +     void *nodes_and_cores;
> +
> +     /* Allocate a flat array of nodes. */
> +     total_nodes = num_cores + num_cpus + num_sockets + num_numa + 1;
> +     nodes_and_cores = kmalloc(total_nodes * sizeof(struct sched_pnode) +
> +                               num_cores * sizeof(struct sched_pcore), 0);

That kmalloc should take either KMALLOC_WAIT or check the return value.


> +/* Returns the sum of the distances from one core to all cores in a list. */
> +static int cumulative_core_distance(struct sched_pcore *c,
> +                                    struct sched_pcore_tailq cl)
> +{
> +     int d = 0;
> +     struct sched_pcore *temp = NULL;
> +
> +     TAILQ_FOREACH(temp, &cl, alloc_next) {

This assumes that the cl list is a list of allocated cores, not any generic
list.  Yinz only use it on allocated lists, so that's fine.  We probably need a
comment about that though.


> +/* Track the pcore properly when it is allocated to p. This code assumes that
> + * the scheduler that uses it holds a lock for the duration of the call. */
> +void __track_core_alloc(struct proc *p, uint32_t pcoreid)
> +{
> +     struct sched_pcore *spc;
> +
> +     assert(pcoreid < num_cores);    /* catch bugs */
> +     spc = pcoreid2spc(pcoreid);
> +     assert(spc->alloc_proc != p);   /* corruption or double-alloc */
> +     spc->alloc_proc = p;
> +     /* if the pcore is prov to them and now allocated, move lists */
> +     if (spc->prov_proc == p) {
> +             TAILQ_REMOVE(&p->ksched_data.crd.prov_not_alloc_me, spc, prov_next);
> +             TAILQ_INSERT_TAIL(&p->ksched_data.crd.prov_alloc_me, spc, prov_next);
> +     }
> +     /* Actually allocate the core, removing it from the idle core list. */
> +     TAILQ_INSERT_TAIL(&p->ksched_data.crd.alloc_me, spc, alloc_next);
> +     TAILQ_REMOVE(&idlecores, spc, alloc_next);

The order of these two might be backwards.  You probably want to remove it
before adding it to another list, esp since alloc_next is used for both lists.


> +/* Get an idle core from our pcore list and return its core_id. Don't
> + * consider the chosen core in the future when handing out cores to a
> + * process. This code assumes that the scheduler that uses it holds a lock
> + * for the duration of the call. This will not give out provisioned cores. */
> +int __get_any_idle_core(void)
> +{
> +     struct sched_pcore *spc;
> +     int ret = -1;
> +
> +     for (int i = 0; i < num_cores; i++) {
> +             struct sched_pcore *c = &all_pcores[i];
> +
> +             if (spc->alloc_proc == NULL) {
> +                     spc->alloc_proc = UNNAMED_PROC;
> +                     ret = spc->core_info->core_id;

Does this code also need to do something with the refcnt, like yinz did when
initializing the scheduler and removing cores?

The steps for removing a core from consideration for allocation could use a
helper, which we can then use here.  And we'd need one for returning it (for
put_idle_core).

Also, do these __get_ routines need to remove the core from the idle core list?
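Something like this pair is the shape I'm imagining - names made up, sketched
against a toy idle list, and the real versions would also handle the refcnt:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct proc;
#define UNNAMED_PROC ((struct proc*)-1)	/* stand-in for the real sentinel */

struct sched_pcore {
	int core_id;
	struct proc *alloc_proc;
	TAILQ_ENTRY(sched_pcore) alloc_next;
};
TAILQ_HEAD(sched_pcore_tailq, sched_pcore);

static struct sched_pcore_tailq idlecores = TAILQ_HEAD_INITIALIZER(idlecores);

/* Pull spc out of allocation consideration: mark it taken and drop it
 * from the idle list. */
static void __remove_from_idle(struct sched_pcore *spc)
{
	spc->alloc_proc = UNNAMED_PROC;
	TAILQ_REMOVE(&idlecores, spc, alloc_next);
}

/* Inverse helper, for __put_idle_core. */
static void __return_to_idle(struct sched_pcore *spc)
{
	spc->alloc_proc = NULL;
	TAILQ_INSERT_TAIL(&idlecores, spc, alloc_next);
}
```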


-- 
You received this message because you are subscribed to the Google Groups 
"Akaros" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
For more options, visit https://groups.google.com/d/optout.