On Fri, 06.01.17 11:59, Marcelo Tosatti (mtosa...@redhat.com) wrote:

> Cache Allocation Technology is a feature on selected recent Intel Xeon
> processors which allows control over L3 cache allocation.

What precisely is the benefit of making this configurable? Can you
describe a basic usecase why I'd want to set this on some service?

> Index: systemd/man/systemd.exec.xml
> ===================================================================
> --- systemd.orig/man/systemd.exec.xml
> +++ systemd/man/systemd.exec.xml

Hmm, this doesn't look like a git patch? We generally prefer
submissions as github PRs these days, btw.

> @@ -233,6 +233,31 @@
>        </varlistentry>
>  
>        <varlistentry>
> +        <term><varname>RDTCacheReservation=</varname></term>
> +
> +        <listitem><para>Creates a L3 CAT reservation in resctrlfs
> +        for the executed processes. Takes a ";" separated list of
> +        tuples (optionally triples) containing type,size and
> +        optionally cacheid:

if I read this, then I am puzzled what "RDT Cache" is, what "L3 CAT"
is. Can you explain?

> +            type=type,size=size,cacheid=id;
> +            type=type,size=size,cacheid=id;...
> +
> +        Where:
> +
> +            type is one of: both, data, code.
> +            size is size in kbytes.
> +            cacheid (optional) is the cacheid for
> +            the reservation.

what is a "cacheid"?

> +
> +        Rules for the parameters:
> +            * type=code must be paired with type=data entry.
> +
> +        See the output of resctrltool for more details.
> +
> +        </para></listitem>
> +      </varlistentry>

This syntax doesn't appear to be in line with how we'd do this in
systemd usually. Ideally, we try to hide the fact that our settings
syntax is one-dimensional as much as we can. Hence, a syntax like this
sounds more appropriate to me:

       RDTCacheReservation=4K code
       RDTCacheReservation=16K data

and so on...

But then again, I have no idea what all of this actually really is,
hence I am not sure my proposed syntax makes much sense.

sizes are generally parsed with parse_bytes() in systemd, so that K,
M, G suffixes are properly recognized...

> +        ExecContext *c = userdata;
> +        int r;
> +        int i;
> +
> +        assert(bus);
> +        assert(reply);
> +        assert(c);
> +
> +        r = sd_bus_message_open_container(reply, 'a', "s");
> +        if (r < 0)
> +                return r;
> +
> +        for (i = 0; i < c->argmidx; i++) {
> +            r = sd_bus_message_append(reply, "s", c->argm[i]);
> +            if (r < 0)
> +                return r;
> +        }

indentation is borked. Please indent to multiples of 8ch.

> +static char* convert_rdt_back(char *argm[], int argmidx)
> +{

Please follow coding style, place opening bracket on same name as
funciton name.

> +     int i, ret;
> +     char *buf, *str;
> +     int size = 0;
> +     int printcomma = 0;
> +     int printsemicolon = 0;
> +     int localmode = 0;

Please use C99 "bool" when you actually mean a boolean.

> +
> +     for (i=0; i < argmidx; i++) {
> +             /* ',', ';', '=' */
> +             size += strlen(argm[i]) + 3;
> +     }
> +
> +     buf = malloc(size);
> +     if (buf == NULL)
> +             return NULL;
> +     memset(buf, 0, size);

Use "buf = new0(char, size)" for allocations of this kind.

> +
> +     str = buf;
> +
> +     /* cache-id specified */
> +     for (i=0; i < argmidx; i++) {
> +             if (strlen(argm[i]) == 10) {
> +                     if (strcmp(argm[i], "--cache-id") == 0)
> +                             localmode = 1;
> +             }
> +     }

Please use "streq()" instead of strcmp() to compare strings for equality.

> +
> +     for (i=0; i<argmidx; i++) {
> +             char *cur = argm[i];
> +
> +             if (strlen(cur) == 0)
> +                     return buf;
> +
> +             if (strlen(cur) == 6) {
> +                     if (strcmp(cur, "--type") == 0) {
> +                             ret = sprintf(str, "type=");
> +                             str = str + ret;
> +                             printcomma = 1;
> +                             continue;
> +                     }
> +
> +                     if (strcmp(cur, "--size") == 0) {
> +                             ret = sprintf(str, "size=");
> +                             str = str + ret;
> +                             if (localmode == 1)
> +                                     printcomma = 1;
> +                             else
> +                                     printsemicolon = 1;
> +
> +                             continue;
> +                     }
> +             }
> +
> +             if (strlen(cur) == 10) {
> +                     if (strcmp(cur, "--cache-id") == 0) {
> +                             ret = sprintf(str, "cache-id=");
> +                             str = str + ret;
> +                             printsemicolon = 1;
> +                             continue;
> +                     }
> +             }
> +
> +             ret = sprintf(str, "%s", cur);
> +             str = str + ret;
> +
> +             if (i != argmidx - 1) {
> +                     if (printsemicolon) {
> +                             ret = sprintf(str, ";");
> +                             str = str + 1;
> +                     } else if (printcomma) {
> +                             ret = sprintf(str, ",");
> +                             str = str + 1;
> +                     }
> +
> +                     printcomma = 0;
> +                     printsemicolon = 0;
> +             }
> +     }
> +
> +     return buf;
> +}
> +
>  int bus_exec_context_set_transient_property(
>                  Unit *u,
>                  ExecContext *c,
> @@ -1377,6 +1492,46 @@ int bus_exec_context_set_transient_prope
>                  }
>  
>                  return 1;
> +        } else if (streq(name, "RDTCacheReservation")) {
> +
> +                r = sd_bus_message_enter_container(message, 'a', "s");
> +                if (r < 0)
> +                        return r;
> +
> +                while ((r = sd_bus_message_enter_container(message, 'r', 
> "s")) > 0) {
> +                        const char *str;
> +
> +                        r = sd_bus_message_read(message, "s", &str);
> +                        if (r < 0)
> +                                return r;
> +
> +                        if (mode != UNIT_CHECK)
> +                            c->argm[c->argmidx++] = strdup(str);

Indentation borked. Missing OOM check.

> +
> +                        r = sd_bus_message_exit_container(message);
> +                        if (r < 0)
> +                                return r;
> +                }
> +
> +             if (mode != UNIT_CHECK) {
> +                     char *conv;
> +
> +                     conv = convert_rdt_back(c->argm, c->argmidx);
> +                     if (conv == NULL) {
> +                             return sd_bus_error_setf(error,
> +                                     SD_BUS_ERROR_NO_MEMORY,
> +                                     "out of memory for rdt string
> convertion");

you may return -ENOMEM here, the right thing will happen.

> +                     }
> +
> +                     unit_write_drop_in_private_format(u, mode, name, 
> "RDTCacheReservation=%s", conv);
> +                     free(conv);
> +             }

indentation really borked.


> +static int fork_exec_resctrltool(Unit *u, char *argv[])
> +{

coding style... opening bracket on same name as function.

> +        int outpipe[2];
> +        int errpipe[2];
> +        pid_t cpid;
> +
> +        pipe(outpipe);
> +        pipe(errpipe);

missing error checks.

> +        cpid = fork();

Umpff... This is not how we work in systemd. We do not glue things
together form the various system tools we shell out to. Sorry, but
this isn't OK to add this way.

If access to the reservation file system is straight-forward, please
add some glue code for it to src/basic/ or so, so that we can do the
accesses directly. Or, alternatively, prepare a proper C library that
we can link against, maintained as part of resctrltool.

Sorry, but this is a total blocker. Either we do the fs access
oursvles, or we call into a properly designed C library we link
against, but shelling out external tools is not an option.

> +        if(cpid == 0) {
> +                int r;
> +
> +                dup2(errpipe[1], STDERR_FILENO);
> +                dup2(outpipe[1], STDOUT_FILENO);

error checking...

> +
> +                r = execve(argv[0], argv, NULL);
> +                if (r == -1) {
> +                        log_open();
> +                        log_unit_error_errno(u, r, "Failed to execve 
> resctrltool, ignoring: %m");
> +                        log_close();
> +                        return -errno;
> +                }
> +        } else {
> +                char buffer[100];

arbitrarily sized buffers... are urks.  please use LINE_MAX or so at least....

> +                fd_set fdset;
> +                int count;
> +                int ret;
> +                int nfds;
> +                int status = 0;
> +                int retst;
> +
> +                ret = waitpid(cpid, &status, 0);
> +                if (ret == -1) {
> +                        log_open();
> +                        log_unit_error_errno(u, ret, "Failed to waitpid 
> resctrltool, ignoring: %m");
> +                        log_close();
> +                        return -errno;
> +                }
> +
> +                if (!WIFEXITED(status)) {
> +                        log_open();
> +                        log_unit_error_errno(u, ret, "resctrltool exits 
> abnormally, ignoring: %m");
> +                        log_close();
> +                        return -1;
> +                } else {
> +                        retst = WEXITSTATUS(status);
> +
> +                        if (retst == 0) {
> +                             return 0;
> +                        }
> +                }
> +
> +                /*
> +                 * error, read stderr and stdout and log.
> +                 */
> +
> +                FD_ZERO(&fdset);
> +                FD_SET(outpipe[0], &fdset);
> +                FD_SET(errpipe[0], &fdset);
> +
> +                if (outpipe[0] > errpipe[0])
> +                        nfds = outpipe[0] + 1;
> +                else
> +                        nfds = errpipe[0] + 1;
> +
> +                ret = select(nfds, &fdset, NULL, NULL, NULL);

select() is not an option. We regularly have to deal with fds > 1024,
and select() doesn't support those. it's not just slow, it simply
won't work.

Really, select() is obsolete IRL.

> +                if (ret == -1) {
> +                        log_open();
> +                        log_unit_error_errno(u, ret, "select error, ignoring 
> RDTCacheReservation: %m");
> +                        log_close();
> +                        return -1;
> +                }
> +
> +                if (FD_ISSET(outpipe[0], &fdset)) {
> +                        count = read(outpipe[0], buffer, sizeof(buffer)-1);
> +                        if (count >= 0) {
> +                                buffer[count] = 0;
> +                                log_open();
> +                                log_unit_error(u, "resctrltool stdout: %s", 
> buffer);
> +                                log_close();
> +                        } else {
> +                                log_open();
> +                                log_unit_error(u, "resctrltool io error\n");
> +                                log_close();
> +                        }
> +                }
> +
> +                if (FD_ISSET(errpipe[0], &fdset)) {
> +                        count = read(errpipe[0], buffer, sizeof(buffer)-1);
> +                        if (count >= 0) {
> +                                buffer[count] = 0;
> +                                log_open();
> +                                log_unit_error(u, "resctrltool stderr: %s", 
> buffer);
> +                                log_close();
> +                        } else {
> +                                log_open();
> +                                log_unit_error(u, "resctrltool io error\n");
> +                                log_close();
> +                        }
> +                }
> +
> +                return retst;


This will deadlock if the tool prints more than PIPE_BUF output...

> +        }
> +
> +    return -1;

Please do not make up fake error codes such as "-1". We pretty
universally follow kernel style "return -EFOOBAR".

> +}
> +
> +static int setup_rdt_cat(Unit *u, const ExecContext *c, pid_t pid)
> +{
> +        int ret, i;
> +        const char *arg[5];
> +        char pidstr[100];

fixed size arrays... If you only write a pid into this, then use 
DECIMAL_STR_MAX(pid_t)...

> +        char *name = u->id;
> +        const char *largm[ARGMSIZE + 4];
> +
> +        sprintf(pidstr, "%d", pid);

Use PID_FMT for this...

> +
> +        arg[0] = "/usr/bin/resctrltool";
> +        arg[1] = "delete";
> +        arg[2] = name;
> +        arg[3] = 0;
> +
> +        ret = fork_exec_resctrltool(u, (char **)arg);
> +        /* we want to ignore 'reservation does not exist'
> +         * errors, so skip error checking entirely.
> +         * any other errors will be caught when trying
> +         * to execute create below
> +         */
> +
> +        memset(largm, 0, sizeof(largm));

Use memzero()...

Please have a look at CODING_STYLE.

And please: the shelling out is not acceptable. Sorry!

Lennart

-- 
Lennart Poettering, Red Hat
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Reply via email to