Re: [systemd-devel] [PATCH] udev rules: add udev rule to create /dev/ptp_kvm
On Sun, Feb 26, 2017 at 09:52:18PM +0100, Lennart Poettering wrote: > On Thu, 23.02.17 22:20, Marcelo Tosatti (mtosa...@redhat.com) wrote: > > > > > Its necessary to specify the KVM PTP device name in userspace. > > > > In case a network card with PTP device is assigned to the guest, > > it might be the case that KVM PTP gets /dev/ptp0 instead of /dev/ptp1. > > > > Fix a device name for the KVM PTP device. > > What's the symlink precisely good for, can you elaborate? You want to configure Chrony to use PTP in the guest to sync with the host. You need to add a entry to /etc/chrony.conf pointing to "/dev/ptp0", the ptp_kvm device. However, it might be the case that a PCI assigned device has a PTP clock, and it can be registered as "/dev/ptp0" and ptp_kvm as "/dev/ptp1". > Also, what's the benefit of shipping this upstream? Why not ship that > rule with kvm? qemu-kvm package? Sure i can do that, but then all distributions have to do the same with their own packages. Why not do it in a centralized location where everyone benefits? (other virtual devices such as virtio-serial already have aliases upstream). > Lennart > > -- > Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH] udev rules: add udev rule to create /dev/ptp_kvm
Its necessary to specify the KVM PTP device name in userspace. In case a network card with PTP device is assigned to the guest, it might be the case that KVM PTP gets /dev/ptp0 instead of /dev/ptp1. Fix a device name for the KVM PTP device. Signed-off-by: Marcelo Tosatti diff --git a/rules/50-udev-default.rules b/rules/50-udev-default.rules index e9eeb85..3347c8c 100644 --- a/rules/50-udev-default.rules +++ b/rules/50-udev-default.rules @@ -74,4 +74,6 @@ KERNEL=="tun", MODE="0666", OPTIONS+="static_node=net/tun" KERNEL=="fuse", MODE="0666", OPTIONS+="static_node=fuse" +SUBSYSTEM=="ptp", ATTR{clock_name}=="KVM virtual PTP", SYMLINK += "ptp_kvm" + LABEL="default_end" ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)
On Fri, Jan 06, 2017 at 07:51:05PM +0100, Lennart Poettering wrote: > On Fri, 06.01.17 11:59, Marcelo Tosatti (mtosa...@redhat.com) wrote: > > > Cache Allocation Technology is a feature on selected recent Intel Xeon > > processors which allows control over L3 cache allocation. > > What precisely is the benefit of making this configurable? Can you > describe a basic usecase why I'd want to set this on some service? http://www.intel.com/content/www/us/en/communications/cache-allocation-technology-white-paper.html "On systems with multiple workloads running simultaneously, which heavily utilize the cache; system administrators can utilize CAT to improve performance. CAT can be used to better utilize the caches and as a result some applications are more likely to hit the cache which results in fewer accesses to system memory. This results in lower latency and greater performance." For example, with a job with large memory reaccess distances (where caching hardly brings any benefit), you can limit that job to reclaim a small portion of L3 cache (so you don't throw away data from other jobs from L3 cache). > > > Index: systemd/man/systemd.exec.xml > > === > > --- systemd.orig/man/systemd.exec.xml > > +++ systemd/man/systemd.exec.xml > > Hmm, this doesn't look like a git patch? We generally prefer > submissions as github PRs these days, btw. > > > @@ -233,6 +233,31 @@ > > > > > > > > +RDTCacheReservation= > > + > > +Creates a L3 CAT reservation in resctrlfs > > +for the executed processes. Takes a ";" separated list of > > +tuples (optionally triples) containing type,size and > > +optionally cacheid: > > if I read this, then I am puzzled what "RDT Cache" is, what "L3 CAT" > is. Can you explain? RDT: Resource Director Technology(RDT) CAT: Cache Allocation Technology (CAT), a subcomponent of RDT (which includes other things such as CMT: Cache Monitoring Technology). > > > +type=type,size=size,cacheid=id; > > +type=type,size=size,cacheid=id;... > > + > > +Where: > > + > > +type is one of: both, data, code. > > +size is size in kbytes. > > +cacheid (optional) is the cacheid for > > +the reservation. > > what is a "cacheid"? Cache IDs - On current generation systems there is one L3 cache per socket and L2 caches are generally just shared by the hyperthreads on a core, but this isn't an architectural requirement. We could have multiple separate L3 caches on a socket, multiple cores could share an L2 cache. So instead of using "socket" or "core" to define the set of logical cpus sharing a resource we use a "Cache ID". At a given cache level this will be a unique number across the whole system (but it isn't guaranteed to be a contiguous sequence, there may be gaps). To find the ID for each logical CPU look in /sys/devices/system/cpu/cpu*/cache/index*/id > > > + > > +Rules for the parameters: > > +* type=code must be paired with type=data entry. > > + > > +See the output of resctrltool for more details. > > + > > + > > + > > This syntax doesn't appear to be in line with how we'd do this in > systemd usually. Ideally, we try to hide the fact that our settings > syntax is one-dimensional as much as we can. Hence, a syntax like this > sounds more appropriate to me: > >RDTCacheReservation=4K code >RDTCacheReservation=16K data > > and so on... What you suggest is to essentially to split RDTCacheReservation=type=type,size=size,cacheid=id;type=type,size=size,cacheid=id Into RDTCacheReservation=type=type,size=size,cacheid=id RDTCacheReservation=type=type,size=size,cacheid=id (yes sure can get rid of the commas as well, looks better). > But then again, I have no idea what all of this actually really is, > hence I am not sure my proposed syntax makes much sense. > > sizes are generally parsed with parse_bytes() in systemd, so that K, > M, G suffixes are properly recognized... > > > +ExecContext *c = userdata; > > +int r; > > +int i; > > + > > +assert(bus); > > +assert(reply); > > +assert(c); > > + > > +r = sd_bus_message_open_container(reply, 'a', "s"); > > +if (r < 0) > > +return r; > > + > > +for (i = 0; i < c->argmidx; i+
Re: [systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)
GOn Fri, Jan 06, 2017 at 07:54:48PM +0100, Lennart Poettering wrote: > On Fri, 06.01.17 13:51, Marcelo Tosatti (mtosa...@redhat.com) wrote: > > > > This really doesn't look pretty, neither the approach nor the > > > implementation... > > > > Suggestions to improve the code or the approach are welcome. > > > > > Is the option actually so complex that calling resctrltool is the only way > > > to adjust it? What about writing to the resctrlfs directly? > > > > You'll have to deal with the issues that resctrltool deals with, > > namely: > > > > 1) Filesystem locking. > > 2) Reading in every directory and the default > > directory. > > 3) Converting the reservation request to proper sizes. > > 4) Converting: > > type=both --> type=data/type=code > > > > type=data/type=code --> type=both > > > > 4) Finding free space for the reservation. > > 5) Adjusting the default group reservation. > > > > Since this steps must be performed by every user of > > CAT (including libvirt which plans to execute resctrltool > > as well), it was decided its better to maintain this logic > > in a centralized place. > > Centralized place is fine, but please do this via a proper C library, > not some tool to shell out... > > Hmm, if I see that correctly, then resctrltool is a python script? I > am particularly concerned about making systemd indirectly depend on > Python this way, in particular for something that isn't precisely a > core functionality of systemd. Yes its a Python script. Sure, will write a C library for resctrltool-0.2 and repost the patches. Thanks. ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)
On Fri, Jan 06, 2017 at 01:51:11PM -0200, Marcelo Tosatti wrote: > On Fri, Jan 06, 2017 at 05:26:36PM +0200, Mantas Mikulėnas wrote: > > On Fri, Jan 6, 2017 at 3:59 PM, Marcelo Tosatti wrote: > > > > > > > > > > > Cache Allocation Technology is a feature on selected recent Intel Xeon > > > processors which allows control over L3 cache allocation. > > > > > > Kernel support has been merged to the upstream kernel, via a filesystem > > > resctrlfs. > > > > > > On top of that, a userspace utility, resctrltool has been written > > > to facilitate writing applications and using the filesystem > > > interface (see the rationale at > > > http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1300792.html). > > > > > > This patch adds a new option to systemd, RDTCacheReservation, > > > to allow configuration of CAT via resctrltool. > > > > > > See the first hunk of the patch for a description of the option > > > > > > This really doesn't look pretty, neither the approach nor the > > implementation... > > Suggestions to improve the code or the approach are welcome. > > > Is the option actually so complex that calling resctrltool is the only way > > to adjust it? What about writing to the resctrlfs directly? > > You'll have to deal with the issues that resctrltool deals with, > namely: > > 1) Filesystem locking. > 2) Reading in every directory and the default > directory. > 3) Converting the reservation request to proper sizes. > 4) Converting: > type=both --> type=data/type=code > > type=data/type=code --> type=both > > 4) Finding free space for the reservation. > 5) Adjusting the default group reservation. > > Since this steps must be performed by every user of > CAT (including libvirt which plans to execute resctrltool > as well), it was decided its better to maintain this logic > in a centralized place. > > Why do you prefer writing to to resctrlfs directly? > > > Also, it would be nicer to submit the patches as GitHub pull requests > > instead. > > Sure can do that after the reviews... Thanks for your comments! > > > > > > > > > +static char* convert_rdt_back(char *argm[], int argmidx) > > > +{ > > > + int i, ret; > > > + char *buf, *str; > > > + int size = 0; > > > + int printcomma = 0; > > > + int printsemicolon = 0; > > > + int localmode = 0; > > > + > > > + for (i=0; i < argmidx; i++) { > > > + /* ',', ';', '=' */ > > > + size += strlen(argm[i]) + 3; > > > + } > > > + > > > + buf = malloc(size); > > > + if (buf == NULL) > > > + return NULL; > > > + memset(buf, 0, size); > > > + > > > + str = buf; > > > + > > > + /* cache-id specified */ > > > + for (i=0; i < argmidx; i++) { > > > + if (strlen(argm[i]) == 10) { > > > + if (strcmp(argm[i], "--cache-id") == 0) > > > + localmode = 1; > > > + } > > > + } > > > + > > > + for (i=0; i > > + char *cur = argm[i]; > > > + > > > + if (strlen(cur) == 0) > > > + return buf; > > > + > > > + if (strlen(cur) == 6) { > > > + if (strcmp(cur, "--type") == 0) { > > > + ret = sprintf(str, "type="); > > > + str = str + ret; > > > + printcomma = 1; > > > + continue; > > > + } > > > + etc. > > > + } > > > + > > > + return buf; > > > +} > > > + > > > > > > > Converting from foo;bar to argv[] and back is a bit horrible. > > Its a very simple function. > > > It seems to > > me that it would be *much* simpler and easier to maintain if you stored > > only actual parameters in the unit (e.g. as an array of > > {type,size,cache_id}) and only generated argv at the moment it was > > necessary, i.e. right before spawning resctrltool. After all, that's the > > only place you need it. *And* that way you w
Re: [systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)
On Fri, Jan 06, 2017 at 05:26:36PM +0200, Mantas Mikulėnas wrote: > On Fri, Jan 6, 2017 at 3:59 PM, Marcelo Tosatti wrote: > > > > > > > Cache Allocation Technology is a feature on selected recent Intel Xeon > > processors which allows control over L3 cache allocation. > > > > Kernel support has been merged to the upstream kernel, via a filesystem > > resctrlfs. > > > > On top of that, a userspace utility, resctrltool has been written > > to facilitate writing applications and using the filesystem > > interface (see the rationale at > > http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1300792.html). > > > > This patch adds a new option to systemd, RDTCacheReservation, > > to allow configuration of CAT via resctrltool. > > > > See the first hunk of the patch for a description of the option > > > This really doesn't look pretty, neither the approach nor the > implementation... Suggestions to improve the code or the approach are welcome. > Is the option actually so complex that calling resctrltool is the only way > to adjust it? What about writing to the resctrlfs directly? You'll have to deal with the issues that resctrltool deals with, namely: 1) Filesystem locking. 2) Reading in every directory and the default directory. 3) Converting the reservation request to proper sizes. 4) Converting: type=both --> type=data/type=code type=data/type=code --> type=both 4) Finding free space for the reservation. 5) Adjusting the default group reservation. Since this steps must be performed by every user of CAT (including libvirt which plans to execute resctrltool as well), it was decided its better to maintain this logic in a centralized place. Why do you prefer writing to to resctrlfs directly? > Also, it would be nicer to submit the patches as GitHub pull requests > instead. Sure can do that after the reviews... Thanks for your comments! > > > > > +static char* convert_rdt_back(char *argm[], int argmidx) > > +{ > > + int i, ret; > > + char *buf, *str; > > + int size = 0; > > + int printcomma = 0; > > + int printsemicolon = 0; > > + int localmode = 0; > > + > > + for (i=0; i < argmidx; i++) { > > + /* ',', ';', '=' */ > > + size += strlen(argm[i]) + 3; > > + } > > + > > + buf = malloc(size); > > + if (buf == NULL) > > + return NULL; > > + memset(buf, 0, size); > > + > > + str = buf; > > + > > + /* cache-id specified */ > > + for (i=0; i < argmidx; i++) { > > + if (strlen(argm[i]) == 10) { > > + if (strcmp(argm[i], "--cache-id") == 0) > > + localmode = 1; > > + } > > + } > > + > > + for (i=0; i > + char *cur = argm[i]; > > + > > + if (strlen(cur) == 0) > > + return buf; > > + > > + if (strlen(cur) == 6) { > > + if (strcmp(cur, "--type") == 0) { > > + ret = sprintf(str, "type="); > > + str = str + ret; > > + printcomma = 1; > > + continue; > > + } > > + etc. > > + } > > + > > + return buf; > > +} > > + > > > > Converting from foo;bar to argv[] and back is a bit horrible. Its a very simple function. > It seems to > me that it would be *much* simpler and easier to maintain if you stored > only actual parameters in the unit (e.g. as an array of > {type,size,cache_id}) and only generated argv at the moment it was > necessary, i.e. right before spawning resctrltool. After all, that's the > only place you need it. *And* that way you wouldn't need the above function > at all. Honestly: An array of {type,size,cache_id} VS A list of: "--type" "type" "--size" size "--cache-id" id Is essentially the same thing to me. And you would have to convert back to the "type=type,size=size,cache-id=cacheid" format anyway. So i don't see any actual advantage of your suggestion (other than personal preference). But sure, i'll convert it to an array of {type,size,cache_id}. > > Index: systemd/src/core/execute.c > > === &
[systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)
Cache Allocation Technology is a feature on selected recent Intel Xeon processors which allows control over L3 cache allocation. Kernel support has been merged to the upstream kernel, via a filesystem resctrlfs. On top of that, a userspace utility, resctrltool has been written to facilitate writing applications and using the filesystem interface (see the rationale at http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1300792.html). This patch adds a new option to systemd, RDTCacheReservation, to allow configuration of CAT via resctrltool. See the first hunk of the patch for a description of the option. Signed-off-by: Marcelo Tosatti Index: systemd/man/systemd.exec.xml === --- systemd.orig/man/systemd.exec.xml +++ systemd/man/systemd.exec.xml @@ -233,6 +233,31 @@ +RDTCacheReservation= + +Creates a L3 CAT reservation in resctrlfs +for the executed processes. Takes a ";" separated list of +tuples (optionally triples) containing type,size and +optionally cacheid: +type=type,size=size,cacheid=id; +type=type,size=size,cacheid=id;... + +Where: + +type is one of: both, data, code. +size is size in kbytes. +cacheid (optional) is the cacheid for +the reservation. + +Rules for the parameters: +* type=code must be paired with type=data entry. + +See the output of resctrltool for more details. + + + + + IOSchedulingClass= Sets the I/O scheduling class for executed Index: systemd/src/basic/exit-status.c === --- systemd.orig/src/basic/exit-status.c +++ systemd/src/basic/exit-status.c @@ -62,6 +62,9 @@ const char* exit_status_to_string(int st case EXIT_OOM_ADJUST: return "OOM_ADJUST"; +case EXIT_RDT_CACHE_RESERVATION: +return "RDT_CACHE_RESERVATION"; + case EXIT_SIGNAL_MASK: return "SIGNAL_MASK"; Index: systemd/src/basic/exit-status.h === --- systemd.orig/src/basic/exit-status.h +++ systemd/src/basic/exit-status.h @@ -83,6 +83,7 @@ enum { EXIT_CHOWN, EXIT_SMACK_PROCESS_LABEL, EXIT_KEYRING, +EXIT_RDT_CACHE_RESERVATION, }; typedef enum ExitStatusLevel { Index: systemd/src/core/dbus-execute.c === --- systemd.orig/src/core/dbus-execute.c +++ systemd/src/core/dbus-execute.c @@ -119,6 +119,36 @@ static int property_get_oom_score_adjust return sd_bus_message_append(reply, "i", n); } +static int property_get_rdt_cache_reservation( +sd_bus *bus, +const char *path, +const char *interface, +const char *property, +sd_bus_message *reply, +void *userdata, +sd_bus_error *error) { + +ExecContext *c = userdata; +int r; +int i; + +assert(bus); +assert(reply); +assert(c); + +r = sd_bus_message_open_container(reply, 'a', "s"); +if (r < 0) +return r; + +for (i = 0; i < c->argmidx; i++) { +r = sd_bus_message_append(reply, "s", c->argm[i]); +if (r < 0) +return r; +} + +return sd_bus_message_close_container(reply); +} + static int property_get_nice( sd_bus *bus, const char *path, @@ -759,6 +789,7 @@ const sd_bus_vtable bus_exec_vtable[] = SD_BUS_PROPERTY("WorkingDirectory", "s", property_get_working_directory, 0, SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("RootDirectory", "s", NULL, offsetof(ExecContext, root_directory), SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("OOMScoreAdjust", "i", property_get_oom_score_adjust, 0, SD_BUS_VTABLE_PROPERTY_CONST), +SD_BUS_PROPERTY("RDTCacheReservation", "as", property_get_rdt_cache_reservation, 0, SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("Nice", "i", property_get_nice, 0, SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("IOScheduling", "i", property_get_ioprio, 0, SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("CPUSchedulingPolicy", "i", property_get_cpu_sched_policy, 0, SD_BUS_VTABLE_PROPERTY_CONST), @@ -921,6 +952,90 @@ int bus_property_get_exec_command_list( return sd_bus_message_close_container(reply); } +static char*