Re: [systemd-devel] [PATCH] udev rules: add udev rule to create /dev/ptp_kvm

2017-02-28 Thread Marcelo Tosatti
On Sun, Feb 26, 2017 at 09:52:18PM +0100, Lennart Poettering wrote:
> On Thu, 23.02.17 22:20, Marcelo Tosatti (mtosa...@redhat.com) wrote:
> 
> > 
> > Its necessary to specify the KVM PTP device name in userspace.
> > 
> > In case a network card with PTP device is assigned to the guest,
> > it might be the case that KVM PTP gets /dev/ptp0 instead of /dev/ptp1. 
> > 
> > Fix a device name for the KVM PTP device.
> 
> What's the symlink precisely good for, can you elaborate? 

You want to configure Chrony to use PTP in the guest to sync with the
host.

You need to add a entry to /etc/chrony.conf pointing to "/dev/ptp0", 
the ptp_kvm device.

However, it might be the case that a PCI assigned device has a PTP
clock, and it can be registered as "/dev/ptp0" and ptp_kvm as
"/dev/ptp1".

> Also, what's the benefit of shipping this upstream? Why not ship that
> rule with kvm?

qemu-kvm package? Sure i can do that, but then all distributions 
have to do the same with their own packages.

Why not do it in a centralized location where everyone benefits?
(other virtual devices such as virtio-serial already have aliases
upstream).

> Lennart
> 
> -- 
> Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


[systemd-devel] [PATCH] udev rules: add udev rule to create /dev/ptp_kvm

2017-02-23 Thread Marcelo Tosatti

Its necessary to specify the KVM PTP device name in userspace.

In case a network card with PTP device is assigned to the guest,
it might be the case that KVM PTP gets /dev/ptp0 instead of /dev/ptp1. 

Fix a device name for the KVM PTP device.

Signed-off-by: Marcelo Tosatti 

diff --git a/rules/50-udev-default.rules b/rules/50-udev-default.rules
index e9eeb85..3347c8c 100644
--- a/rules/50-udev-default.rules
+++ b/rules/50-udev-default.rules
@@ -74,4 +74,6 @@ KERNEL=="tun", MODE="0666", OPTIONS+="static_node=net/tun"
 
 KERNEL=="fuse", MODE="0666", OPTIONS+="static_node=fuse"
 
+SUBSYSTEM=="ptp", ATTR{clock_name}=="KVM virtual PTP", SYMLINK += "ptp_kvm"
+
 LABEL="default_end"
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)

2017-01-09 Thread Marcelo Tosatti
On Fri, Jan 06, 2017 at 07:51:05PM +0100, Lennart Poettering wrote:
> On Fri, 06.01.17 11:59, Marcelo Tosatti (mtosa...@redhat.com) wrote:
> 
> > Cache Allocation Technology is a feature on selected recent Intel Xeon
> > processors which allows control over L3 cache allocation.
> 
> What precisely is the benefit of making this configurable? Can you
> describe a basic usecase why I'd want to set this on some service?

http://www.intel.com/content/www/us/en/communications/cache-allocation-technology-white-paper.html

"On systems with multiple workloads running simultaneously, which
heavily utilize the cache; system administrators can utilize CAT to
improve performance. CAT can be used to better utilize the caches and
as a result some applications are more likely to hit the cache which
results in fewer accesses to system memory. This results in lower
latency and greater performance."



For example, with a job with large memory reaccess distances (where
caching hardly brings any benefit), you can limit that job to reclaim a
small portion of L3 cache (so you don't throw away data from other jobs
from L3 cache).

> 
> > Index: systemd/man/systemd.exec.xml
> > ===
> > --- systemd.orig/man/systemd.exec.xml
> > +++ systemd/man/systemd.exec.xml
> 
> Hmm, this doesn't look like a git patch? We generally prefer
> submissions as github PRs these days, btw.
> 
> > @@ -233,6 +233,31 @@
> >
> >  
> >
> > +RDTCacheReservation=
> > +
> > +Creates a L3 CAT reservation in resctrlfs
> > +for the executed processes. Takes a ";" separated list of
> > +tuples (optionally triples) containing type,size and
> > +optionally cacheid:
> 
> if I read this, then I am puzzled what "RDT Cache" is, what "L3 CAT"
> is. Can you explain?

RDT: Resource Director Technology(RDT)
CAT: Cache Allocation Technology (CAT), a subcomponent of RDT 
(which includes other things such as CMT: Cache Monitoring Technology).

> 
> > +type=type,size=size,cacheid=id;
> > +type=type,size=size,cacheid=id;...
> > +
> > +Where:
> > +
> > +type is one of: both, data, code.
> > +size is size in kbytes.
> > +cacheid (optional) is the cacheid for
> > +the reservation.
> 
> what is a "cacheid"?

Cache IDs
-
On current generation systems there is one L3 cache per socket and L2
caches are generally just shared by the hyperthreads on a core, but this
isn't an architectural requirement. We could have multiple separate L3
caches on a socket, multiple cores could share an L2 cache. So instead
of using "socket" or "core" to define the set of logical cpus sharing
a resource we use a "Cache ID". At a given cache level this will be a
unique number across the whole system (but it isn't guaranteed to be a
contiguous sequence, there may be gaps).  To find the ID for each
logical
CPU look in /sys/devices/system/cpu/cpu*/cache/index*/id


> 
> > +
> > +Rules for the parameters:
> > +* type=code must be paired with type=data entry.
> > +
> > +See the output of resctrltool for more details.
> > +
> > +
> > +  
> 
> This syntax doesn't appear to be in line with how we'd do this in
> systemd usually. Ideally, we try to hide the fact that our settings
> syntax is one-dimensional as much as we can. Hence, a syntax like this
> sounds more appropriate to me:
> 
>RDTCacheReservation=4K code
>RDTCacheReservation=16K data
> 
> and so on...

What you suggest is to essentially to split

RDTCacheReservation=type=type,size=size,cacheid=id;type=type,size=size,cacheid=id

Into

RDTCacheReservation=type=type,size=size,cacheid=id
RDTCacheReservation=type=type,size=size,cacheid=id

(yes sure can get rid of the commas as well, looks better).

> But then again, I have no idea what all of this actually really is,
> hence I am not sure my proposed syntax makes much sense.
> 
> sizes are generally parsed with parse_bytes() in systemd, so that K,
> M, G suffixes are properly recognized...
> 
> > +ExecContext *c = userdata;
> > +int r;
> > +int i;
> > +
> > +assert(bus);
> > +assert(reply);
> > +assert(c);
> > +
> > +r = sd_bus_message_open_container(reply, 'a', "s");
> > +if (r < 0)
> > +return r;
> > +
> > +for (i = 0; i < c->argmidx; i+

Re: [systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)

2017-01-09 Thread Marcelo Tosatti
GOn Fri, Jan 06, 2017 at 07:54:48PM +0100, Lennart Poettering wrote:
> On Fri, 06.01.17 13:51, Marcelo Tosatti (mtosa...@redhat.com) wrote:
> 
> > > This really doesn't look pretty, neither the approach nor the
> > > implementation...
> > 
> > Suggestions to improve the code or the approach are welcome.
> > 
> > > Is the option actually so complex that calling resctrltool is the only way
> > > to adjust it? What about writing to the resctrlfs directly?
> > 
> > You'll have to deal with the issues that resctrltool deals with,
> > namely:
> > 
> > 1) Filesystem locking.
> > 2) Reading in every directory and the default 
> > directory.
> > 3) Converting the reservation request to proper sizes.
> > 4) Converting:
> > type=both --> type=data/type=code
> > 
> > type=data/type=code --> type=both
> > 
> > 4) Finding free space for the reservation.
> > 5) Adjusting the default group reservation.
> > 
> > Since this steps must be performed by every user of
> > CAT (including libvirt which plans to execute resctrltool
> > as well), it was decided its better to maintain this logic
> > in a centralized place.
> 
> Centralized place is fine, but please do this via a proper C library,
> not some tool to shell out...
> 
> Hmm, if I see that correctly, then resctrltool is a python script? I
> am particularly concerned about making systemd indirectly depend on
> Python this way, in particular for something that isn't precisely a
> core functionality of systemd.

Yes its a Python script. Sure, will write a C library for
resctrltool-0.2 and repost the patches.

Thanks.

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)

2017-01-06 Thread Marcelo Tosatti
On Fri, Jan 06, 2017 at 01:51:11PM -0200, Marcelo Tosatti wrote:
> On Fri, Jan 06, 2017 at 05:26:36PM +0200, Mantas Mikulėnas wrote:
> > On Fri, Jan 6, 2017 at 3:59 PM, Marcelo Tosatti  wrote:
> > 
> > >
> > >
> > > Cache Allocation Technology is a feature on selected recent Intel Xeon
> > > processors which allows control over L3 cache allocation.
> > >
> > > Kernel support has been merged to the upstream kernel, via a filesystem
> > > resctrlfs.
> > >
> > > On top of that, a userspace utility, resctrltool has been written
> > > to facilitate writing applications and using the filesystem
> > > interface (see the rationale at
> > > http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1300792.html).
> > >
> > > This patch adds a new option to systemd, RDTCacheReservation,
> > > to allow configuration of CAT via resctrltool.
> > >
> > > See the first hunk of the patch for a description of the option
> > 
> > 
> > This really doesn't look pretty, neither the approach nor the
> > implementation...
> 
> Suggestions to improve the code or the approach are welcome.
> 
> > Is the option actually so complex that calling resctrltool is the only way
> > to adjust it? What about writing to the resctrlfs directly?
> 
> You'll have to deal with the issues that resctrltool deals with,
> namely:
> 
> 1) Filesystem locking.
> 2) Reading in every directory and the default 
> directory.
> 3) Converting the reservation request to proper sizes.
> 4) Converting:
>   type=both --> type=data/type=code
> 
>   type=data/type=code --> type=both
> 
> 4) Finding free space for the reservation.
> 5) Adjusting the default group reservation.
> 
> Since this steps must be performed by every user of
> CAT (including libvirt which plans to execute resctrltool
> as well), it was decided its better to maintain this logic
> in a centralized place.
> 
> Why do you prefer writing to to resctrlfs directly?
> 
> > Also, it would be nicer to submit the patches as GitHub pull requests
> > instead.
> 
> Sure can do that after the reviews... Thanks for your comments!
> 
> > 
> > >
> > > +static char* convert_rdt_back(char *argm[], int argmidx)
> > > +{
> > > +   int i, ret;
> > > +   char *buf, *str;
> > > +   int size = 0;
> > > +   int printcomma = 0;
> > > +   int printsemicolon = 0;
> > > +   int localmode = 0;
> > > +
> > > +   for (i=0; i < argmidx; i++) {
> > > +   /* ',', ';', '=' */
> > > +   size += strlen(argm[i]) + 3;
> > > +   }
> > > +
> > > +   buf = malloc(size);
> > > +   if (buf == NULL)
> > > +   return NULL;
> > > +   memset(buf, 0, size);
> > > +
> > > +   str = buf;
> > > +
> > > +   /* cache-id specified */
> > > +   for (i=0; i < argmidx; i++) {
> > > +   if (strlen(argm[i]) == 10) {
> > > +   if (strcmp(argm[i], "--cache-id") == 0)
> > > +   localmode = 1;
> > > +   }
> > > +   }
> > > +
> > > +   for (i=0; i > > +   char *cur = argm[i];
> > > +
> > > +   if (strlen(cur) == 0)
> > > +   return buf;
> > > +
> > > +   if (strlen(cur) == 6) {
> > > +   if (strcmp(cur, "--type") == 0) {
> > > +   ret = sprintf(str, "type=");
> > > +   str = str + ret;
> > > +   printcomma = 1;
> > > +   continue;
> > > +   }
> > > + etc.
> > > +   }
> > > +
> > > +   return buf;
> > > +}
> > > +
> > >
> > 
> > Converting from foo;bar to argv[] and back is a bit horrible.
> 
> Its a very simple function.
> 
> >  It seems to
> > me that it would be *much* simpler and easier to maintain if you stored
> > only actual parameters in the unit (e.g. as an array of
> > {type,size,cache_id}) and only generated argv at the moment it was
> > necessary, i.e. right before spawning resctrltool. After all, that's the
> > only place you need it. *And* that way you w

Re: [systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)

2017-01-06 Thread Marcelo Tosatti
On Fri, Jan 06, 2017 at 05:26:36PM +0200, Mantas Mikulėnas wrote:
> On Fri, Jan 6, 2017 at 3:59 PM, Marcelo Tosatti  wrote:
> 
> >
> >
> > Cache Allocation Technology is a feature on selected recent Intel Xeon
> > processors which allows control over L3 cache allocation.
> >
> > Kernel support has been merged to the upstream kernel, via a filesystem
> > resctrlfs.
> >
> > On top of that, a userspace utility, resctrltool has been written
> > to facilitate writing applications and using the filesystem
> > interface (see the rationale at
> > http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1300792.html).
> >
> > This patch adds a new option to systemd, RDTCacheReservation,
> > to allow configuration of CAT via resctrltool.
> >
> > See the first hunk of the patch for a description of the option
> 
> 
> This really doesn't look pretty, neither the approach nor the
> implementation...

Suggestions to improve the code or the approach are welcome.

> Is the option actually so complex that calling resctrltool is the only way
> to adjust it? What about writing to the resctrlfs directly?

You'll have to deal with the issues that resctrltool deals with,
namely:

1) Filesystem locking.
2) Reading in every directory and the default 
directory.
3) Converting the reservation request to proper sizes.
4) Converting:
type=both --> type=data/type=code

type=data/type=code --> type=both

4) Finding free space for the reservation.
5) Adjusting the default group reservation.

Since this steps must be performed by every user of
CAT (including libvirt which plans to execute resctrltool
as well), it was decided its better to maintain this logic
in a centralized place.

Why do you prefer writing to to resctrlfs directly?

> Also, it would be nicer to submit the patches as GitHub pull requests
> instead.

Sure can do that after the reviews... Thanks for your comments!

> 
> >
> > +static char* convert_rdt_back(char *argm[], int argmidx)
> > +{
> > +   int i, ret;
> > +   char *buf, *str;
> > +   int size = 0;
> > +   int printcomma = 0;
> > +   int printsemicolon = 0;
> > +   int localmode = 0;
> > +
> > +   for (i=0; i < argmidx; i++) {
> > +   /* ',', ';', '=' */
> > +   size += strlen(argm[i]) + 3;
> > +   }
> > +
> > +   buf = malloc(size);
> > +   if (buf == NULL)
> > +   return NULL;
> > +   memset(buf, 0, size);
> > +
> > +   str = buf;
> > +
> > +   /* cache-id specified */
> > +   for (i=0; i < argmidx; i++) {
> > +   if (strlen(argm[i]) == 10) {
> > +   if (strcmp(argm[i], "--cache-id") == 0)
> > +   localmode = 1;
> > +   }
> > +   }
> > +
> > +   for (i=0; i > +   char *cur = argm[i];
> > +
> > +   if (strlen(cur) == 0)
> > +   return buf;
> > +
> > +   if (strlen(cur) == 6) {
> > +   if (strcmp(cur, "--type") == 0) {
> > +   ret = sprintf(str, "type=");
> > +   str = str + ret;
> > +   printcomma = 1;
> > +   continue;
> > +   }
> > + etc.
> > +   }
> > +
> > +   return buf;
> > +}
> > +
> >
> 
> Converting from foo;bar to argv[] and back is a bit horrible.

Its a very simple function.

>  It seems to
> me that it would be *much* simpler and easier to maintain if you stored
> only actual parameters in the unit (e.g. as an array of
> {type,size,cache_id}) and only generated argv at the moment it was
> necessary, i.e. right before spawning resctrltool. After all, that's the
> only place you need it. *And* that way you wouldn't need the above function
> at all.

Honestly:

An array of {type,size,cache_id}

VS

A list of:

"--type" "type" "--size" size "--cache-id" id

Is essentially the same thing to me.

And you would have to convert back to the

"type=type,size=size,cache-id=cacheid" 

format anyway.

So i don't see any actual advantage of your suggestion
(other than personal preference).

But sure, i'll convert it to an array of 

{type,size,cache_id}.

> > Index: systemd/src/core/execute.c
> > ===
&

[systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)

2017-01-06 Thread Marcelo Tosatti


Cache Allocation Technology is a feature on selected recent Intel Xeon
processors which allows control over L3 cache allocation.

Kernel support has been merged to the upstream kernel, via a filesystem
resctrlfs.

On top of that, a userspace utility, resctrltool has been written 
to facilitate writing applications and using the filesystem
interface (see the rationale at 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1300792.html).

This patch adds a new option to systemd, RDTCacheReservation,
to allow configuration of CAT via resctrltool.

See the first hunk of the patch for a description of the option.


Signed-off-by: Marcelo Tosatti 


Index: systemd/man/systemd.exec.xml
===
--- systemd.orig/man/systemd.exec.xml
+++ systemd/man/systemd.exec.xml
@@ -233,6 +233,31 @@
   
 
   
+RDTCacheReservation=
+
+Creates a L3 CAT reservation in resctrlfs
+for the executed processes. Takes a ";" separated list of
+tuples (optionally triples) containing type,size and
+optionally cacheid:
+type=type,size=size,cacheid=id;
+type=type,size=size,cacheid=id;...
+
+Where:
+
+type is one of: both, data, code.
+size is size in kbytes.
+cacheid (optional) is the cacheid for
+the reservation.
+
+Rules for the parameters:
+* type=code must be paired with type=data entry.
+
+See the output of resctrltool for more details.
+
+
+  
+
+  
 IOSchedulingClass=
 
 Sets the I/O scheduling class for executed
Index: systemd/src/basic/exit-status.c
===
--- systemd.orig/src/basic/exit-status.c
+++ systemd/src/basic/exit-status.c
@@ -62,6 +62,9 @@ const char* exit_status_to_string(int st
 case EXIT_OOM_ADJUST:
 return "OOM_ADJUST";
 
+case EXIT_RDT_CACHE_RESERVATION:
+return "RDT_CACHE_RESERVATION";
+
 case EXIT_SIGNAL_MASK:
 return "SIGNAL_MASK";
 
Index: systemd/src/basic/exit-status.h
===
--- systemd.orig/src/basic/exit-status.h
+++ systemd/src/basic/exit-status.h
@@ -83,6 +83,7 @@ enum {
 EXIT_CHOWN,
 EXIT_SMACK_PROCESS_LABEL,
 EXIT_KEYRING,
+EXIT_RDT_CACHE_RESERVATION,
 };
 
 typedef enum ExitStatusLevel {
Index: systemd/src/core/dbus-execute.c
===
--- systemd.orig/src/core/dbus-execute.c
+++ systemd/src/core/dbus-execute.c
@@ -119,6 +119,36 @@ static int property_get_oom_score_adjust
 return sd_bus_message_append(reply, "i", n);
 }
 
+static int property_get_rdt_cache_reservation(
+sd_bus *bus,
+const char *path,
+const char *interface,
+const char *property,
+sd_bus_message *reply,
+void *userdata,
+sd_bus_error *error) {
+
+ExecContext *c = userdata;
+int r;
+int i;
+
+assert(bus);
+assert(reply);
+assert(c);
+
+r = sd_bus_message_open_container(reply, 'a', "s");
+if (r < 0)
+return r;
+
+for (i = 0; i < c->argmidx; i++) {
+r = sd_bus_message_append(reply, "s", c->argm[i]);
+if (r < 0)
+return r;
+}
+
+return sd_bus_message_close_container(reply);
+}
+
 static int property_get_nice(
 sd_bus *bus,
 const char *path,
@@ -759,6 +789,7 @@ const sd_bus_vtable bus_exec_vtable[] =
 SD_BUS_PROPERTY("WorkingDirectory", "s", 
property_get_working_directory, 0, SD_BUS_VTABLE_PROPERTY_CONST),
 SD_BUS_PROPERTY("RootDirectory", "s", NULL, offsetof(ExecContext, 
root_directory), SD_BUS_VTABLE_PROPERTY_CONST),
 SD_BUS_PROPERTY("OOMScoreAdjust", "i", property_get_oom_score_adjust, 
0, SD_BUS_VTABLE_PROPERTY_CONST),
+SD_BUS_PROPERTY("RDTCacheReservation", "as", 
property_get_rdt_cache_reservation, 0, SD_BUS_VTABLE_PROPERTY_CONST),
 SD_BUS_PROPERTY("Nice", "i", property_get_nice, 0, 
SD_BUS_VTABLE_PROPERTY_CONST),
 SD_BUS_PROPERTY("IOScheduling", "i", property_get_ioprio, 0, 
SD_BUS_VTABLE_PROPERTY_CONST),
 SD_BUS_PROPERTY("CPUSchedulingPolicy", "i", 
property_get_cpu_sched_policy, 0, SD_BUS_VTABLE_PROPERTY_CONST),
@@ -921,6 +952,90 @@ int bus_property_get_exec_command_list(
 return sd_bus_message_close_container(reply);
 }
 
+static char*