Re: [systemd-devel] [PATCH] cgroup: After MemoryAccounting=yes running scope has no memusage

2014-04-09 Thread David Timothy Strauss
+1 from me. Seems like a good bugfix.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Move use of locale_t to different shared file, so that udev can still be built without it (MinimalBuild)

2014-04-09 Thread Samuli Suominen

On 08/04/14 19:16, Cristian Rodríguez wrote:
 On 08/04/14 03:04, Samuli Suominen wrote:

 This is the *only* patch we are carrying for udev currently, otherwise
 uClibc builds work fine, so please at least consider what
 I just said.

 All this locale_t thing is standardized in POSIX 2008; it is up to
 the particular libc to keep up with standards.

 systemd requires glibc, this is known and clear since the beginning.

systemd-udevd doesn't, albeit there is secure_getenv, but it's currently
used only for logging, and note that there is still sysv support in the
source tree.



Re: [systemd-devel] Netconsole NG

2014-04-09 Thread poma
On 08.04.2014 23:27, poma wrote:
 On 08.04.2014 14:25, Tom Gundersen wrote:
 On Tue, Apr 8, 2014 at 2:10 PM, poma pomidorabelis...@gmail.com wrote:
 On 08.04.2014 04:03, poma wrote:
 On 07.04.2014 19:55, Zbigniew Jędrzejewski-Szmek wrote:
 On Mon, Apr 07, 2014 at 05:34:10PM +0200, Lukáš Nykrýn wrote:
 The reason why this was not rewritten a long time ago is that the
 initscript tries to figure some of those values by itself (for
 example the MAC address). But yes, we need to do something with
 netconsole. It is a blocker for my initscripts evil plan.
 Doesn't netconsole figure out most of those settings by itself, and
 others default to sensible values, so if the network card is up
 and has an address configured, only the device and target ip must be 
 given?

 On 6.4.2014 17:59, poma wrote:

 /etc/sysconfig/netconsole:
 # This is the EnvironmentFile for the netconsole service. Starting this
 # service enables the capture of dmesg output on a destination machine.

 # Source port
 SRC_PORT=12345
 This should default to empty... Kernel will pick something.


 # Source IP address
 SRC_IP=192.168.1.2
 This should default to empty... Kernel will use configured address,
 since we order after network.target anyway.

 # Source network device
 SRC_DEV=enp1s2
 Maybe this can be made into an instance argument?

 # Destination port
 DST_PORT=12345
 I think this should default to 514/syslog, and can be left unset.

 # Destination IP address
 DST_IP=192.168.1.1

 # Destination ethernet address
 DST_EHA=00:11:22:33:44:55
 This should default to unset. The kernel will query it if not set.

 /usr/lib/systemd/system/netconsole.service:
 [Unit]
 Description=Adds the netconsole module with the configured parameters
 After=network.target

 [Service]
 EnvironmentFile=/etc/sysconfig/netconsole
 This is Fedora/RH specific. But I don't know what the proper path should
 be, so maybe this is OK for now.

 Type=simple
 This is the default... No need to specify.

 RemainAfterExit=yes
 ExecStart=/usr/sbin/modprobe netconsole
 This should be /sbin/modprobe for compatibility with split root.

 netconsole=${SRC_PORT}@${SRC_IP}/${SRC_DEV},${DST_PORT}@${DST_IP}/${DST_EHA}
 ExecStop=/usr/sbin/modprobe -r netconsole
 Ditto.


 [Install]
 WantedBy=multi-user.target
 That's really late... But I don't see a better place, unfortunately.

 The original SysV netconsole service with related config - still in use,
 https://git.fedorahosted.org/cgit/initscripts.git/plain/rc.d/init.d/netconsole
 https://git.fedorahosted.org/cgit/initscripts.git/plain/sysconfig/netconsole

 Feel free to comment.
 Looks like an improvement on status quo.

 Zbyszek


 Shall we still leave something for users to configure?
 Thanks for your review.


 $ cat /usr/lib/systemd/system/netconsole-zbyszek.service
 [Unit]
 Description=Adds the netconsole module with the configured parameters
 After=network.target

 [Service]
 EnvironmentFile=/etc/sysconfig/netconsole-zbyszek
 RemainAfterExit=yes
 ExecStart=/sbin/modprobe netconsole netconsole=@/${SRC_DEV},@${DST_IP}/
 ExecStop=/sbin/modprobe -r netconsole

 [Install]
 WantedBy=multi-user.target

 ~~~

 $ cat /etc/sysconfig/netconsole-zbyszek
 # This is the EnvironmentFile for the netconsole service. Starting this
 # service enables the capture of dmesg output on a destination machine.

 # Source network device
 SRC_DEV=enp1s2

 # Destination IP address
 DST_IP=192.168.1.1

 ~~~

 $ dmesg | grep netcon
 [   24.361611] netpoll: netconsole: local port 6665
 [   24.361764] netpoll: netconsole: local IPv4 address 0.0.0.0
 [   24.361893] netpoll: netconsole: interface 'enp1s2'
 [   24.362061] netpoll: netconsole: remote port 
 [   24.362344] netpoll: netconsole: remote IPv4 address 192.168.1.1
 [   24.362635] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
 [   24.362909] netpoll: netconsole: no IP address for enp1s2, aborting
 [   24.363186] netconsole: cleaning up

 ~
 This turns out to be a bare minimum, i.e.
 # modprobe netconsole netconsole=@/enp1s2,@192.168.1.1/
 but that is also squeeze breeze ...
 ~
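The parameter string used in these unit files follows the kernel's netconsole syntax, netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr], where only the target IP is mandatory. As a minimal sketch of how the optional fields collapse to the "bare minimum" form above, assuming a hypothetical helper named format_netconsole_param() that is not part of the kernel or of any unit file discussed here:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only. Formats the netconsole=
 * module parameter. A NULL field is emitted as an empty string, which
 * leaves the choice of a default to the kernel. Returns 0 on success,
 * -1 if dst_ip is missing or the buffer is too small.
 */
static int format_netconsole_param(char *out, size_t out_len,
                                   const char *src_port, const char *src_ip,
                                   const char *src_dev,
                                   const char *dst_port, const char *dst_ip,
                                   const char *dst_mac)
{
    int n;

    assert(out != NULL);

    /* The target IP is the one field the kernel syntax marks as required. */
    if (dst_ip == NULL || dst_ip[0] == '\0')
        return -1;

    n = snprintf(out, out_len, "netconsole=%s@%s/%s,%s@%s/%s",
                 src_port ? src_port : "",
                 src_ip ? src_ip : "",
                 src_dev ? src_dev : "",
                 dst_port ? dst_port : "",
                 dst_ip,
                 dst_mac ? dst_mac : "");
    if (n < 0 || (size_t)n >= out_len)
        return -1;

    return 0;
}
```

Passing only the device and the target IP yields the same string as the modprobe line above. Note that the dmesg output shows this minimal form still aborting when the interface has no address yet, so an empty source IP only helps once the network is configured.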

 $ cat /usr/lib/systemd/system/netconsole-poma.service
 [Unit]
 Description=Adds the netconsole module with the configured parameters
 After=network.target

 [Service]
 EnvironmentFile=/etc/sysconfig/netconsole-poma
 RemainAfterExit=yes
 ExecStart=/sbin/modprobe netconsole netconsole=@${SRC_IP}/${SRC_DEV},@${DST_IP}/
 ExecStop=/sbin/modprobe -r netconsole

 [Install]
 WantedBy=multi-user.target

 

 $ cat /etc/sysconfig/netconsole-poma
 # This is the EnvironmentFile for the netconsole service. Starting this
 # service enables the capture of dmesg output on a destination machine.

 # Source IP address
 SRC_IP=192.168.1.2

 # Source network 

Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts

2014-04-09 Thread Karel Zak
On Fri, Apr 04, 2014 at 05:30:03PM -0400, Vivek Goyal wrote:
 What happens if nofail is specified and device is present and there
 are file system errors. Will fsck continue with boot or drop user into
 a shell during boot and force to fix file system failures?

fsck cares about nofail option only if the device does not exist --
it's evaluated before FS check. 

Note that fsck(8) itself does not check filesystems, and the fsck.type
helpers do not have a clue about nofail at all.

Karel

-- 
 Karel Zak  k...@redhat.com
 http://karelzak.blogspot.com


Re: [systemd-devel] [PATCH] Document CONFIG_NET_NS as a required kernel option

2014-04-09 Thread Tom Gundersen
On Mon, Mar 31, 2014 at 8:28 PM, Mike Gilbert flop...@gentoo.org wrote:
 Several units now utilize the PrivateNetwork parameter, which requires
 network namespace support.

Applied. Thanks!

Cheers,

Tom


Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts

2014-04-09 Thread WANG Chao
On 04/08/14 at 06:02pm, Vivek Goyal wrote:
 On Tue, Apr 08, 2014 at 02:14:33AM +0200, Zbigniew Jędrzejewski-Szmek wrote:
 
 [..]
 Defining a new target which by default waits for all the local fs target
 sounds interesting. Again, I have the question, what will happen to
 local-fs-all.target if some device does not show up and say one of the
 mounts specified in /etc/fstab fails.
  The result is different for Requires= and for Wants=. Iff there's a chain
  of Requires= from the failing unit (.device in this case) to the target unit
  it will fail. Otherwise, it'll just be delayed. If, as I suggested above,
  local-fs-all.target
  would have Requires= on the .mount units, then your unit could still have
  Wants=/After=local-fs-all.target, and it'll be started even if some mounts
  fail.
 
 Thanks now I understand the difference between Requires= and Wants=
 better.
 
  
 What we want is.
 
 - Wait for all devices to show up as specified in /etc/fstab. Run fsck
   on devices. Mount devices to mount points specified.
 
 - If everything is successful, things are fine and local-fs-all.target
   will be reached.
 
 - If some device does not show up, or if fsck fails or mount fails, still
   local-fs-all.target should be reached so that the kdump module can detect that
   a failure happened and can take alternative action.
  Alternatively, you can specify a soft dependency on local-fs-all.target by
  using Wants=local-fs-all.target. I think this is preferable, because we want
  local-fs-all.target to be as similar as possible to local-fs.target, which
  has Requires= on the mount points.
  
  With this caveat, this should all be satisfied with my proposal.
 
 Agreed. We could define Wants=local-fs-all.target and that would make
 sure that our unit will be started even if local-fs-all.target fails.
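The rule stated above (a unit fails only if a chain of Requires= leads to a failed unit, while Wants= never propagates failure) can be modelled in a few lines. This is a toy model under stated assumptions, not systemd's job engine: the types and the function unit_fails() are hypothetical, ordering (After=) and timeouts are ignored, and the dependency graph is assumed to be acyclic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum dep_kind { DEP_WANTS, DEP_REQUIRES };

struct unit;

/* One outgoing dependency edge of a unit. */
struct dep {
    enum dep_kind kind;
    const struct unit *on;
};

/* Hypothetical, heavily simplified unit description. */
struct unit {
    const char *name;
    bool failed_itself;      /* e.g. the device never showed up */
    const struct dep *deps;
    size_t n_deps;
};

/*
 * A unit fails if it failed on its own, or if any unit it pulls in via
 * Requires= fails. Failures behind Wants= edges are not propagated.
 * Assumes the graph has no cycles.
 */
static bool unit_fails(const struct unit *u)
{
    size_t i;

    assert(u != NULL);

    if (u->failed_itself)
        return true;

    for (i = 0; i < u->n_deps; i++)
        if (u->deps[i].kind == DEP_REQUIRES && unit_fails(u->deps[i].on))
            return true;

    return false;
}
```

With a failed .device unit, a .mount that Requires= it and a target that Requires= the mount, the target fails in this model, while a service that only Wants= the target does not. That matches the behaviour described in the thread.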
 
  
You can use OnFailure= to define unit(s) started when
local-fs-all.target fails. But it sounds like you are not really
interested in *all* filesystems, but in specific filesystems defined in
kdump configuration.
   
   Kdump scripts register with dracut as a pre-pivot hook. And I believe
   that in initramfs environments /etc/fstab does not contain all
   filesystems. It primarily contains root and any file system specified
   on the dracut command line using the --mount option during initramfs generation.
   
   So my understanding is that, given the fact that /etc/fstab is minimal in
   the initramfs, we should be fine waiting for all the filesystems specified.
   
   Given the fact that we run under the dracut pre-pivot hook callback, I think
   dracut-pre-pivot.service will have to create a dependency to run after
   local-fs-all.target is reached.
  Hm, maybe. It would be good to get some input from Harald here.
  This is pretty specialized, so maybe it'd be better to have a separate unit
  positioned before or after or parallel to dracut-pre-pivot.service.
 
 I am just thinking out loud now. Taking a step back and going back to
 figure out why we introduced nofail to begin with.
 
 If I go through the kexec-tools logs, it says nofail was introduced because
 otherwise we never reach initrd.target. I am wondering why that's the
 case. The current initrd.target seems to have the following.
 
 [Unit]
 Description=Initrd Target
 Requires=basic.target
 Conflicts=rescue.service rescue.target
 After=basic.target rescue.service rescue.target
 AllowIsolate=yes
 OnFailure=emergency.target
 OnFailureIsolate=yes
 ConditionPathExists=/etc/initrd-release

dracut doesn't use this initrd.target. It uses the stock one from
systemd:

[Unit]
Description=Initrd Default Target
Documentation=man:systemd.special(7)
OnFailure=emergency.target
OnFailureIsolate=yes
ConditionPathExists=/etc/initrd-release
Requires=basic.target
Wants=initrd-root-fs.target initrd-fs.target initrd-parse-etc.service
After=initrd-root-fs.target initrd-fs.target basic.target rescue.service rescue.target
AllowIsolate=yes

In the sysroot.mount context, if we don't use nofail, then in case of a root disk
failure we will never reach initrd-root-fs.target, hence we never
reach initrd.target, and dracut-pre-pivot.service never gets a chance to
start.

 
 So it Requires=basic.target. Now let us say basic.target fails, then
 I am assuming emergency.target will be activated. And if we hook into
 emergency-shell binary and make it run a registered error handler if
 it is available, then kdump can drop its handler and take action on
 failure.
 
 IOW, what if we stop passing nofail. Then local-fs.target practically
 becomes local-fs-all.target. Either services will start just fine (after
 a wait for devices to show up). Or units will start failing and if boot
 can't continue then somewhere we will fall into the emergency shell and
 then the emergency shell will call into the kdump handler.
 
 This is assuming that we have designed the boot path in such a way that
 most of the time we will not do an infinite wait (unless the user
 asked us to do so).

[systemd-devel] [KDBUS PATCH] remove unused variable

2014-04-09 Thread Daniel Buch
---
 connection.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/connection.c b/connection.c
index 2d69f17..5e7d553 100644
--- a/connection.c
+++ b/connection.c
@@ -1024,7 +1024,7 @@ int kdbus_cmd_msg_recv(struct kdbus_conn *conn,
 
 	/* just drop the message */
 	if (recv->flags & KDBUS_RECV_DROP) {
-		struct kdbus_conn_reply *r, *reply = NULL;
+		struct kdbus_conn_reply *reply = NULL;
 		bool reply_found = false;
 
 		if (queue->reply) {
-- 
1.9.1



Re: [systemd-devel] [KDBUS PATCH] remove unused variable

2014-04-09 Thread Daniel Mack
On 04/09/2014 11:43 AM, Daniel Buch wrote:
 ---
  connection.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/connection.c b/connection.c
 index 2d69f17..5e7d553 100644
 --- a/connection.c
 +++ b/connection.c
 @@ -1024,7 +1024,7 @@ int kdbus_cmd_msg_recv(struct kdbus_conn *conn,
  
   /* just drop the message */
   if (recv->flags & KDBUS_RECV_DROP) {
 - struct kdbus_conn_reply *r, *reply = NULL;
 + struct kdbus_conn_reply *reply = NULL;
   bool reply_found = false;
  
   if (queue->reply) {
 

Oops, I just realized that I forgot to push my own version of that patch,
which I have had locally for some days. Did that just now.

Sorry, but thanks for your submission!


Daniel



Re: [systemd-devel] [KDBUS PATCH] remove unused variable

2014-04-09 Thread Daniel Buch
No problem :)


2014-04-09 11:46 GMT+02:00 Daniel Mack dan...@zonque.org:

 On 04/09/2014 11:43 AM, Daniel Buch wrote:
  ---
   connection.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/connection.c b/connection.c
  index 2d69f17..5e7d553 100644
  --- a/connection.c
  +++ b/connection.c
  @@ -1024,7 +1024,7 @@ int kdbus_cmd_msg_recv(struct kdbus_conn *conn,
 
/* just drop the message */
if (recv->flags & KDBUS_RECV_DROP) {
  - struct kdbus_conn_reply *r, *reply = NULL;
  + struct kdbus_conn_reply *reply = NULL;
bool reply_found = false;
 
if (queue->reply) {
 

 Oops, I just realized that I forgot to push my own version of that patch,
 which I have had locally for some days. Did that just now.

 Sorry, but thanks for your submission!


 Daniel




[systemd-devel] [PATCH] metadata: reflect change in task_cgroup_name

2014-04-09 Thread Hristo Venev
Change: e61734c55c24cdf11b07e52a74aec4dc4a7f4bd0.
Merged: dc5ed40686a4da95881c35d913b60f867755cbe2 in 3.15-rc1.

task_cgroup_name returns a pointer to the path or NULL if there is not
enough space in the buffer (used to return nonnegative or -ENAMETOOLONG).

On systemd systems, this fixes a kernel panic caused by init dying while
opening a bus. Now it boots properly.
---
 metadata.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/metadata.c b/metadata.c
index 5b47bb2..0620cce 100644
--- a/metadata.c
+++ b/metadata.c
@@ -335,18 +335,20 @@ static int kdbus_meta_append_caps(struct kdbus_meta *meta)
 #ifdef CONFIG_CGROUPS
 static int kdbus_meta_append_cgroup(struct kdbus_meta *meta)
 {
-   char *tmp;
-   int ret;
+   char *buf, *path;
+int ret;
 
-   tmp = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
-   if (!tmp)
+   buf = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
+   if (!buf)
return -ENOMEM;
 
-   ret = task_cgroup_path(current, tmp, PAGE_SIZE);
-   if (ret >= 0)
-   ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, tmp);
+   path = task_cgroup_path(current, buf, PAGE_SIZE);
+   if (path)
+   ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, path);
+else
+ret = -ENAMETOOLONG;
 
-   free_page((unsigned long) tmp);
+   free_page((unsigned long) buf);
 
return ret;
 }
-- 
1.9.1



Re: [systemd-devel] [PATCH] metadata: reflect change in task_cgroup_name

2014-04-09 Thread Daniel Mack
On 04/09/2014 02:16 PM, Hristo Venev wrote:
 Change: e61734c55c24cdf11b07e52a74aec4dc4a7f4bd0.
 Merged: dc5ed40686a4da95881c35d913b60f867755cbe2 in 3.15-rc1.
 
 task_cgroup_name returns a pointer to the path or NULL if there is not
 enough space in the buffer (used to return nonnegative or -ENAMETOOLONG).

[...]

 - ret = task_cgroup_path(current, tmp, PAGE_SIZE);
 - if (ret >= 0)
 - ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, tmp);
 + path = task_cgroup_path(current, buf, PAGE_SIZE);

Eh. Thanks for spotting this. However, I think we should have a compat
workaround for 3.14, for at least a couple of weeks. We can drop it
after that. Could you amend your patch for that?

Apart from that, please take care to follow the kernel CodingStyle. In
particular, we use tabs for indentation, not spaces.


Thanks,
Daniel



[systemd-devel] [PATCH] fsck: Search for fsck.type in PATH

2014-04-09 Thread Mike Gilbert
Matches default behavior in recent util-linux.
---
 src/fsck/fsck.c| 6 --
 src/shared/generator.c | 6 --
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/fsck/fsck.c b/src/fsck/fsck.c
index 18f2aca..24c8890 100644
--- a/src/fsck/fsck.c
+++ b/src/fsck/fsck.c
@@ -36,6 +36,7 @@
 #include "bus-error.h"
 #include "bus-errors.h"
 #include "fileio.h"
+#include "path-util.h"
 #include "udev-util.h"
 
 static bool arg_skip = false;
@@ -285,8 +286,9 @@ int main(int argc, char *argv[]) {
 
         type = udev_device_get_property_value(udev_device, "ID_FS_TYPE");
         if (type) {
-                const char *checker = strappenda("/sbin/fsck.", type);
-                r = access(checker, X_OK);
+                const char *checker = strappenda("fsck.", type);
+                _cleanup_free_ char *command = NULL;
+                r = find_binary(checker, &command);
                 if (r < 0) {
                         if (errno == ENOENT) {
                                 log_info("%s doesn't exist, not checking file system.", checker);
diff --git a/src/shared/generator.c b/src/shared/generator.c
index 6110303..6f4eaae 100644
--- a/src/shared/generator.c
+++ b/src/shared/generator.c
@@ -24,6 +24,7 @@
 #include "util.h"
 #include "special.h"
 #include "mkdir.h"
+#include "path-util.h"
 #include "unit-name.h"
 #include "generator.h"
 
@@ -46,10 +47,11 @@ int generator_write_fsck_deps(
 
         if (!isempty(fstype) && !streq(fstype, "auto")) {
                 const char *checker;
+                _cleanup_free_ char *command = NULL;
                 int r;
 
-                checker = strappenda("/sbin/fsck.", fstype);
-                r = access(checker, X_OK);
+                checker = strappenda("fsck.", fstype);
+                r = find_binary(checker, &command);
                 if (r < 0) {
                         log_warning("Checking was requested for %s, but %s cannot be used: %m", what, checker);
 
-- 
1.9.1
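
For readers unfamiliar with the idea behind this patch, the following is a minimal sketch of a PATH-style lookup for an fsck.type helper. It is not systemd's find_binary(): the function find_in_path() and its signature are hypothetical, and it takes the search path as an explicit argument instead of reading the environment.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a PATH lookup. Tries each directory of the
 * colon-separated path_list in order and checks whether dir/name is
 * executable. On success, the full path is left in out and 0 is
 * returned. If no candidate matches, returns -ENOENT.
 */
static int find_in_path(const char *name, const char *path_list,
                        char *out, size_t out_len)
{
    const char *p = path_list;

    assert(name != NULL && path_list != NULL && out != NULL);

    while (*p != '\0') {
        size_t dir_len = strcspn(p, ":");

        if (dir_len > 0) {
            int n = snprintf(out, out_len, "%.*s/%s", (int)dir_len, p, name);

            /* Skip candidates that were truncated by the buffer. */
            if (n > 0 && (size_t)n < out_len && access(out, X_OK) == 0)
                return 0;
        }

        p += dir_len;
        if (*p == ':')
            p++;
    }

    if (out_len > 0)
        out[0] = '\0';
    return -ENOENT;
}
```

Compared with the hardcoded /sbin/fsck.type check being removed, a lookup like this finds helpers installed under /usr/sbin or another prefix. Note that access(X_OK) also succeeds for directories, so a production version would need a stricter test.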



Re: [systemd-devel] [PATCH] Document CONFIG_NET_NS as a required kernel option

2014-04-09 Thread Lennart Poettering
On Wed, 09.04.14 11:21, Tom Gundersen (t...@jklm.no) wrote:

 
 On Mon, Mar 31, 2014 at 8:28 PM, Mike Gilbert flop...@gentoo.org wrote:
  Several units now utilize the PrivateNetwork parameter, which requires
  network namespace support.

BTW, this really sounds like something where we should have graceful
degradation: if network namespaces are missing we should probably simply
ignore PrivateNetwork= (maybe print a one-time warning to syslog, just
to mention this), and proceed without them. After all this is a
feature that just takes away features, and doesn't add any, thus simply
ignoring it should be safe.

I'd be happy to merge a patch which implements such a scheme to support
kernels with a more limited feature set.
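
A rough sketch of what such a scheme could look like, assuming it is built directly on unshare(2); this is not the systemd implementation. The function names are hypothetical, and the choice to treat only EINVAL as "kernel lacks network namespaces" is an assumption based on the documented behaviour of CLONE_NEWNET on kernels without CONFIG_NET_NS.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical policy: EINVAL is what a kernel without CONFIG_NET_NS
 * reports for CLONE_NEWNET, so it is the only error we degrade on.
 * Anything else (for example EPERM) is still treated as a real failure.
 */
static bool netns_error_is_ignorable(int err)
{
    return err == EINVAL;
}

/*
 * Applies a PrivateNetwork-like setting. Returns 0 if the namespace was
 * created, if it was not requested, or if the kernel cannot provide it
 * (after warning once). Returns a negative errno value otherwise.
 */
static int apply_private_network(bool requested)
{
    static bool warned = false;
    int err;

    if (!requested)
        return 0;

    if (unshare(CLONE_NEWNET) == 0)
        return 0;

    err = errno;
    if (netns_error_is_ignorable(err)) {
        if (!warned) {
            fprintf(stderr, "warning: network namespaces unavailable (%s), "
                            "ignoring PrivateNetwork=\n", strerror(err));
            warned = true;
        }
        return 0;
    }

    return -err;
}
```

The one-time flag mirrors the suggestion above to warn once rather than on every unit start.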

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] Move use of locale_t to different shared file, so that udev can still be built without it (MinimalBuild)

2014-04-09 Thread Lennart Poettering
On Wed, 09.04.14 09:40, Samuli Suominen (ssuomi...@gentoo.org) wrote:

  This is the *only* patch we are carrying for udev currently, otherwise
  uClibc builds work fine, so please at least consider what
  I just said.
 
  All this locale_t thing is standardized in POSIX 2008; it is up to
  the particular libc to keep up with standards.
 
  systemd requires glibc, this is known and clear since the beginning.
 
 systemd-udevd doesn't, albeit there is secure_getenv, but it's currently
 used only for logging, and note that there is still sysv support in the
 source tree.

udevd only officially supports glibc, too. The entire systemd tree is
focussed on glibc, and nothing else.

If you want to run the stuff with other libcs, that's totally fine, but
please understand that incompatibilities with glibc need to be fixed in
those libcs, not in systemd. We will not add work-arounds for limited
libcs to systemd. 

Or to turn this around: if you want us to apply your locale_t change
then show us that you can compile recent glibcs without support for
locale_t. Otherwise we are not interested, sorry.

Lennart

-- 
Lennart Poettering, Red Hat


[systemd-devel] [PATCH] nspawn: Fix erroneous OOM when building group list

2014-04-09 Thread Philip Lorenz
change_uid_gid() never initialises sz which may cause greedy_realloc to
skip the initial buffer allocation.
---
 src/nspawn/nspawn.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/nspawn/nspawn.c b/src/nspawn/nspawn.c
index 84724d5..0bd52da 100644
--- a/src/nspawn/nspawn.c
+++ b/src/nspawn/nspawn.c
@@ -2366,7 +2366,7 @@ static int change_uid_gid(char **_home) {
         _cleanup_fclose_ FILE *f = NULL;
         _cleanup_close_ int fd = -1;
         unsigned n_uids = 0;
-        size_t sz, l;
+        size_t sz = 0, l;
         uid_t uid;
         gid_t gid;
         pid_t pid;
-- 
1.9.1
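
To illustrate why the missing initialisation shows up as an out-of-memory error, here is a simplified sketch of a greedy reallocation helper. The name grow_buffer() is hypothetical and the body is only an approximation of the pattern, not a copy of systemd's greedy_realloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Simplified greedy reallocation: grows *p to at least `need` bytes and
 * records the new capacity in *allocated. Returns the buffer, or NULL if
 * the allocation failed.
 *
 * The shortcut at the top trusts *allocated. If the caller passes an
 * uninitialised (garbage, possibly huge) capacity together with a NULL
 * buffer, the shortcut returns NULL without ever calling realloc(),
 * which the caller cannot tell apart from a real out-of-memory failure.
 */
static void *grow_buffer(void **p, size_t *allocated, size_t need)
{
    size_t new_size;
    void *q;

    assert(p != NULL && allocated != NULL);

    if (*allocated >= need)
        return *p;

    new_size = need * 2 > 64 ? need * 2 : 64;
    q = realloc(*p, new_size);
    if (q == NULL)
        return NULL;

    *p = q;
    *allocated = new_size;
    return q;
}
```

Starting from a NULL buffer with the capacity set to 0 makes the first call allocate as intended, which is the effect of the one-line change in the patch.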



Re: [systemd-devel] [PATCH] [RFC] Ignore OOMScoreAdjust in Linux containers

2014-04-09 Thread Tom Gundersen
On Mon, Apr 7, 2014 at 9:47 PM, Richard Weinberger rich...@nod.at wrote:
 At least LXC does not allow the container root to change
 the OOM Score adjust value.

 Signed-off-by: Richard Weinberger rich...@nod.at
 ---
 Hi!

 Within Linux containers we cannot use OOMScoreAdjust nor
 CapabilityBoundingSet (and maybe more related settings).
 This patch tells systemd to ignore OOMScoreAdjust if it detects
 a container.

 Are you fine with such a change?
 Otherwise regular distros need a lot of changes in their .service files
 to make them work within LXC.

 As detect_virtualization() detects more than LXC we have to find out
 whether OOMScoreAdjust cannot be used on OpenVZ and other containers as well.

 I'd volunteer to identify all settings and send patches...

Hm, is there a fundamental reason why this is not possible in
containers in general, or is it simply an LXC restriction? Regardless,
would it not be best to simply degrade gracefully and ignore the
setting with a warning if it fails? See the comment Lennart just
posted on the recent PrivateNetwork= patch. This sounds like a very
similar situation.

Cheers,

Tom

  src/core/load-fragment.c | 7 +++
  1 file changed, 7 insertions(+)

 diff --git a/src/core/load-fragment.c b/src/core/load-fragment.c
 index c604f90..13f6107 100644
 --- a/src/core/load-fragment.c
 +++ b/src/core/load-fragment.c
 @@ -59,6 +59,7 @@
  #include "bus-error.h"
  #include "errno-list.h"
  #include "af-list.h"
 +#include "virt.h"

  #ifdef HAVE_SECCOMP
  #include "seccomp-util.h"
 @@ -423,6 +424,12 @@ int config_parse_exec_oom_score_adjust(const char* unit,
  assert(rvalue);
  assert(data);

 +if (detect_virtualization(NULL) == VIRTUALIZATION_CONTAINER) {
 +log_syntax(unit, LOG_ERR, filename, line, EPERM,
 +   "Setting the OOM score adjust value is not allowed within containers");
 +return 0;
 +}
 +
  r = safe_atoi(rvalue, &oa);
  if (r < 0) {
  log_syntax(unit, LOG_ERR, filename, line, -r,
 --
 1.8.4.2



Re: [systemd-devel] [PATCH] metadata: reflect change in task_cgroup_name

2014-04-09 Thread Hristo Venev
On Wed, 2014-04-09 at 15:04 +0200, Daniel Mack wrote:
 Eh. Thanks for spotting this. However, I think we should have a compat
 workaround for 3.14, for at least a couple of weeks. We can drop it
 after that. Could you amend your patch for that?

How do I check if the kernel version is 3.14.0 or 3.14.0+?
linux/version.h is the same. I've amended the patch to check for 3.15.0
so it will be broken until the release of 3.15.0-rc1.
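
The limitation described here follows from how the version code is encoded. As a sketch, assuming the same arithmetic as the kernel's KERNEL_VERSION(a,b,c) macro of that era (the function names below are hypothetical and chosen so they do not clash with linux/version.h):

```c
#include <assert.h>

/*
 * Mirrors the classic KERNEL_VERSION(a, b, c) encoding: the minor and
 * patch components get 8 bits each.
 */
static unsigned sketch_version_code(unsigned major, unsigned minor,
                                    unsigned patch)
{
    assert(minor < 256 && patch < 256);
    return (major << 16) + (minor << 8) + patch;
}

/*
 * The compile-time check in the amended patch, expressed as a function:
 * nonzero if the pointer-returning task_cgroup_path() should be assumed.
 */
static int has_pointer_returning_cgroup_path(unsigned version_code)
{
    return version_code >= sketch_version_code(3, 15, 0);
}
```

A tree that has already merged the API change but still carries the 3.14.0 version number produces the same code as the 3.14.0 release, so a version comparison cannot tell the two apart. That is why the check only becomes correct once the version is bumped for 3.15-rc1.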

 Apart from that, please take care to follow the kernel CodingStyle. In
 particular, we use tabs for indentation, not spaces.

Done.

---
 metadata.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/metadata.c b/metadata.c
index 5b47bb2..1dab96e 100644
--- a/metadata.c
+++ b/metadata.c
@@ -23,6 +23,7 @@
 #include <linux/sizes.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
+#include <linux/version.h>
 
 #include "connection.h"
 #include "message.h"
@@ -335,18 +336,29 @@ static int kdbus_meta_append_caps(struct kdbus_meta *meta)
 #ifdef CONFIG_CGROUPS
 static int kdbus_meta_append_cgroup(struct kdbus_meta *meta)
 {
-   char *tmp;
+   char *buf, *path;
int ret;
 
-   tmp = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
-   if (!tmp)
+   buf = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
+   if (!buf)
return -ENOMEM;
 
-   ret = task_cgroup_path(current, tmp, PAGE_SIZE);
-   if (ret >= 0)
-   ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, tmp);
+   #if LINUX_VERSION_CODE >= KERNEL_VERSION(3,15,0)
+   path = task_cgroup_path(current, buf, PAGE_SIZE);
+   #else
+   ret = task_cgroup_path(current, buf, PAGE_SIZE);
+   if (ret < 0)
+   path = NULL;
+   else
+   path = buf;
+   #endif
 
-   free_page((unsigned long) tmp);
+   if (path)
+   ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, path);
+   else
+   ret = -ENAMETOOLONG;
+
+   free_page((unsigned long) buf);
 
return ret;
 }
-- 
1.9.1




Re: [systemd-devel] [PATCH] metadata: reflect change in task_cgroup_name

2014-04-09 Thread Daniel Mack
On 04/09/2014 07:20 PM, Hristo Venev wrote:
 On Wed, 2014-04-09 at 15:04 +0200, Daniel Mack wrote:
 Eh. Thanks for spotting this. However, I think we should have a compat
 workaround for 3.14, for at least a couple of weeks. We can drop it
 after that. Could you amend your patch for that?
 
 How do I check if the kernel version is 3.14.0 or 3.14.0+?
 linux/version.h is the same. I've amended the patch to check for 3.15.0
 so it will be broken until the release of 3.15.0-rc1.
 
 Apart from that, please take care to follow the kernel CodingStyle. In
 particular, we use tabs for indentation, not spaces.
 
 Done.

Thanks, applied with a small style fixup: I moved the # characters to
the first column.


Thanks,
Daniel


 
 ---
  metadata.c | 26 +++---
  1 file changed, 19 insertions(+), 7 deletions(-)
 
 diff --git a/metadata.c b/metadata.c
 index 5b47bb2..1dab96e 100644
 --- a/metadata.c
 +++ b/metadata.c
 @@ -23,6 +23,7 @@
  #include <linux/sizes.h>
  #include <linux/slab.h>
  #include <linux/uaccess.h>
 +#include <linux/version.h>
  
  #include "connection.h"
  #include "message.h"
 @@ -335,18 +336,29 @@ static int kdbus_meta_append_caps(struct kdbus_meta *meta)
  #ifdef CONFIG_CGROUPS
  static int kdbus_meta_append_cgroup(struct kdbus_meta *meta)
  {
 - char *tmp;
 + char *buf, *path;
   int ret;
  
 - tmp = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
 - if (!tmp)
 + buf = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
 + if (!buf)
   return -ENOMEM;
  
 - ret = task_cgroup_path(current, tmp, PAGE_SIZE);
 - if (ret >= 0)
 - ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, tmp);
 + #if LINUX_VERSION_CODE >= KERNEL_VERSION(3,15,0)
 + path = task_cgroup_path(current, buf, PAGE_SIZE);
 + #else
 + ret = task_cgroup_path(current, buf, PAGE_SIZE);
 + if (ret < 0)
 + path = NULL;
 + else
 + path = buf;
 + #endif
  
 - free_page((unsigned long) tmp);
 + if (path)
 + ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, path);
 + else
 + ret = -ENAMETOOLONG;
 +
 + free_page((unsigned long) buf);
  
   return ret;
  }
 



Re: [systemd-devel] [PATCH] [RFC] Ignore OOMScoreAdjust in Linux containers

2014-04-09 Thread Richard Weinberger
On 09.04.2014 19:19, Tom Gundersen wrote:
 On Mon, Apr 7, 2014 at 9:47 PM, Richard Weinberger rich...@nod.at wrote:
 At least LXC does not allow the container root to change
 the OOM Score adjust value.

 Signed-off-by: Richard Weinberger rich...@nod.at
 ---
 Hi!

 Within Linux containers we cannot use OOMScoreAdjust nor
 CapabilityBoundingSet (and maybe more related settings).
 This patch tells systemd to ignore OOMScoreAdjust if it detects
 a container.

 Are you fine with such a change?
 Otherwise regular distros need a lot of changes in their .service files
 to make them work within LXC.

 As detect_virtualization() detects more than LXC we have to find out
 whether OOMScoreAdjust cannot be used on OpenVZ and other containers as well.

 I'd volunteer to identify all settings and send patches...
 
 Hm, is there a fundamental reason why this is not possible in
 containers in general, or is it simply an LXC restriction? Regardless,
 would it not be best to simply degrade gracefully and ignore the
 setting with a warning if it fails? See the comment Lennart just
 posted on the recent PrivateNetwork= patch. This sounds like a very
 similar situation.

Writing to oom_score_adj is disallowed by design within user namespaces.
Please see: https://lkml.org/lkml/2013/4/25/596

I'm also fine with ignoring OOMScoreAdjust if it fails.
All I want is a painless Linux userspace on top of systemd within
my Containers. :-)
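
As a sketch of the "ignore it if it fails" variant, assuming the setting is applied by writing to a procfs file and that a refused write shows up as EACCES or EPERM (both the errno values and the function names are assumptions made for illustration, not taken from systemd):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed set of "not permitted here" errors that should only warn. */
static bool oom_adjust_error_is_ignorable(int err)
{
    return err == EACCES || err == EPERM;
}

/*
 * Writes `value` to `path` (normally /proc/self/oom_score_adj; the path
 * is a parameter so the sketch can be exercised elsewhere). Returns 0 on
 * success or when the write was refused for permission reasons, and a
 * negative errno value for any other failure.
 */
static int write_oom_score_adj(const char *path, int value)
{
    FILE *f;
    int err = 0;

    assert(path != NULL);

    f = fopen(path, "w");
    if (f == NULL) {
        err = errno;
    } else {
        if (fprintf(f, "%d\n", value) < 0)
            err = errno;
        /* The actual write may only happen when the stream is flushed. */
        if (fclose(f) != 0 && err == 0)
            err = errno;
    }

    if (err == 0)
        return 0;

    if (oom_adjust_error_is_ignorable(err)) {
        fprintf(stderr, "warning: cannot set OOM score adjust (%s), ignoring\n",
                strerror(err));
        return 0;
    }

    return -err;
}
```

Unlike the container check in the original patch, this approach needs no knowledge of which container managers impose the restriction: it simply reacts to the failure when it happens.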

Thanks,
//richard


Re: [systemd-devel] [PATCH] metadata: reflect change in task_cgroup_name

2014-04-09 Thread Djalal Harouni
On Wed, Apr 09, 2014 at 07:28:42PM +0200, Daniel Mack wrote:
 On 04/09/2014 07:20 PM, Hristo Venev wrote:
  On Wed, 2014-04-09 at 15:04 +0200, Daniel Mack wrote:
  Eh. Thanks for spotting this. However, I think we should have a compat
  workaround for 3.14, for at least a couple of weeks. We can drop it
  after that. Could you amend your patch for that?
  
  How do I check if the kernel version is 3.14.0 or 3.14.0+?
  linux/version.h is the same. I've amended the patch to check for 3.15.0
  so it will be broken until the release of 3.15.0-rc1.
  
  Apart from that, please take care to follow the kernel CodingStyle. In
  particular, we use tabs for indentation, not spaces.
  
  Done.
 
 Thanks, applied with a small style fixup: I moved the # characters to
 the first column.

Daniel, the commit log suggests it's task_cgroup_name() where it should
be task_cgroup_path()

Thanks!

-- 
Djalal Harouni
http://opendz.org


Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts

2014-04-09 Thread Vivek Goyal
On Wed, Apr 09, 2014 at 05:36:13PM +0800, WANG Chao wrote:
 On 04/08/14 at 06:02pm, Vivek Goyal wrote:
  On Tue, Apr 08, 2014 at 02:14:33AM +0200, Zbigniew Jędrzejewski-Szmek wrote:
  
  [..]
  Defining a new target which by default waits for all the local fs target
  sounds interesting. Again, I have the question, what will happen to
  local-fs-all.target if some device does not show up and say one of the
  mounts specified in /etc/fstab fails.
   The result is different for Requires= and for Wants=. Iff there's a chain
   of Requires= from the failing unit (.device in this case) to the target unit
   it will fail. Otherwise, it'll just be delayed. If, as I suggested above,
   local-fs-all.target would have Requires= on the .mount units, then your
   unit could still have Wants=/After=local-fs-all.target, and it'll be
   started even if some mounts fail.
  
  Thanks now I understand the difference between Requires= and Wants=
  better.
  
   
  What we want is.
  
  - Wait for all devices to show up as specified in /etc/fstab. Run fsck
    on devices. Mount devices to mount points specified.
  
  - If everything is successful, things are fine and local-fs-all.target
    will be reached.
  
  - If some device does not show up, or if fsck fails or mount fails, still
    local-fs-all.target should be reached so that kdump module can detect
    that failure happened and can take alternative action.
   Alternatively, you can specify a soft dependency on local-fs-all.target by
   using Wants=local-fs-all.target. I think this is preferable, because we want
   local-fs-all.target to be as similar as possible to local-fs.target, which
   has Requires= on the mount points.
   
   With this caveat, this should all be satisfied with my proposal.
  
  Agreed. We could define Wants=local-fs-all.target and that would make
  sure that our unit will be started even if local-fs-all.target fails.
  
   
 You can use OnFailure= to define unit(s) started when
 local-fs-all.target fails. But it sounds like you are not really
 interested in *all* filesystems, but in specific filesystems defined in
 kdump configuration.

Kdump scripts register with dracut as a pre-pivot hook. And I believe
that in initramfs environments /etc/fstab does not contain all
filesystems. It primarily contains root and any file system specified
on the dracut command line using the --mount option during initramfs generation.

So my understanding is that, given the fact that /etc/fstab is minimal in
initramfs, we should be fine waiting for all the fs specified.

Given the fact that we run under the dracut pre-pivot hook callback, I think
dracut-pre-pivot.service will have to create a dependency to run after
local-fs-all.target is reached.
   Hm, maybe. It would be good to get some input from Harald here.
   This is pretty specialized, so maybe it'd be better to have a separate unit
   positioned before or after or parallel to dracut-pre-pivot.service.
  
  I am just thinking out loud now. Taking a step back and going back to
  figure out why we introduced nofail to begin with.
  
  If I go through kexec-tools logs, it says nofail was introduced because
  otherwise we never reach initrd.target. I am wondering why that's the
  case. The current initrd.target seems to have the following.
  
  [Unit]
  Description=Initrd Target
  Requires=basic.target
  Conflicts=rescue.service rescue.target
  After=basic.target rescue.service rescue.target
  AllowIsolate=yes
  OnFailure=emergency.target
  OnFailureIsolate=yes
  ConditionPathExists=/etc/initrd-release
 
 dracut doesn't use this initrd.target. It uses the stock one from
 systemd:
 
 [Unit]
 Description=Initrd Default Target
 Documentation=man:systemd.special(7)
 OnFailure=emergency.target
 OnFailureIsolate=yes
 ConditionPathExists=/etc/initrd-release
 Requires=basic.target
 Wants=initrd-root-fs.target initrd-fs.target initrd-parse-etc.service
 After=initrd-root-fs.target initrd-fs.target basic.target rescue.service rescue.target
 AllowIsolate=yes
 
 In the sysroot.mount context, if we don't use nofail, then in case of root disk
 failure we will never reach initrd-root-fs.target and hence we never
 reach initrd.target, and dracut-pre-pivot.service never gets a chance to
 start.

Ok, I want to understand what "never reach a target" means.

So with the nofail option for rootfs we should have the following situation.

- sysroot.mount
  Before=initrd-root-fs.target
- initrd-root-fs.target
  Requires=sysroot.mount
  OnFailure=emergency.target
- initrd.target
  Wants=initrd-root-fs.target
  OnFailure=emergency.target
- dracut-pre-pivot.service
  After=initrd.target sysroot.mount
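
To make the Requires= versus Wants= distinction concrete, the chain listed above can be modelled with a few lines of C. This is only a toy model under simplifying assumptions (one dependency per unit, no ordering, no OnFailure= handling, no timeouts); the struct and function names are invented for illustration and are not systemd code.

```c
#include <stdbool.h>
#include <stddef.h>

enum dep_kind { DEP_REQUIRES, DEP_WANTS };

struct unit {
	const char *name;
	bool own_failure;        /* the unit itself failed to activate */
	const struct unit *dep;  /* single dependency, NULL if none */
	enum dep_kind kind;      /* how this unit depends on dep */
};

/*
 * In this model a unit fails if it failed itself, or if a unit it
 * Requires= failed. A Wants= dependency never propagates failure.
 */
static bool unit_failed(const struct unit *u)
{
	if (u->own_failure)
		return true;
	if (u->dep != NULL && u->kind == DEP_REQUIRES)
		return unit_failed(u->dep);
	return false;
}

/* The scenario from the list: the root device never shows up. */
static const struct unit sysroot_mount = {
	"sysroot.mount", true, NULL, DEP_WANTS
};
static const struct unit initrd_root_fs = {
	"initrd-root-fs.target", false, &sysroot_mount, DEP_REQUIRES
};
static const struct unit initrd_target = {
	"initrd.target", false, &initrd_root_fs, DEP_WANTS
};
```

In this model unit_failed(&initrd_root_fs) is true while unit_failed(&initrd_target) is false, which matches the earlier explanation that only a chain of Requires= carries the failure to the target unit.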

Now let us say sysroot.mount failed activation because the root device did not
show up. We waited for a certain time interval, then timed out. Now what will
happen to 

[systemd-devel] Trying to debug bug 76468

2014-04-09 Thread Umut Tezduyar Lindskog
Hi,

Trying to debug https://bugs.freedesktop.org/show_bug.cgi?id=76468
which seems to be sd_bus related.

The problem happens when we mark the sd_event as SD_EVENT_FINISHED
and then try to call sd_event_source_set_enabled() on an event source
that belongs to the FINISHED sd_event.

Inside the sd_bus_detach_event() function, isn't it enough to just
unref the sd_bus event sources instead of both calling
sd_event_source_set_enabled() and then unrefing them?

Thanks.


Re: [systemd-devel] [PATCH] [RFC] Ignore OOMScoreAdjust in Linux containers

2014-04-09 Thread Tom Gundersen
On Wed, Apr 9, 2014 at 7:39 PM, Richard Weinberger rich...@nod.at wrote:
 On 09.04.2014 19:19, Tom Gundersen wrote:
 On Mon, Apr 7, 2014 at 9:47 PM, Richard Weinberger rich...@nod.at wrote:
 At least LXC does not allow the container root to change
 the OOM Score adjust value.

 Signed-off-by: Richard Weinberger rich...@nod.at
 ---
 Hi!

 Within Linux containers we cannot use OOMScoreAdjust nor
 CapabilityBoundingSet (and maybe more related settings).
 This patch tells systemd to ignore OOMScoreAdjust if it detects
 a container.

 Are you fine with such a change?
 Otherwise regular distros need a lot of changes in their .service files
 to make them work within LXC.

 As detect_virtualization() detects more than LXC we have to find out
 whether OOMScoreAdjust cannot be used on OpenVZ and other containers as well.

 I'd volunteer to identify all settings and send patches...

 Hm, is there a fundamental reason why this is not possible in
 containers in general, or is it simply an LXC restriction? Regardless,
 would it not be best to simply degrade gracefully and ignore the
 setting with a warning if it fails? See the comment Lennart just
 posted on the recent PrivateNetwork= patch. This sounds like a very
 similar situation.

 Writing to oom_score_adj is disallowed by design within user namespaces.
 Please see: https://lkml.org/lkml/2013/4/25/596

But I guess we still want to use this in containers that don't use
user namespaces.

 I'm also fine with ignoring OOMScoreAdjust if it fails.

Sounds like the right way (might be other things like this too I suppose).

 All I want is a painless Linux userspace on top of systemd within
 my Containers. :-)

:)

-t


Re: [systemd-devel] LXC not working with systemd 209 or later

2014-04-09 Thread Leonid Isaev
Hi,

On Sat, 05 Apr 2014 22:04:05 +0100
John Lane syst...@jelmail.com wrote:

 [...]

 Ok, now this is weird. I have distilled the problem down to the bare bones.
 I have a build_container script 
 (http://pastebin.com/raw.php?i=RhDFhRZi) that will create a container 
 called testcontainer. It exhibits the problems I see. Now, if I rename 
 that container to, say testc, and restart it (changing nothing else at 
 all) then it works fine.
 
 I am totally confused but it appears that the container's name affects 
 how systemd operates...?

It is not the name but special characters in it -- is there anything weird
about your locale, etc. settings?

 
 if you can try it and see if the same happens to you that would be very 
 helpful.
 
 $ ./build_container
 
 $ lxc-start -n testcontainer
 
 it starts: will see journal output in the console boot messages, like 
 this: 30systemd[1]: Set hostname to test.
 you can log in as root. no password. Long delay. Eventual 
 user@0.service start operation timed out. Terminating
 You can then halt. slow to stop. user@0.service start operation timed 
 out. Terminating takes 90 seconds.
 Eventually stops, host prompt returned.

I ran your script in a freshly installed archlinux x86_64 VM and couldn't
reproduce what you are seeing, regardless of how I call the container...

 [...]
 Actually, you can avoid the above. Here's another test with just 
 lxc-create
 
 $ lxc-create -n testcontainer -t archlinux -- -P util-linux
 $ lxc-start -n testcontainer
 
 Same problem.

Same as above, no problem. 

Cheers,
-- 
Leonid Isaev
GnuPG key fingerprint: C0DF 20D0 C075 C3F1 E1BE  775A A7AE F6CB 164B 5A6D




Re: [systemd-devel] [PATCH] Add Mir to the list of session types

2014-04-09 Thread David Herrmann
Hi

On Thu, Apr 3, 2014 at 10:46 PM, Robert Ancell
robert.anc...@canonical.com wrote:
 Add Mir to the list of session types. This is implemented for LightDM
 in lp:~robert-ancell/lightdm/xdg-session-desktop [1].

 [1] 
 https://code.launchpad.net/~robert-ancell/lightdm/xdg-session-desktop/+merge/214108

 ---
  man/pam_systemd.xml  | 5 +++--
  man/sd_session_is_active.xml | 6 +++---
  src/login/logind-session.c   | 1 +
  src/login/logind-session.h   | 1 +
  src/systemd/sd-login.h   | 2 +-
  5 files changed, 9 insertions(+), 6 deletions(-)

Applied and pushed. But please use git-send-email next time as the
white-spaces are all broken in that patch.

Thanks
David


Re: [systemd-devel] [PATCH] [RFC] Ignore OOMScoreAdjust in Linux containers

2014-04-09 Thread Richard Weinberger
On 09.04.2014 20:28, Tom Gundersen wrote:
 On Wed, Apr 9, 2014 at 7:39 PM, Richard Weinberger rich...@nod.at wrote:
 On 09.04.2014 19:19, Tom Gundersen wrote:
 On Mon, Apr 7, 2014 at 9:47 PM, Richard Weinberger rich...@nod.at wrote:
 At least LXC does not allow the container root to change
 the OOM Score adjust value.

 Signed-off-by: Richard Weinberger rich...@nod.at
 ---
 Hi!

 Within Linux containers we cannot use OOMScoreAdjust nor
 CapabilityBoundingSet (and maybe more related settings).
 This patch tells systemd to ignore OOMScoreAdjust if it detects
 a container.

 Are you fine with such a change?
 Otherwise regular distros need a lot of changes in their .service files
 to make them work within LXC.

 As detect_virtualization() detects more than LXC we have to find out
 whether OOMScoreAdjust cannot be used on OpenVZ and other containers as
 well.

 I'd volunteer to identify all settings and send patches...

 Hm, is there a fundamental reason why this is not possible in
 containers in general, or is it simply an LXC restriction? Regardless,
 would it not be best to simply degrade gracefully and ignore the
 setting with a warning if it fails? See the comment Lennart just
 posted on the recent PrivateNetwork= patch. This sounds like a very
 similar situation.

 Writing to oom_score_adj is disallowed by design within user namespaces.
 Please see: https://lkml.org/lkml/2013/4/25/596
 
 But I guess we still want to use this in containers that don't use
 user namespaces.

Containers without user namespaces and a uid 0 user are horribly broken
and insecure.
They will hopefully die soon.

 I'm also fine with ignoring OOMScoreAdjust if it fails.
 
 Sounds like the right way (might be other things like this too I suppose).

Okay, I'll send patches for OOMScoreAdjust and other settings to ignore 
failures.
This way systemd can also support containers without user namespaces.
No matter how useful these are. (hello docker.io folks! ;))

Thanks,
//richard


Re: [systemd-devel] [PATCH] cgroup: After MemmoryAccounting=yes running scope has no memusage

2014-04-09 Thread Lennart Poettering
On Tue, 08.04.14 12:11, Stef Walter (s...@thewalter.net) wrote:

 Setting the 'MemoryAccounting' unit property to true puts the
 unit into the right cgroup, but memory.usage_in_bytes
 does not reflect the pages already allocated to the processes
 in that cgroup.
 
 This is because the memory.move_charge_at_immigrate needs to be
 set before migrating processes to the new cgroup memory
 controller path:
 
 https://www.kernel.org/doc/Documentation/cgroups/memory.txt
 
 The attached patch sets the memory.move_charge_at_immigrate to
 0x01 | 0x02 before migrating processes to a new memory cgroup.

To keep the list posted about this: we talked to the cgroup kernel guys
about this, and while it appears like the right thing to do, the
kernel logic behind the option doesn't work correctly, and hence we
shouldn't do this for now. And it appears likely that when the kernel
fixes the behaviour it will be turned on by default anyway, without
involving userspace.
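
For readers unfamiliar with the value the patch proposed, the 0x01 | 0x02 mask can be spelled out as in the sketch below. The bit meanings are taken from the cgroup v1 memory.txt document linked above; the macro and function names are invented for illustration. Given the conclusion above, this shows only what the setting means and is not something to apply for now.

```c
#include <stdio.h>

/* Bit values as described in cgroups/memory.txt (names are hypothetical) */
#define MOVE_CHARGE_ANON 0x01u /* anonymous pages and their swap */
#define MOVE_CHARGE_FILE 0x02u /* file pages: mmapped files, tmpfs/shmem */

/*
 * Format the string that would be written to
 * memory.move_charge_at_immigrate. Returns the number of characters
 * produced, as snprintf() does.
 */
static int format_move_charge_mask(char *buf, size_t len)
{
	return snprintf(buf, len, "%u\n", MOVE_CHARGE_ANON | MOVE_CHARGE_FILE);
}
```

The two bits combine to the decimal value 3, which is what ends up in the control file.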

Stef, is that a big problem for you? Not sure what else we can do about
this for now though...

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] [PATCH] [RFC] Ignore OOMScoreAdjust in Linux containers

2014-04-09 Thread Cristian Rodríguez

On 09/04/14 16:41, Richard Weinberger wrote:
 Sounds like the right way (might be other things like this too I suppose).


Okay, I'll send patches for OOMScoreAdjust and other settings to ignore 
failures.
This way systemd can also support containers without user namespaces.
No matter how useful these are. (hello docker.io folks! ;))


Ensure you write a log_info() entry when it is ignored, including the
reason why.
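
A minimal sketch of the behaviour discussed in this thread, assuming plain stdio instead of systemd's own helpers: try to write the adjustment, and if that fails, log the reason and carry on. The function name and return convention are hypothetical, and fprintf() to stderr stands in for log_info().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not systemd's actual code.
 * Returns 1 if the value was written, 0 if the failure was ignored.
 */
static int apply_oom_score_adjust(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (f == NULL) {
		err = errno;
	} else {
		if (fprintf(f, "%d\n", value) < 0)
			err = errno;
		/* stdio may defer the actual write until the stream is closed */
		if (fclose(f) != 0 && err == 0)
			err = errno;
	}

	if (err != 0) {
		fprintf(stderr,
			"Failed to set OOM score adjust to %d via %s, ignoring: %s\n",
			value, path, strerror(err));
		return 0;
	}

	return 1;
}
```

A caller would use something like apply_oom_score_adjust("/proc/self/oom_score_adj", -900) and continue starting the service whatever the result.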




--
Cristian
I don't know the key to success, but the key to failure is trying to 
please everybody.



Re: [systemd-devel] [PATCH] Document CONFIG_NET_NS as a required kernel option

2014-04-09 Thread Zbigniew Jędrzejewski-Szmek
On Wed, Apr 09, 2014 at 07:35:15PM -0400, Mike Gilbert wrote:
 On Wed, Apr 9, 2014 at 12:32 PM, Lennart Poettering
 lenn...@poettering.net wrote:
  On Wed, 09.04.14 11:21, Tom Gundersen (t...@jklm.no) wrote:
 
 
  On Mon, Mar 31, 2014 at 8:28 PM, Mike Gilbert flop...@gentoo.org wrote:
   Several units now utilize the PrivateNetwork parameter, which requires
   network namespace support.
 
  BTW, this really sounds like something where we should have graceful
  degradation: if network namespaces are missing we should probably simply
  ignore PrivateNetwork= (maybe print a one-time warning to syslog, just
  to mention this), and proceed without them. After all this is a
  feature that just takes away features, and doesn't add any, thus simply
  ignoring it should be safe.
 
  I'd be happy to merge a patch which implements such a scheme to support
  kernels with a more limited feature set.
 
 
 Are there any examples of how to implement a one-time warning for
 this sort of thing?
Not in systemd, I think.

Simply try:
  int func(...) {
          static bool network_namespace_warning = false;
          ...

          if (!network_namespace_warning) {
                  /* set the flag first so the warning is emitted only once */
                  network_namespace_warning = true;
                  log_warning(...);
          }
  }
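
A compilable variant of the same idea, for anyone who wants to try it outside the systemd tree. It assumes fprintf() to stderr in place of systemd's logging macros, and the function name and message text are made up. The static flag is not thread-safe, which is acceptable only if the caller is single-threaded.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: prints the warning on the first call only.
 * Returns true if the warning was printed by this call.
 */
static bool warn_network_namespace_once(void)
{
	static bool network_namespace_warning = false;

	if (network_namespace_warning)
		return false;

	network_namespace_warning = true;
	fprintf(stderr,
		"Network namespaces are not available, ignoring PrivateNetwork=.\n");
	return true;
}
```

Calling it three times in a row prints the message once; the second and third calls return false without printing.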

Zbyszek


Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts

2014-04-09 Thread Andrey Borzenkov
On Wed, 9 Apr 2014 13:49:47 -0400
Vivek Goyal vgo...@redhat.com wrote:

 On Wed, Apr 09, 2014 at 05:36:13PM +0800, WANG Chao wrote:
  On 04/08/14 at 06:02pm, Vivek Goyal wrote:
   On Tue, Apr 08, 2014 at 02:14:33AM +0200, Zbigniew Jędrzejewski-Szmek 
   wrote:
   
   [..]
   Defining a new target which by default waits for all the local fs 
   target
   sounds interesting. Again, I have the question, what will happen 
   to 
   local-fs-all.target if some device does not show up and say one 
   of the
   mounts specified in /etc/fstab fails.
The result is different for Requires= and for Wants=. Iff there's a chain
of Requires= from the failing unit (.device in this case) to the target unit
it will fail. Otherwise, it'll just be delayed. If, as I suggested above,
local-fs-all.target would have Requires= on the .mount units, then your
unit could still have Wants=/After=local-fs-all.target, and it'll be
started even if some mounts fail.
   
   Thanks now I understand the difference between Requires= and Wants=
   better.
   

   What we want is.
   
   - Wait for all devices to show up as specified in /etc/fstab. Run 
   fsck
 on devices. Mount devices to mount points specified.
   
   - If everything is successful, things are fine and 
   local-fs-all.target
 will be reached.
   
   - If some device does not show up, or if fsck fails or mount 
   fails, still
 local-fs-all.target should reach so that kdump module can 
   detect that
 failure happened and can take alternative action.
Alternatively, you can specify a soft dependency on local-fs-all.target by
using Wants=local-fs-all.target. I think this is preferable, because we want
local-fs-all.target to be as similar as possible to local-fs.target, which
has Requires= on the mount points.

With this caveat, this should all be satisfied with my proposal.
   
   Agreed. We could define Wants=local-fs-all.target and that would make
   sure that our unit will be started even if local-fs-all.target fails.
   

  You can use OnFailure= to define unit(s) started when
  local-fs-all.target fails. But it sounds like you are not really
  interested in *all* filesystems, but in specific filesystems defined in
  kdump configuration.
 
 Kdump scripts register with dracut as a pre-pivot hook. And I believe
 that in initramfs environments /etc/fstab does not contain all
 filesystems. It primarily contains root and any file system specified
 on the dracut command line using the --mount option during initramfs
 generation.
 
 So my understanding is that, given the fact that /etc/fstab is minimal in
 initramfs, we should be fine waiting for all the fs specified.
 
 Given the fact that we run under the dracut pre-pivot hook callback, I
 think dracut-pre-pivot.service will have to create a dependency to run after
 local-fs-all.target is reached.
Hm, maybe. It would be good to get some input from Harald here.
This is pretty specialized, so maybe it'd be better to have a separate 
unit
positioned before or after or parallel to dracut-pre-pivot.service.
   
   I am just thinking out loud now. Taking a step back and going back to
   figure out why we introduced nofail to begin with.
   
   If I go through kexec-tools logs, it says nofail was introduced because
   otherwise we never reach initrd.target. I am wondering why that's the
   case. The current initrd.target seems to have the following.
   
   [Unit]
   Description=Initrd Target
   Requires=basic.target
   Conflicts=rescue.service rescue.target
   After=basic.target rescue.service rescue.target
   AllowIsolate=yes
   OnFailure=emergency.target
   OnFailureIsolate=yes
   ConditionPathExists=/etc/initrd-release
  
  dracut doesn't use this initrd.target. It uses the stock one from
  systemd:
  
  [Unit]
  Description=Initrd Default Target
  Documentation=man:systemd.special(7)
  OnFailure=emergency.target
  OnFailureIsolate=yes
  ConditionPathExists=/etc/initrd-release
  Requires=basic.target
  Wants=initrd-root-fs.target initrd-fs.target initrd-parse-etc.service
  After=initrd-root-fs.target initrd-fs.target basic.target rescue.service rescue.target
  AllowIsolate=yes
  
  In the sysroot.mount context, if we don't use nofail, then in case of root disk
  failure we will never reach initrd-root-fs.target and hence we never
  reach initrd.target, and dracut-pre-pivot.service never gets a chance to
  start.
 
 Ok, I want to understand what "never reach a target" means.
 
 So with the nofail option for rootfs we should have the following situation.
 
 - sysroot.mount
   Before=initrd-root-fs.target
 - initrd-root-fs.target
   Requires=sysroot.mount
   OnFailure=emergency.target
 - initrd.target
   Wants=initrd-root-fs.target
   OnFailure=emergency.target
 - 

[systemd-devel] [PATCH] units: add ConditionPathIsReadWrite for systemd-random-seed.service

2014-04-09 Thread Jonathan Liu
---
 units/systemd-random-seed.service.in | 1 +
 1 file changed, 1 insertion(+)

diff --git a/units/systemd-random-seed.service.in b/units/systemd-random-seed.service.in
index 1879b2f..cbe000c 100644
--- a/units/systemd-random-seed.service.in
+++ b/units/systemd-random-seed.service.in
@@ -13,6 +13,7 @@ RequiresMountsFor=@RANDOM_SEED@
 Conflicts=shutdown.target
 After=systemd-readahead-collect.service systemd-readahead-replay.service systemd-remount-fs.service
 Before=sysinit.target shutdown.target
+ConditionPathIsReadWrite=@RANDOM_SEED_DIR@
 
 [Service]
 Type=oneshot
-- 
1.9.1



Re: [systemd-devel] [PATCH] cgroup: After MemmoryAccounting=yes running scope has no memusage

2014-04-09 Thread Stef Walter
On 09.04.2014 23:45, Lennart Poettering wrote:
 On Tue, 08.04.14 12:11, Stef Walter (s...@thewalter.net) wrote:
 
 Setting the 'MemoryAccounting' unit property to true puts the
 unit into the right cgroup, but memory.usage_in_bytes
 does not reflect the pages already allocated to the processes
 in that cgroup.

 This is because the memory.move_charge_at_immigrate needs to be
 set before migrating processes to the new cgroup memory
 controller path:

 https://www.kernel.org/doc/Documentation/cgroups/memory.txt

 The attached patch sets the memory.move_charge_at_immigrate to
 0x01 | 0x02 before migrating processes to a new memory cgroup.
 
 To keep the list posted about this: we talked to the cgroup kernel guys
 about this, and while it appears like the right thing to do, the
 kernel logic behind the option doesn't work correctly, and hence we
 shouldn't do this for now. And it appears likely that when the kernel
 fixes the behaviour it will be turned on by default anyway, without
 involving userspace.
 
 Stef, is that a big problem for you? Not sure what else we can do about
 this for now though...

Yeah, this prevents us from displaying the amount of memory used by a
unit (or container) without first restarting it :S

But you can't gold plate a turd... So we probably have to expose the
limitations of the underlying stuff rather than work around them.

Stef

-- 

s...@thewalter.net
http://stef.thewalter.net