Re: [systemd-devel] CentOS CI test suites failing

2019-05-28 Thread František Šumšal
The out-of-space issue seems to be resolved, for now, and the systemd CentOS CI
script should respect the (apparently) newly introduced rate-limiting machinery.
I went through the PRs updated in the last few hours and re-triggered all
CentOS CI jobs, so it's now eating through the backlog. Given there are over 20
jobs to run, it might take a good portion of the night (it's midnight here), so
please be patient :-)

I'll check the status once again in the morning and try to go through any
unexpected failures (hopefully there won't be any).

On 5/28/19 10:35 PM, František Šumšal wrote:
> This might take a little bit longer than anticipated, as the Jenkins slave
> also ran out of space.
> 
> On 5/28/19 10:02 PM, František Šumšal wrote:
>> Hello!
>>
>> Thanks for the heads up. This was unfortunately caused by two simultaneous 
>> issues:
>>
>> 1) CentOS CI pool ran out of pre-installed machines
>> 2) I forgot to handle such a situation in the systemd CentOS CI code :-)
>>
>> After giving the CentOS CI a few hours to get back on track, I'll shortly
>> merge a patch[0] to handle it on the systemd side, and start slowly
>> re-triggering failed jobs for PRs.
>>
>> [0] https://github.com/systemd/systemd-centos-ci/pull/120
>>
>> On 5/28/19 8:39 PM, zach wrote:
>>> Looks like CentOS CI test suites are failing on multiple PRs with errors
>>> like those listed below. Any advice on how to get these passing again or
>>> who I could reach out to for help getting this back in order?
>>>
>>> 2019-05-28 16:04:26,307 [agent-control/] ERROR: Execution failed
>>> Traceback (most recent call last):
>>>   File "./agent-control.py", line 371, in 
>>> node, ssid = ac.allocate_node(args.version, args.arch)
>>>   File "./agent-control.py", line 82, in allocate_node
>>> jroot = json.loads(res)
>>>   File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
>>> return _default_decoder.decode(s)
>>>   File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
>>> obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>>>   File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
>>> raise ValueError("No JSON object could be decoded")
>>> ValueError: No JSON object could be decoded
>>> Traceback (most recent call last):
>>>   File "./agent-control.py", line 449, in 
>>> ac.free_session(ssid)
>>> NameError: name 'ssid' is not defined
>>> mv: cannot stat ‘artifacts_*’: No such file or directory
>>>
>>>
>>> ___
>>> systemd-devel mailing list
>>> systemd-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>>
>>
>>
>>
>> ___
>> systemd-devel mailing list
>> systemd-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>
> 
> 
> 
> ___
> systemd-devel mailing list
> systemd-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
> 


-- 
Frantisek Sumsal
GPG key ID: 0xFB738CE27B634E4B



___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] CentOS CI test suites failing

2019-05-28 Thread František Šumšal
This might take a little bit longer than anticipated, as the Jenkins slave also
ran out of space.

On 5/28/19 10:02 PM, František Šumšal wrote:
> Hello!
> 
> Thanks for the heads up. This was unfortunately caused by two simultaneous 
> issues:
> 
> 1) CentOS CI pool ran out of pre-installed machines
> 2) I forgot to handle such a situation in the systemd CentOS CI code :-)
> 
> After giving the CentOS CI a few hours to get back on track, I'll shortly
> merge a patch[0] to handle it on the systemd side, and start slowly
> re-triggering failed jobs for PRs.
> 
> [0] https://github.com/systemd/systemd-centos-ci/pull/120
> 
> On 5/28/19 8:39 PM, zach wrote:
>> Looks like CentOS CI test suites are failing on multiple PRs with errors
>> like those listed below. Any advice on how to get these passing again or who
>> I could reach out to for help getting this back in order?
>>
>> 2019-05-28 16:04:26,307 [agent-control/] ERROR: Execution failed
>> Traceback (most recent call last):
>>   File "./agent-control.py", line 371, in 
>> node, ssid = ac.allocate_node(args.version, args.arch)
>>   File "./agent-control.py", line 82, in allocate_node
>> jroot = json.loads(res)
>>   File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
>> return _default_decoder.decode(s)
>>   File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
>> obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>>   File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
>> raise ValueError("No JSON object could be decoded")
>> ValueError: No JSON object could be decoded
>> Traceback (most recent call last):
>>   File "./agent-control.py", line 449, in 
>> ac.free_session(ssid)
>> NameError: name 'ssid' is not defined
>> mv: cannot stat ‘artifacts_*’: No such file or directory
>>
>>
>> ___
>> systemd-devel mailing list
>> systemd-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>
> 
> 
> 
> ___
> systemd-devel mailing list
> systemd-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
> 


-- 
Frantisek Sumsal
GPG key ID: 0xFB738CE27B634E4B



___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] CentOS CI test suites failing

2019-05-28 Thread František Šumšal
Hello!

Thanks for the heads up. This was unfortunately caused by two simultaneous 
issues:

1) CentOS CI pool ran out of pre-installed machines
2) I forgot to handle such a situation in the systemd CentOS CI code :-)

After giving the CentOS CI a few hours to get back on track, I'll shortly
merge a patch[0] to handle it on the systemd side, and start slowly
re-triggering failed jobs for PRs.

[0] https://github.com/systemd/systemd-centos-ci/pull/120
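
For illustration, a minimal sketch (with hypothetical helper and field names,
not necessarily what the patch above does) of guarding the node allocation
against a non-JSON reply from the pool and keeping the cleanup path from
touching an unset ssid:

    import json
    import time

    def allocate_node_with_retry(ac, version, arch, attempts=5, delay=60):
        """Retry allocation while the pool has no pre-installed machines."""
        for _ in range(attempts):
            res = ac.request_node(version, arch)  # hypothetical raw API call
            try:
                jroot = json.loads(res)
            except ValueError:
                # Non-JSON reply, e.g. the pool is exhausted: back off and retry
                time.sleep(delay)
                continue
            # "hosts" and "ssid" are assumed field names for this sketch
            return jroot["hosts"][0], jroot["ssid"]
        raise RuntimeError("could not allocate a node from the CentOS CI pool")

    # The cleanup path should then only free a session that actually exists:
    # if ssid is not None:
    #     ac.free_session(ssid)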

On 5/28/19 8:39 PM, zach wrote:
> Looks like CentOS CI test suites are failing on multiple PRs with errors like
> those listed below. Any advice on how to get these passing again or who I
> could reach out to for help getting this back in order?
> 
> 2019-05-28 16:04:26,307 [agent-control/] ERROR: Execution failed
> Traceback (most recent call last):
>   File "./agent-control.py", line 371, in 
> node, ssid = ac.allocate_node(args.version, args.arch)
>   File "./agent-control.py", line 82, in allocate_node
> jroot = json.loads(res)
>   File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
> return _default_decoder.decode(s)
>   File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
> obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>   File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
> raise ValueError("No JSON object could be decoded")
> ValueError: No JSON object could be decoded
> Traceback (most recent call last):
>   File "./agent-control.py", line 449, in 
> ac.free_session(ssid)
> NameError: name 'ssid' is not defined
> mv: cannot stat ‘artifacts_*’: No such file or directory
> 
> 
> ___
> systemd-devel mailing list
> systemd-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
> 


-- 
Frantisek Sumsal
GPG key ID: 0xFB738CE27B634E4B



___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

[systemd-devel] CentOS CI test suites failing

2019-05-28 Thread zach
Looks like CentOS CI test suites are failing on multiple PRs with errors
like those listed below. Any advice on how to get these passing again or
who I could reach out to for help getting this back in order?

2019-05-28 16:04:26,307 [agent-control/] ERROR: Execution failed
Traceback (most recent call last):
  File "./agent-control.py", line 371, in 
node, ssid = ac.allocate_node(args.version, args.arch)
  File "./agent-control.py", line 82, in allocate_node
jroot = json.loads(res)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Traceback (most recent call last):
  File "./agent-control.py", line 449, in 
ac.free_session(ssid)
NameError: name 'ssid' is not defined
mv: cannot stat ‘artifacts_*’: No such file or directory
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] SystemCallFilter

2019-05-28 Thread Josef Moellers
On 28.05.19 16:59, Lennart Poettering wrote:
> On Di, 28.05.19 14:04, Josef Moellers (jmoell...@suse.de) wrote:
> 
>>> Regarding the syscall groupings: yes, the groups exist precisely to
>>> improve cases like this. That said, I think we should be careful not to
>>> have an inflation of groups, and we should ask twice whether a group
>>> is really desirable before adding it. I'd argue in the open/openat
>>> case the case is not strong enough though: writing a filter
>>> blacklisting those is very difficult, as it means you cannot run
>>> programs with dynamic libraries (as loading those requires
>>> open/openat), which hence limits the applications very much and
>>> restricts its use to very few, very technical cases. In those cases I
>>> have the suspicion the writer of the filters needs to know in very
>>> much detail what the semantics are anyway, and he hence isn't helped
>>> too much by this group.
>>>
>>> Note that the @file-system group already includes both, so maybe
>>> that's a more adequate solution? (not usable for blacklisting though,
>>> only for whitelisting, realistically).
>>>
>>> Hence, I would argue this is a documentation issue, not a bug
>>> really... Does that make sense?
>> Yes.
>>
>> Linux has always been a moving target and in very many circumstances
>> this has been A Good Idea!
>> I guess I'm too much old school and try to keep to the principle of
>> least surprise.
> 
> I added some docs about this to this PR:
> 
> https://github.com/systemd/systemd/pull/12686
> 
> ptal!

... and in the section about SystemCallErrorNumber, there is a duplicate
remark:

See (see errno(3) for a full list) for a full list of error codes.

... unless this is somehow mangled by the documentation builder.

Josef
-- 
SUSE Linux GmbH
Maxfeldstrasse 5
90409 Nuernberg
Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] SystemCallFilter

2019-05-28 Thread Josef Moellers
On 28.05.19 16:59, Lennart Poettering wrote:
> On Di, 28.05.19 14:04, Josef Moellers (jmoell...@suse.de) wrote:
> 
>>> Regarding the syscall groupings: yes, the groups exist precisely to
>>> improve cases like this. That said, I think we should be careful not to
>>> have an inflation of groups, and we should ask twice whether a group
>>> is really desirable before adding it. I'd argue in the open/openat
>>> case the case is not strong enough though: writing a filter
>>> blacklisting those is very difficult, as it means you cannot run
>>> programs with dynamic libraries (as loading those requires
>>> open/openat), which hence limits the applications very much and
>>> restricts its use to very few, very technical cases. In those cases I
>>> have the suspicion the writer of the filters needs to know in very
>>> much detail what the semantics are anyway, and he hence isn't helped
>>> too much by this group.
>>>
>>> Note that the @file-system group already includes both, so maybe
>>> that's a more adequate solution? (not usable for blacklisting though,
>>> only for whitelisting, realistically).
>>>
>>> Hence, I would argue this is a documentation issue, not a bug
>>> really... Does that make sense?
>> Yes.
>>
>> Linux has always been a moving target and in very many circumstances
>> this has been A Good Idea!
>> I guess I'm too much old school and try to keep to the principle of
>> least surprise.
> 
> I added some docs about this to this PR:
> 
> https://github.com/systemd/systemd/pull/12686
> 
> ptal!

BTDT. Could it be that it should be the other way around?

Note that various kernel system calls are defined redundantly: there are
multiple system calls for executing the same operation. For example, the
pidfd_send_signal() system call may be used to execute operations similar
to what can be done with the older kill() system call, hence blocking the
latter without the former only provides weak protection. Since new system
calls are added regularly to the kernel as development progresses keeping
system call whitelists comprehensive requires constant work. It is thus
recommended to use blacklisting instead, which offers the benefit that new
system calls are by default implicitly blocked until the whitelist is
updated.

Shouldn't that be "keeping system call blacklists comprehensive" and
"thus recommended to use whitelisting instead"?

Josef
-- 
SUSE Linux GmbH
Maxfeldstrasse 5
90409 Nuernberg
Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] SystemCallFilter

2019-05-28 Thread Lennart Poettering
On Di, 28.05.19 14:04, Josef Moellers (jmoell...@suse.de) wrote:

> > Regarding the syscall groupings: yes, the groups exist precisely to
> > improve cases like this. That said, I think we should be careful not to
> > have an inflation of groups, and we should ask twice whether a group
> > is really desirable before adding it. I'd argue in the open/openat
> > case the case is not strong enough though: writing a filter
> > blacklisting those is very difficult, as it means you cannot run
> > programs with dynamic libraries (as loading those requires
> > open/openat), which hence limits the applications very much and
> > restricts its use to very few, very technical cases. In those cases I
> > have the suspicion the writer of the filters needs to know in very
> > much detail what the semantics are anyway, and he hence isn't helped
> > too much by this group.
> >
> > Note that the @file-system group already includes both, so maybe
> > that's a more adequate solution? (not usable for blacklisting though,
> > only for whitelisting, realistically).
> >
> > Hence, I would argue this is a documentation issue, not a bug
> > really... Does that make sense?
> Yes.
>
> Linux has always been a moving target and in very many circumstances
> this has been A Good Idea!
> I guess I'm too much old school and try to keep to the principle of
> least surprise.

I added some docs about this to this PR:

https://github.com/systemd/systemd/pull/12686

ptal!

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] Password agent for user services

2019-05-28 Thread Simon McVittie
On Mon, 20 May 2019 at 11:49:42 +0200, Lennart Poettering wrote:
> Ideally some infrastructure like PK would supply this mechanism
> instead of us btw.

polkit is for controlled privilege escalation where an unprivileged user
asks a privileged system service to do something, and the system service
asks polkit whether that should be allowed to happen, with possible answers
that include yes, no, or a sudo-like "only if you re-authenticate first".
It also isn't an early-boot service (it needs D-Bus).

Things like prompting for the password for a LUKS volume are really
outside the scope of polkit, but it might make sense for there to be
some lower-level system-wide password prompting concept that can be used
by multiple things that need passwords: systemd, LUKS volume mounting,
polkit agents (the part that implements the "only if you re-authenticate"
policy), gnome-keyring, sudo and so on.

smcv
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] How to get hardware watchdog status when using systemd

2019-05-28 Thread Wiktor Kwapisiewicz

Hi Lennart,

On 28.05.2019 14:02, Lennart Poettering wrote:

The kernel devices are currently single-use only. Most of these fields
are exported via sysfs too however:

 grep . /sys/class/watchdog/watchdog0/*


Oh, that's very useful and indeed, there is a timeleft property there and
it's changing.
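
For example, assuming the driver exposes it (it apparently does here), the
countdown can be watched with something like:

    watch -n1 cat /sys/class/watchdog/watchdog0/timeleft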



I think it might make sense if util-linux' wdctl would alternatively
use the sysfs API for querying these bits, it should be fairly easy to
add, please file a bug requesting that on util-linux github page!


The bug has been filed under this URL:
https://github.com/karelzak/util-linux/issues/804

Thank you for your assistance!

Kind regards,
Wiktor

--
https://metacode.biz/@wiktor
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] How to get hardware watchdog status when using systemd

2019-05-28 Thread Wiktor Kwapisiewicz

On 28.05.2019 14:00, Zbigniew Jędrzejewski-Szmek wrote:

This currently isn't exported by systemd, and there's even no log
message at debug level. I guess this could be exposed, but I don't
think it'd be very useful. If the watchdog ping works, most people
don't need to look at it. If it doesn't, the machine should reboot...


Yes. Maybe I should be more clear about my goal: enabling the systemd
option to ping the watchdog doesn't have any visible effects until the
system hangs. I wanted to see an "effect" without resorting to "echo c >
/proc/sysrq-trigger".



If this is just for debugging, you can do something like

   sudo strace -e ioctl -p 1

and look for WDIOC_KEEPALIVE.


Yes, this is exactly what I was looking for! As far as I can see adding 
"-r" to strace prints nice relative times that correspond to the 
"RuntimeWatchdogSec" value.


Thank you very much!

Kind regards,
Wiktor

--
https://metacode.biz/@wiktor
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] SystemCallFilter

2019-05-28 Thread Josef Moellers
On 28.05.19 13:57, Lennart Poettering wrote:
> On Di, 28.05.19 11:43, Josef Moellers (jmoell...@suse.de) wrote:
> 
>> Hi,
>>
>> We just had an issue with a partner who tried to filter out the "open"
>> system call:
>>
>>     SystemCallFilter=~open
>>
>> This may, in general, not be a very clever idea because how is one to
>> load a shared library to start with, but this example has revealed
>> something problematic ...
>> The problem the partner had was that the filter just didn't work. No
>> matter what he tried, the test program ran to completion.
>>
>> It took us some time to figure out what caused this:
>> The test program relied on the fact that when it called open(), the
>> "open" system call would be used, which it doesn't any more. It uses the
>> "openat" system call instead (*).
>> Now it appears that this change is deliberate and so my question is what
>> to do about these cases.
>> Should one
>> * also filter out "openat" if only "open" is required?
>> * introduce a new group "@open" which filters both?
>>
>> I regard "SystemCallFilter" as a security measure and if one cannot rely
>> on mechanisms any more, what good is such a feature?
> 
> This is a general problem: as the kernel is developed, new system
> calls are added, and very often they provide redundat mechanisms to
> earlier system calls. Thiis is the case with openat() vs. open(), and
> for example very recently with pidfd_send_signal() vs. kill(). We will
> always play catch-up with that unfortunately, it's an inherent problem
> of the system call interface and applying blacklist filters to it, and
> we are not going to solve that, I fear.
> 
> I think we should add docs about this case, underlining two things:
> 
> 1. first of all: whitelist, don't blacklist! If so, all new system
>    calls are blocked by default, and thus the problem goes away.
> 
> 2. secondly: we should clarify that some system calls can be
>    circumvented by using others, and hence people must be ready for
>    that and not assume blacklists could ever be future-proof.
> 
> Regarding the syscall groupings: yes, the groups exist precisely to
> improve cases like this. That said, I think we should be careful not to
> have an inflation of groups, and we should ask twice whether a group
> is really desirable before adding it. I'd argue in the open/openat
> case the case is not strong enough though: writing a filter
> blacklisting those is very difficult, as it means you cannot run
> programs with dynamic libraries (as loading those requires
> open/openat), which hence limits the applications very much and
> restricts its use to very few, very technical cases. In those cases I
> have the suspicion the writer of the filters needs to know in very
> much detail what the semantics are anyway, and he hence isn't helped
> too much by this group.
> 
> Note that the @file-system group already includes both, so maybe
> that's a more adequate solution? (not usable for blacklisting though,
> only for whitelisting, realistically).
> 
> Hence, I would argue this is a documentation issue, not a bug
> really... Does that make sense?
Yes.

Linux has always been a moving target and in very many circumstances
this has been A Good Idea!
I guess I'm too much old school and try to keep to the principle of
least surprise.

Thanks for the (as ever polite!) response.

Josef
-- 
SUSE Linux GmbH
Maxfeldstrasse 5
90409 Nuernberg
Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] How to get hardware watchdog status when using systemd

2019-05-28 Thread Lennart Poettering
On Di, 28.05.19 13:50, Wiktor Kwapisiewicz (wik...@metacode.biz) wrote:

> Hi Zbyszek,
>
> On 28.05.2019 13:43, Zbigniew Jędrzejewski-Szmek wrote:
> > What kind of information are you after?
>
> One interesting statistic I'd like to see changing is the time when the
> watchdog was notified last.
>
> For example, there is Timeleft in this wdctl output [0]:
>
>   # wdctl
>   Identity:  iTCO_wdt [version 0]
>   Timeout:   30 seconds
>   Timeleft:   2 seconds
>
>   FLAG   DESCRIPTION   STATUS BOOT-STATUS
>   KEEPALIVEPING  Keep alive ping reply  0   0
>   MAGICCLOSE Supports magic close char  0   0
>   SETTIMEOUT Set timeout (in seconds)   0   0
>

The kernel devices are currently single-use only. Most of these fields
are exported via sysfs too however:

grep . /sys/class/watchdog/watchdog0/*

I think it might make sense if util-linux' wdctl would alternatively
use the sysfs API for querying these bits, it should be fairly easy to
add, please file a bug requesting that on util-linux github page!

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] How to get hardware watchdog status when using systemd

2019-05-28 Thread Zbigniew Jędrzejewski-Szmek
On Tue, May 28, 2019 at 01:50:53PM +0200, Wiktor Kwapisiewicz wrote:
> Hi Zbyszek,
> 
> On 28.05.2019 13:43, Zbigniew Jędrzejewski-Szmek wrote:
> >What kind of information are you after?
> 
> One interesting statistic I'd like to see changing is the time when
> the watchdog was notified last.
> 
> For example, there is Timeleft in this wdctl output [0]:
> 
>   # wdctl
>   Identity:  iTCO_wdt [version 0]
>   Timeout:   30 seconds
>   Timeleft:   2 seconds

This currently isn't exported by systemd, and there's even no log
message at debug level. I guess this could be exposed, but I don't
think it'd be very useful. If the watchdog ping works, most people
don't need to look at it. If it doesn't, the machine should reboot...

If this is just for debugging, you can do something like

  sudo strace -e ioctl -p 1

and look for WDIOC_KEEPALIVE.

Zbyszek
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] SystemCallFilter

2019-05-28 Thread Lennart Poettering
On Di, 28.05.19 11:43, Josef Moellers (jmoell...@suse.de) wrote:

> Hi,
>
> We just had an issue with a partner who tried to filter out the "open"
> system call:
>
>     SystemCallFilter=~open
>
> This may, in general, not be a very clever idea because how is one to
> load a shared library to start with, but this example has revealed
> something problematic ...
> The problem the partner had was that the filter just didn't work. No
> matter what he tried, the test program ran to completion.
>
> It took us some time to figure out what caused this:
> The test program relied on the fact that when it called open(), the
> "open" system call would be used, which it doesn't any more. It uses the
> "openat" system call instead (*).
> Now it appears that this change is deliberate and so my question is what
> to do about these cases.
> Should one
> * also filter out "openat" if only "open" is required?
> * introduce a new group "@open" which filters both?
>
> I regard "SystemCallFilter" as a security measure and if one cannot rely
> on mechanisms any more, what good is such a feature?

This is a general problem: as the kernel is developed, new system
calls are added, and very often they provide redundant mechanisms to
earlier system calls. This is the case with openat() vs. open(), and
for example very recently with pidfd_send_signal() vs. kill(). We will
always play catch-up with that unfortunately, it's an inherent problem
of the system call interface and applying blacklist filters to it, and
we are not going to solve that, I fear.

I think we should add docs about this case, underlining two things:

1. first of all: whitelist, don't blacklist! If so, all new system
   calls are blocked by default, and thus the problem goes away.

2. secondly: we should clarify that some system calls can be
   circumvented by using others, and hence people must be ready for
   that and not assume blacklists could ever be future-proof.

Regarding the syscall groupings: yes, the groups exist precisely to
improve cases like this. That said, I think we should be careful not to
have an inflation of groups, and we should ask twice whether a group
is really desirable before adding it. I'd argue in the open/openat
case the case is not strong enough though: writing a filter
blacklisting those is very difficult, as it means you cannot run
programs with dynamic libraries (as loading those requires
open/openat), which hence limits the applications very much and
restricts its use to very few, very technical cases. In those cases I
have the suspicion the writer of the filters needs to know in very
much detail what the semantics are anyway, and he hence isn't helped
too much by this group.

Note that the @file-system group already includes both, so maybe
that's a more adequate solution? (not usable for blacklisting though,
only for whitelisting, realistically).
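
To illustrate the whitelist approach, a minimal drop-in sketch (the unit name
is hypothetical and the group choice is only an example):

    # /etc/systemd/system/foo.service.d/syscall-filter.conf
    [Service]
    SystemCallFilter=@system-service
    SystemCallErrorNumber=EPERM

@system-service is one of the predefined groups and already pulls in
@file-system, i.e. open()/openat(); anything outside the whitelisted groups
then fails with EPERM instead of killing the process.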

Hence, I would argue this is a documentation issue, not a bug
really... Does that make sense?

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] SystemCallFilter

2019-05-28 Thread Josef Moellers
On 28.05.19 12:25, Martin Wilck wrote:
> On Tue, 2019-05-28 at 11:43 +0200, Josef Moellers wrote:
>> Hi,
>>
>> We just had an issue with a partner who tried to filter out the "open"
>> system call:
>>
>>     SystemCallFilter=~open
>>
>> This may, in general, not be a very clever idea because how is one to
>> load a shared library to start with, but this example has revealed
>> something problematic ...
>> The problem the partner had was that the filter just didn't work. No
>> matter what he tried, the test program ran to completion.
>> It took us some time to figure out what caused this:
>> The test program relied on the fact that when it called open(), the
>> "open" system call would be used, which it doesn't any more. It uses the
>> "openat" system call instead (*).
> 
> AFAIK, glibc hardly ever uses open(2) any more, and has been doing so
> for some time.

Yes, I found out the hard way ... testing and testing until I ran the
script through strace and found that fopen() doesn't use "open" any more :-(
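
For reference, something along these lines makes the switch visible (the
program name is just a placeholder):

    strace -f -e trace=open,openat ./testprog

glibc's fopen() then shows up as openat(AT_FDCWD, ...) rather than open(...).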

>> Now it appears that this change is deliberate and so my question is
>> what
>> to do about these cases.
>> Should one
>> * also filter out "openat" if only "open" is required?
> 
> That looks wrong to me. Some people *might* want to filter open(2)
> only, and would be even more surprised than you are now if this
> would implicitly filter out openat(2) as well.

I agree. The suggestion was only made for completeness.

>> * introduce a new group "@open" which filters both?
> 
> Fair, but then there are lots of XYat() syscalls which would need
> to be treated the same way.

Yes.

>> I regard "SystemCallFilter" as a security measure and if one cannot
>> rely
>> on mechanisms any more, what good is such a feature?
> 
> Have you seen this? https://lwn.net/Articles/738694/
> IMO this is a question related to seccomp design; "SystemCallFilter"
> is just a convenient helper for using seccomp.

It is indeed implemented that way!

Thanks,

Josef
-- 
SUSE Linux GmbH
Maxfeldstrasse 5
90409 Nuernberg
Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] How to get hardware watchdog status when using systemd

2019-05-28 Thread Wiktor Kwapisiewicz

Hi Zbyszek,

On 28.05.2019 13:43, Zbigniew Jędrzejewski-Szmek wrote:

What kind of information are you after?


One interesting statistic I'd like to see changing is the time when the 
watchdog was notified last.


For example, there is Timeleft in this wdctl output [0]:

  # wdctl
  Identity:  iTCO_wdt [version 0]
  Timeout:   30 seconds
  Timeleft:   2 seconds

  FLAG   DESCRIPTION   STATUS BOOT-STATUS
  KEEPALIVEPING  Keep alive ping reply  0   0
  MAGICCLOSE Supports magic close char  0   0
  SETTIMEOUT Set timeout (in seconds)   0   0

Kind regards,
Wiktor

[0]: https://karelzak.blogspot.com/2012/05/eject1-sulogin1-wdctl1.html

--
https://metacode.biz/@wiktor
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] How to get hardware watchdog status when using systemd

2019-05-28 Thread Zbigniew Jędrzejewski-Szmek
On Tue, May 28, 2019 at 12:59:27PM +0200, Wiktor Kwapisiewicz wrote:
> Hello,
> 
> I've enabled "RuntimeWatchdogSec=30" in /etc/systemd/system.conf
> (after reading the excellent "systemd for Administrators" series [0]).
> 
> Before enabling that, "wdctl" printed nice statistics, but now it only
> reports "watchdog already in use, terminating." I guess this is
> obvious, as systemd is using /dev/watchdog now, but is there a way to
> get more statistics about the watchdog from systemd?
> 
> Journal has only basic info that the setting is enabled:
> 
> $ journalctl | grep watchdog

journalctl --grep watchdog

> kernel: NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
> systemd[1]: Hardware watchdog 'iTCO_wdt', version 0
> systemd[1]: Set hardware watchdog to 30s.

What kind of information are you after?
It is possible to query the systemd state:
$ systemctl show |grep Watchdog
RuntimeWatchdogUSec=0
ShutdownWatchdogUSec=10min
ServiceWatchdogs=yes

... but that's essentially the same information that you got from the logs
already.

Zbyszek
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] RFC: temporarily deactivating udev rules during coldplug

2019-05-28 Thread Lennart Poettering
On Di, 28.05.19 12:04, Martin Wilck (mwi...@suse.de) wrote:

> We are facing problems during udev coldplug on certain very big systems
> (read: > 1000 CPUs, several TiB RAM). Basically, the systems get
> totally unresponsive immediately after coldplug starts, and remain so
> for minutes, causing uevents to time out. Attempts to track it down
> have shown that access to files on tmpfs (e.g. /run/udev/db) may take
> a very long time. Limiting the maximum number of udev workers helps,
> but doesn't seem to solve all problems we are seeing.
>
> Among the things we observed was lots of activity running certain udev
> rules which are executed for every device. One such example is the
> "vpdupdate" rule on Linux-PowerPC systems:
>
> https://sourceforge.net/p/linux-diag/libvpd/ci/master/tree/90-vpdupdate.rules

I am sorry, but this rule is bad, it hurts just looking at it. I don't
think we need to optimize our code for rules that are that
broad. Please work with the package in question to optimize things,
and use finer grained and less ridiculous rules... (also: what for
even? to maintain a timestamp???)

> Another one is a SUSE specific rule that is run on CPU- or memory-events
> (https://github.com/openSUSE/kdump/blob/master/70-kdump.rules.in).
> It is triggered very often on large systems that may have 1000s of
> memory devices.

This one isn't much better.

Please fix the rules to not do crazy stuff like forking off processes in
gazillions of cases...

if you insist on forking off a process for every event under the sun,
then yes, things will be slow, but what can I say, you broke it, you
get to keep the pieces...

> These are rules that are worthwhile and necessary in a fully running
> system to respond to actual hotplug events, but that need not be run
> during coldplug, in particular not 1000s of times in a very short time
> span.

Sorry, but these rules are just awful, please make them finer grained,
without running shell scripts every time. I mean, your complaint is
basically: shell scripting isn't scalable... but dah, of course it
isn't, and the fix is not to do that then...

For example, in the kdump case, just pull in a singleton service
asynchronously, via SYSTEMD_WANTS. And if you want a timestamp of the
last device, then udev keeps that anyway for you, in the USEC_INITIALIZED
property, per device.
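
A minimal sketch of what that could look like as a rule (the service name is
hypothetical, and the match would have to be narrowed to the events that
actually matter):

    SUBSYSTEM=="memory", ACTION=="add", TAG+="systemd", ENV{SYSTEMD_WANTS}+="kdump-rebuild.service"

Since repeated start requests for an already-queued unit are merged, the
service then runs once (or a handful of times) instead of once per uevent.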

> The idea is implemented with a simple shell script and two unit
> files.

Sorry, but we are not adding new shell scripts that work around awful
shell scripts to systemd. Please fix the actual problems, and work
with the maintainers of the packages causing those problems to fix
them, don't glue a workaround atop an ugly hack.

Sorry, but this is not an OK approach at all!

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

[systemd-devel] How to get hardware watchdog status when using systemd

2019-05-28 Thread Wiktor Kwapisiewicz

Hello,

I've enabled "RuntimeWatchdogSec=30" in /etc/systemd/system.conf (after 
reading excellent "systemd for Administrators" series [0]).


Before enabling that, "wdctl" printed nice statistics, but now it only
reports "watchdog already in use, terminating." I guess this is obvious, as
systemd is using /dev/watchdog now, but is there a way to get more
statistics about the watchdog from systemd?


Journal has only basic info that the setting is enabled:

$ journalctl | grep watchdog

kernel: NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
systemd[1]: Hardware watchdog 'iTCO_wdt', version 0
systemd[1]: Set hardware watchdog to 30s.

Thank you in advance!

Kind regards,
Wiktor

[0]: http://0pointer.de/blog/projects/watchdog.html

--
https://metacode.biz/@wiktor
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] SystemCallFilter

2019-05-28 Thread Martin Wilck
On Tue, 2019-05-28 at 11:43 +0200, Josef Moellers wrote:
> Hi,
> 
> We just had an issue with a partner who tried to filter out the "open"
> system call:
> 
>     SystemCallFilter=~open
> 
> This may, in general, not be a very clever idea because how is one to
> load a shared library to start with, but this example has revealed
> something problematic ...
> The problem the partner had was that the filter just didn't work. No
> matter what he tried, the test program ran to completion.
> It took us some time to figure out what caused this:
> The test program relied on the fact that when it called open(), the
> "open" system call would be used, which it doesn't any more. It uses the
> "openat" system call instead (*).

AFAIK, glibc hardly ever uses open(2) any more, and has been doing so
for some time.

> Now it appears that this change is deliberate and so my question is
> what
> to do about these cases.
> Should one
> * also filter out "openat" if only "open" is required?

That looks wrong to me. Some people *might* want to filter open(2)
only, and would be even more surprised than you are now if this
would implicitly filter out openat(2) as well.

> * introduce a new group "@open" which filters both?

Fair, but then there are lots of XYat() syscalls which would need
to be treated the same way.

> I regard "SystemCallFilter" as a security measure and if one cannot
> rely
> on mechanisms any more, what good is such a feature?

Have you seen this? https://lwn.net/Articles/738694/
IMO this is a question related to seccomp design; "SystemCallFilter"
is just a convenient helper for using seccomp.

Cheers,
Martin


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

[systemd-devel] RFC: temporarily deactivating udev rules during coldplug

2019-05-28 Thread Martin Wilck
We are facing problems during udev coldplug on certain very big systems
(read: > 1000 CPUs, several TiB RAM). Basically, the systems get
totally unresponsive immediately after coldplug starts, and remain so
for minutes, causing uevents to time out. Attempts to track it down
have shown that access to files on tmpfs (e.g. /run/udev/db) may take
a very long time. Limiting the maximum number of udev workers helps,
but doesn't seem to solve all problems we are seeing.

Among the things we observed was lots of activity running certain udev
rules which are executed for every device. One such example is the
"vpdupdate" rule on Linux-PowerPC systems:

https://sourceforge.net/p/linux-diag/libvpd/ci/master/tree/90-vpdupdate.rules

Another one is a SUSE specific rule that is run on CPU- or memory-events
(https://github.com/openSUSE/kdump/blob/master/70-kdump.rules.in).
It is triggered very often on large systems that may have 1000s of
memory devices.

These are rules that are worthwhile and necessary in a fully running
system to respond to actual hotplug events, but that need not be run
during coldplug, in particular not 1000s of times in a very short time
span. 

Therefore I'd like to propose a scheme to deactivate certain rules
during coldplug. The idea involves 2 new configuration directories:

 /etc/udev/pre-trigger.d:

   "*.rules" files in this directory are copied to /run/udev/rules.d
   before starting "udev trigger". Normally these would be 0-byte files
   with a name corresponding to an actual rule file from
   /usr/lib/udev/rules.d - by putting them to /run/udev/rules.d,
   the original rules are masked.
   After "udev settle" finishes, either successfully or not, the
   files are removed from /run/udev/rules.d again.

 /etc/udev/post-settle.d:

   "*.post" files in this directory are executed after udev settle
   finishes. The intention is to create a "cumulative action", ideally
   equivalent to having run the masked-out rules during coldplug.
   This may or may not be necessary, depending on the rules being
   masked out. For the vpdupdate rule above, this comes down to running
   "/bin/touch /run/run.vpdupdate". 

The idea is implemented with a simple shell script and two unit files.
The 2nd unit file is necessary because simply using systemd-udev-settle's
"ExecStartPost" doesn't work - unmasking must be done even if
"udev settle" fails or times out. "ExecStopPost" doesn't work either:
we don't want to run this when systemd-udev-settle.service is stopped
after having been started successfully.

See details below. Comments welcome.
Also, would this qualify for inclusion in the systemd code base?

Martin


Shell script: /usr/lib/udev/coldplug.sh

#! /bin/sh
PRE_DIR=/etc/udev/pre-trigger.d
POST_DIR=/etc/udev/post-settle.d
RULES_DIR=/run/udev/rules.d

[ -d "$PRE_DIR" ] || exit 0
[ -d "$RULES_DIR" ] || exit 0

case $1 in
mask)
    cd "$PRE_DIR"
    for fl in *.rules; do
        [ -e "$fl" ] || break
        cp "$fl" "$RULES_DIR"
    done
    ;;
unmask)
    cd "$PRE_DIR"
    for fl in *.rules; do
        [ -e "$fl" ] || break
        rm -f "$RULES_DIR/$fl"
    done
    ;;
post)
    [ -d "$POST_DIR" ] || exit 0
    cd "$POST_DIR"
    for fl in *.post; do
        [ -e "$fl" ] || break
        [ -x "$fl" ] || continue
        ./"$fl"
    done
    ;;
*)
    echo "usage: $0 [mask|unmask|post]" >&2
    exit 1
    ;;
esac



Unit file: systemd-udev-pre-coldplug.service

[Unit]
Description=Mask udev rules before coldplug
DefaultDependencies=No
Conflicts=shutdown.target
Before=systemd-udev-trigger.service
Wants=systemd-udev-post-coldplug.service
ConditionDirectoryNotEmpty=/etc/udev/pre-trigger.d
ConditionPathIsDirectory=/run/udev/rules.d
ConditionFileIsExecutable=/usr/lib/udev/coldplug.sh

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=-/usr/lib/udev/coldplug.sh mask
ExecStop=-/usr/lib/udev/coldplug.sh unmask

[Install]
WantedBy=sysinit.target



Unit file: systemd-udev-post-coldplug.service

[Unit]
Description=Reactivate udev rules after coldplug
DefaultDependencies=No
Conflicts=shutdown.target
After=systemd-udev-settle.service
ConditionDirectoryNotEmpty=/etc/udev/pre-trigger.d
ConditionPathIsDirectory=/run/udev/rules.d
ConditionFileIsExecutable=/usr/lib/udev/coldplug.sh

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=-/usr/bin/systemctl stop systemd-udev-pre-coldplug.service
ExecStart=-/usr/lib/udev/coldplug.sh post

[Install]
WantedBy=sysinit.target


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

[systemd-devel] SystemCallFilter

2019-05-28 Thread Josef Moellers
Hi,

We just had an issue with a partner who tried to filter out the "open"
system call:

    SystemCallFilter=~open

This may, in general, not be a very clever idea, because how is one to
load a shared library to start with, but this example has revealed
something problematic ...
The problem the partner had was that the filter just didn't work. No
matter what he tried, the test program ran to completion.

It took us some time to figure out what caused this:
The test program relied on the fact that when it called open(), the
"open" system call would be used, which it doesn't any more. It uses the
"openat" system call instead (*).
Now it appears that this change is deliberate and so my question is what
to do about these cases.
Should one
* also filter out "openat" if only "open" is required?
* introduce a new group "@open" which filters both?

I regard "SystemCallFilter" as a security measure and if one cannot rely
on mechanisms any more, what good is such a feature?

Josef

(*) IMHO thereby breaking The Principle Of Least Surprise.
-- 
SUSE Linux GmbH
Maxfeldstrasse 5
90409 Nuernberg
Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel