[systemd-devel] [PATCH] systemd-bootchart: Prevent closing random file descriptors

2015-03-28 Thread Alexander Sverdlin

If the kernel has no CONFIG_SCHED_DEBUG option set, systemd-bootchart produces
empty .svg file. The reason for this is very fragile file descriptor logic in
log_sample() and main() (/* do some cleanup, close fd's */ block). There are
many places where file descriptors are closed on failure (missing SCHED_DEBUG
provokes it), but there are several problems with it:
 - following iterations in the loop see that the descriptor is non zero and do
   not open the corresponding file again;
 - "some cleanup" code closes already closed files and the descriptors are 
reused
   already, in particular for resulting .svg file;
 - static "vmstat" and "schedstat" variables in log_sample() made the situation
   even worse.

These are the strace fragments:

[...]
close(7)= -1 EBADF (Bad file descriptor)
close(-1)   = -1 EBADF (Bad file descriptor)
pread(7, 0xbea60a2c, 4095, 0)   = -1 EBADF (Bad file descriptor)
close(7)= -1 EBADF (Bad file descriptor)
close(-1)   = -1 EBADF (Bad file descriptor)
pread(7, 0xbea60a2c, 4095, 0)   = -1 EBADF (Bad file descriptor)
close(7)= -1 EBADF (Bad file descriptor)
close(-1)   = -1 EBADF (Bad file descriptor)
getdents64(4, /* 0 entries */, 32768)   = 0
clock_gettime(CLOCK_MONOTONIC, {24, 783843501}) = 0
nanosleep({0, 5221792}, NULL)   = 0
clock_gettime(CLOCK_MONOTONIC, {24, 789726835}) = 0
lseek(4, 0, SEEK_SET)   = 0
pread(5, "nr_free_pages 52309\nnr_alloc_bat"..., 4095, 0) = 685
pread(6, "version 15\ntimestamp 4294939775\n"..., 4095, 0) = 86
getdents64(4, /* 99 entries */, 32768)  = 2680
pread(7, 0xbea60a2c, 4095, 0)   = -1 EBADF (Bad file descriptor)
close(7)= -1 EBADF (Bad file descriptor)
close(-1)   = -1 EBADF (Bad file descriptor)
pread(8, 0xbea60a2c, 4095, 0)   = -1 EBADF (Bad file descriptor)
close(8)= -1 EBADF (Bad file descriptor)
close(-1)   = -1 EBADF (Bad file descriptor)
pread(9, 0xbea60a2c, 4095, 0)   = -1 EBADF (Bad file descriptor)
close(9)= -1 EBADF (Bad file descriptor)
[...]

where it obviously tries to close same and reused decriptors many times, also
passing return code "-1" instead of descriptor...

[...]
close(7)= -1 EBADF (Bad file descriptor)
close(-1)   = -1 EBADF (Bad file descriptor)
pipe2([7, 8], O_CLOEXEC)= 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0xb6fd0068) = 192
close(8)= 0
fcntl64(7, F_SETFD, 0)  = 0
fstat64(7, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xb6fd2000
read(7, "[0.074507] calling  vfp_init"..., 4096) = 4096
[...]
read(7, "s)\n[6.228910] UBIFS: reserve"..., 4096) = 4096
read(7, "trary Executable File Formats Fi"..., 4096) = 1616
read(7, "", 4096)   = 0
close(7)= 0
wait4(192, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 192
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=192, si_uid=0, 
si_status=0, si_utime=9, si_stime=9} ---
munmap(0xb6fd2000, 4096)= 0
fstat64(7, 0xbea62850)  = -1 EBADF (Bad file descriptor)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xb6fce000
write(7, "http://www.w3.org/Graphics/SVG/";..., 52) = -1 EBADF (Bad file 
descriptor)
[...]

here .svg file had descriptor 7, but was closed by log_sample(), then even 
reused
to read /proc entry and finally systemd-bootchart even tries to write .svg 
file, but
descriptor is long time lost already.

Fix it by zeroing all the closed descriptors immediately, this would repair
existing caching of open files and clean-up strategy.

The fix is important even with CONFIG_SCHED_DEBUG option enabled, because very
first failure to open /proc//* because process exited will result in some
other victim descriptor being closed later and will therefore disturb the whole
collected statistics.
---
 src/bootchart/store.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/bootchart/store.c b/src/bootchart/store.c
index dfa681f..8525f62 100644
--- a/src/bootchart/store.c
+++ b/src/bootchart/store.c
@@ -152,6 +152,7 @@ void log_sample(int sample, struct list_sample_data **ptr) {
 n = pread(vmstat, buf, sizeof(buf) - 1, 0);
 if (n <= 0) {
 close(vmstat);
+vmstat = 0;
 return;
 }
 buf[n] = '\0';
@@ -184,6 +185,7 @@ vmstat_next:
 n = pread(schedstat, buf, sizeof(buf) - 1, 0);
 if (n <= 0) {
   

Re: [systemd-devel] [systemd-commits] Makefile.am src/shared src/timedate

2015-03-28 Thread Matthias Urlichs
Hi,

Kay Sievers:
> If the system's time zone changes, or the time is adjusted manually,
> we just re-arm all timers, and all should be fine.
> 
> I see no need or use to support explicit triggers on the event of DST
> changes, the system or calendar time support just does not need them.
> 
So, in four hours (as I write this) there will be no 02:* hour for my 02:30
logfile cleanup job to run in. Does my system run it anyway? How do I
change that? (I can see use cases for both options.)

Same thing in the fall.

An explicit trigger can do any number of things to help me prevent problems.
What's even more important IMHO is the presence of a nice "#OnDSTChange="
remark in the spec template, because it teaches admins to actually *think*
about what will happen when DST changes, instead of simply ignoring the
problem and hoping that nothing 'interesting' will happen.

DST changes more often than physically moving a machine to another
timezone, for the vast majority of computers out there. Thus, frankly, it
seems strange to advocate for support of the latter but not the former use
case.

-- 
-- Matthias Urlichs
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH 2/2] macro: allow assert_se() assertions to also be optimized out when NDEBUG is set

2015-03-28 Thread Djalal Harouni
On Fri, Mar 27, 2015 at 04:16:55PM +0100, Tom Gundersen wrote:
> On Fri, Mar 27, 2015 at 2:04 PM, Djalal Harouni  wrote:
> > Hi Shawn,
> >
> > On Thu, Mar 26, 2015 at 11:21:54PM -0700, Shawn Landden wrote:
> >> On Thu, Mar 26, 2015 at 5:47 PM, Djalal Harouni  wrote:
> >> > On Fri, Mar 27, 2015 at 12:30:53AM +0100, Tom Gundersen wrote:
> >> >> On Thu, Mar 26, 2015 at 9:19 AM, Lennart Poettering
> >> >>  wrote:
> >> >> > On Tue, 24.03.15 11:11, Shawn Landden (sh...@churchofgit.com) wrote:
> >> >> >
> >> >> >> Will result in slightly smaller binaries, and cuts out the branch, 
> >> >> >> even if
> >> >> >> the expression is still executed.
> >> >> >
> >> >> > I am sorry, but the whole point of assert_se() is that it has side
> >> >> > effects. That's why it is called that way (the "se" suffix stands for
> >> >> > "side effects"), and is different from assert(). You are supposed to
> >> >> > use it whenever the code enclosed in it shall not be suppressed if
> >> >> > NDEBUG is defined. This patch of yours breaks that whole logic!
> >> >>
> >> >> Hm, am I reading the patch wrong? It looks good to me. It is simply
> >> >> dropping the branching and logging in the NDEBUG case, but the
> >> >> expression itself is still evaluated.
> >> > Yep but it seems that abort() will never be called,
> >> > log_assert_failed() => abort()
> >> >
> >> > And the logging mechanism is also one of those side effects. IMO unless
> >> > there are real valid reasons for any change to these macors... changing
> >> > anything here will just bring more bugs that we may never notice.
> >> >
> >> Those are the side effects of assert(), the side effects of the
> >> expression are still evaluated.
> > Yes but there are also the following points:
> > * assert() is simple, assert_se() is more complex and it may modify other
> > global sate. I don't think that we can break down side effects of
> > assert_se() into:
> >  1) side effects of assert()
> >  2) side effects of other expressions.
> >
> > And then remove that part 1)
> >
> >
> > So given the fact that asser_se() do not depend on NDEBUG, did you
> > consider the following:
> >
> > * You don't have control over what the expression do, it may just call
> >   abort() or exit() in case of fatal errors, so even if you remove the
> >   explicit abort() call, expressions may just call it.
> 
> If expr() aborts or exits, then it really does not matter whether we
> abort() explicitly or not, so I don't see what this has to do with
> this patch. Sure, even with NDEBUG the program may abort if that is
> called explicitly...
Hmm If expr() aborts or do some other complex things on its own then it
should not be used with assert(), right ? that comment was in regard of
this.

Since it aborts this is a side effect, and perhaps followup instructions
depend on the state if it is correct or not otherwise things will break
later. If you use assert() here and if NDEBUG is set, assert will just be
a nop and everything should continue to work.

The patch seems simple but it is introducing *new* semantics for
assert_se() if NDEBUG is set that current callers do not expect.


> > * Some expressions were perhaps depending on the logging mechanism which
> >   is part of the side effect. Callers to assert_se() are perhaps using
> >   it for a reason, are you sure that you are not introducing a
> >   regression here ?! it will be difficult to debug the error if we don't
> >   have logs.
> 
> As far as I can tell all the relevant logging functions are pure. They
> should not interact with global state, but maybe I'm missing
> something?
Hmm, we are opening, closing and logging to file descriptors and virtual
logging devices...

And callers expect to find a log message why the function failed.

> > * Current expression may modify/interact with a global state which may
> >   cause a fatal error, and if the caller wants to know if that failed,
> >   then abort(), your patch just introduced a regression, without the
> >   explicit abort(), all callers now have to call abort().
> 
> Yeah, this is the point I think. I still think the patch makes sense
> though, it really don't see why there should be a distinction between
> assert_se() and assert() when it comes to logging and aborting.
Yes, perhaps if this was the case from the start! but currently
assert_se() depends on the aborting mechanism which is not the case of
assert().


> If assert_se() fails it indicates we may have messed up the global
> state, and if assert() fails it indicates that the global state may be
> messed up (by someone else). Either way the global state is
> potentially messed up and not logging/aborting seems as (un)safe in
> both situations.
Indeed, yes.


> Another way of seeing it, intuitively I don't see why we should
> distinguish between
> 
> assert_se(foo() >= 0);
> 
> and
> 
> r = foo();
> assert(r >= 0); => nop
Yes sure, but since assert_se() did not interact with NDEBUG, to do this
you have to modify _all_ callers, the second assert() w

[systemd-devel] systemd "looping too fast" after an automatic attempt to restart anacron

2015-03-28 Thread Alberto Bertogli

Hi!

I noticed by chance that systemd was using ~15% of a CPU in my laptop,
according to top.

This is on Debian testing, systemd 215-12.

The machine has been up since January, but this started to happen
earlier today, while I was asleep, and it seems to be correlated with an
automatic restart of anacron.

The 3.18.3 kernel was built from vanilla sources and has never had
problems with systemd so far.

I asked on #systemd and they suggested me to post here, following up
with the information requested at
http://lists.freedesktop.org/archives/systemd-devel/2015-February/028541.html
(which seems to be a similar problem, if not the same).

So below I paste some general information that I sent on IRC, the gdb
output as requested in the email from February, and some information about the
suspicious anacron restart.


I hope this helps! I won't restart systemd for a day or two (unless it
gets worse, I can spare 15% of a CPU), so if there's any other
information you need, please let me know.

Thanks!
Alberto


- 8< - 8< - 8< - 8< - 8< -

top reports systemd (pid/tid 1) consuming ~15% of a CPU all the time.

vmstat shows high CPU usage with almost nothing running (not even a
browser) and a lot of context switches:

procs ---memory-- ---swap-- -io -system-- --cpu-
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy id wa st
 1  0  0 1775176 158244 126783200 0 0  341 45293  5  3 92  
0  0
 0  0  0 1775176 158244 126783200 0 0  352 50775  5  3 93  
0  0
 2  0  0 1775920 158244 126783200 080  341 16583  4  2 93  
1  0
 0  0  0 1774796 158244 126783200 0 0  330 32687  4  2 94  
0  0
 0  0  0 1774812 158244 126783600 0 0  337 50805  5  3 92  
0  0


$ tail /var/log/syslog
Mar 28 09:51:14 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:15 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:16 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:17 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:19 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:20 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:21 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:22 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:24 hostname systemd[1]: Looping too fast. Throttling execution a 
little.
Mar 28 09:51:25 hostname systemd[1]: Looping too fast. Throttling execution a 
little.


An strace shows it quite busy too, doing the same loop over an over. Here's
a snippet:

1 09:46:32 timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={0, 0}, 
it_value={5864380, 69834}}, NULL) = 0
1 09:46:32 epoll_wait(4, {{EPOLLOUT, {u32=2552835776, 
u64=139752198713024}}}, 41, 0) = 1
1 09:46:32 clock_gettime(CLOCK_BOOTTIME, {5864326, 657137453}) = 0
1 09:46:32 recvmsg(17, 0x7fff5f588a80, 
MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily 
unavailable)
1 09:46:32 timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={0, 0}, 
it_value={5856161, 19834}}, NULL) = 0
1 09:46:32 epoll_wait(4, {{EPOLLOUT, {u32=2552835776, 
u64=139752198713024}}, {EPOLLIN, {u32=3, u64=3}}}, 41, 0) = 2
1 09:46:32 clock_gettime(CLOCK_BOOTTIME, {5864326, 657236489}) = 0
1 09:46:32 read(12, "\1\0\0\0\0\0\0\0", 8) = 8
1 09:46:32 recvmsg(17, 0x7fff5f588a80, 
MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily 
unavailable)
1 09:46:32 timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={0, 0}, 
it_value={5864380, 69834}}, NULL) = 0
1 09:46:32 epoll_wait(4, {{EPOLLOUT, {u32=2552835776, 
u64=139752198713024}}}, 41, 0) = 1
1 09:46:32 clock_gettime(CLOCK_BOOTTIME, {5864326, 657356895}) = 0
1 09:46:32 recvmsg(17, 0x7fff5f588a80, 
MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily 
unavailable)
1 09:46:32 timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={0, 0}, 
it_value={5856161, 19834}}, NULL) = 0
1 09:46:32 epoll_wait(4, {{EPOLLOUT, {u32=2552835776, 
u64=139752198713024}}, {EPOLLIN, {u32=3, u64=3}}}, 41, 0) = 2
1 09:46:32 clock_gettime(CLOCK_BOOTTIME, {5864326, 657456514}) = 0
1 09:46:32 read(12, "\1\0\0\0\0\0\0\0", 8) = 8
1 09:46:32 recvmsg(17, 0x7fff5f588a80, 
MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily 
unavailable)
1 09:46:32 timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={0, 0}, 
it_value={5864380, 69834}}, NULL) = 0
1 09:46:32 epoll_wait(4, {{EPOLLOUT, {u32=2552835776, 
u64=139752198713024}}}, 41, 0) = 1
1 09:46:32 clock_gettime(CLOCK_BOOTTIME, {5864326, 657575831}) = 0
1 09:46:32 recvmsg(17, 0x7fff5f588a80,