Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-12-19 Thread Chen Jie
Hi,

2011/12/17 Michal Schmidt mschm...@redhat.com:
 Chen Jie wrote:
 Let me explain it further.
 First syslogv() in dbus may block, while rsyslog.service is starting
 and meanwhile the kernel socket buffer was full.
 Attachment syslog-test.c was a program simulates the situation.

 So while rsyslog.service is starting, on the systemd side:
 1. [ExecStartPre] stops systemd-kmsg-syslogd.service
 2. It now running bus_init() to publish its API to dbus
 3. Step 2 timeout, because dbus-daemon didn't reply in time.

 Why dbus-daemon didn't response quickly? Because it blocked on
 syslogv(), which was waiting for someone consumes the message(the
 kernel socket buffer was full), but sadly the consumer -- rsyslog
 didn't be started because systemd blocked.

 So systemd waits for dbus, and dbus waits for startup of rsyslog,
 rsyslog waits for systemd to start it.

 Thanks for debugging this.
 You are right. This kind of a deadlock is possible.

 Does the deadlock go away if you just modify rsyslog.service so that
 instead of stopping systemd-kmsg-syslogd in ExecStartPre it would
 do it in ExecStartPost?
Yeah, it do the trick.


 IMHO, We need to put functions that may block in separate threads,
 for
 example bus_init(), shutdown_connection(), log_meta().

 Oh, I'm sure this works, but I'm scared of the additional
 complexity of threads.
Agree, but IMHO, from the architectural point of view, smooth running
of systemd should not depend on quick response of outside
processes(rsyslogd, dbus).



Regards,
-- Chen Jie
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-12-19 Thread Michal Schmidt
- Original Message -
  Does the deadlock go away if you just modify rsyslog.service so
  that
  instead of stopping systemd-kmsg-syslogd in ExecStartPre it would
  do it in ExecStartPost?
 Yeah, it do the trick.

Great. People who are hitting this deadlock can use this as a temprorary 
workaround.

  IMHO, We need to put functions that may block in separate threads,
  for
  example bus_init(), shutdown_connection(), log_meta().
 
  Oh, I'm sure this works, but I'm scared of the additional
  complexity of threads.
 Agree, but IMHO, from the architectural point of view, smooth running
 of systemd should not depend on quick response of outside
 processes(rsyslogd, dbus).

Yes.
systemd at least partially defends itself from a hanging syslog
by setting SO_SNDTIMEO to 5 seconds. Maybe even that is too much and we
could just make the socket completely non-blocking and just fallback to kmsg
when we get EAGAIN.

I believe there is a way to connect and register to DBus fully
asynchronously too. I'm looking into this.

Michal
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-12-19 Thread Michal Schmidt
Michal Schmidt wrote:
 systemd at least partially defends itself from a hanging syslog
 by setting SO_SNDTIMEO to 5 seconds. Maybe even that is too much and
 we could just make the socket completely non-blocking and just
 fallback to kmsg when we get EAGAIN.
 
 I believe there is a way to connect and register to DBus fully
 asynchronously too. I'm looking into this.

It's now implemented in git. Please give it a try.

Michal
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-12-19 Thread Chen Jie
Hi,

2011/12/20 Michal Schmidt mschm...@redhat.com:
 Michal Schmidt wrote:
 systemd at least partially defends itself from a hanging syslog
 by setting SO_SNDTIMEO to 5 seconds. Maybe even that is too much and
 we could just make the socket completely non-blocking and just
 fallback to kmsg when we get EAGAIN.

 I believe there is a way to connect and register to DBus fully
 asynchronously too. I'm looking into this.

 It's now implemented in git. Please give it a try.
Thank you for the work.

Just as the commit message mentioned --
dbus_connection_open_private() itself is still synchronous, I did
find some dbus_connection_open_private() failures in my debugging
before(though at a very low occurrence rate). -- that's why I chose
the thread way.

Also, the shutdown_connection() may block in theory, since it invokes
dbus_connection_flush()



Regards,
-- Chen Jie
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-12-16 Thread Michal Schmidt
Chen Jie wrote:
 Let me explain it further.
 First syslogv() in dbus may block, while rsyslog.service is starting
 and meanwhile the kernel socket buffer was full.
 Attachment syslog-test.c was a program simulates the situation.
 
 So while rsyslog.service is starting, on the systemd side:
 1. [ExecStartPre] stops systemd-kmsg-syslogd.service
 2. It now running bus_init() to publish its API to dbus
 3. Step 2 timeout, because dbus-daemon didn't reply in time.
 
 Why dbus-daemon didn't response quickly? Because it blocked on
 syslogv(), which was waiting for someone consumes the message(the
 kernel socket buffer was full), but sadly the consumer -- rsyslog
 didn't be started because systemd blocked.
 
 So systemd waits for dbus, and dbus waits for startup of rsyslog,
 rsyslog waits for systemd to start it.

Thanks for debugging this.
You are right. This kind of a deadlock is possible.

Does the deadlock go away if you just modify rsyslog.service so that
instead of stopping systemd-kmsg-syslogd in ExecStartPre it would
do it in ExecStartPost?

 IMHO, We need to put functions that may block in separate threads,
 for
 example bus_init(), shutdown_connection(), log_meta().

Oh, I'm sure this works, but I'm scared of the additional
complexity of threads. 

Michal
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-12-14 Thread Chen Jie
2011/12/2 Chen Jie ch...@lemote.com:
 Hey,
 After did many experiments, I found it was one iteration of
 manager_loop spent ~25s:
 * dbus.service: service_enter_start_post()
  - service_enter_running()
  -- service_set_state(s, SERVICE_RUNNING)
  --- unit_notify()
   bus_init() - bus_init_api()
 bus_ini_api()  failed to add_match for 'NameOwnerChanged' (reason:
 timeout, and the default timeout value in dbus lib was 25s)

 In another hand, dbus-daemon failed to reply in time because it tried
 to log a message, meanwhile rsyslog.service was starting:
 1. In ExecStartPre stage(/bin/systemctl stop
 systemd-kmsg-syslogd.service), systemctl failed to receive replies in
 time because the manager_loop of systemd was blocked!   or
 2. After ExecStartPre stage, manager_loop was blocked for 25s, and
 then do ExecStart.

 Then dbus-daemon prompted Failed to activate service
 'org.freedesktop.systemd1': timed out, and then entered a busy-loop
 in pending_activation_failed()
 http://cgit.freedesktop.org/dbus/dbus/tree/bus/activation.c?id=7dbfa45153167ee53487630f8109d32860cb6263#n1280
 (An _dbus_return_val_if_fail(reply_serial != 0, FALSE)  failure was
 observed in gdb, still don't how this happened)

Let me explain it further.
First syslogv() in dbus may block, while rsyslog.service is starting
and meanwhile the kernel socket buffer was full.
Attachment syslog-test.c was a program simulates the situation.

So while rsyslog.service is starting, on the systemd side:
1. [ExecStartPre] stops systemd-kmsg-syslogd.service
2. It now running bus_init() to publish its API to dbus
3. Step 2 timeout, because dbus-daemon didn't reply in time.

Why dbus-daemon didn't response quickly? Because it blocked on
syslogv(), which was waiting for someone consumes the message(the
kernel socket buffer was full), but sadly the consumer -- rsyslog
didn't be started because systemd blocked.

So systemd waits for dbus, and dbus waits for startup of rsyslog,
rsyslog waits for systemd to start it.

IMHO, We need to put functions that may block in separate threads, for
example bus_init(), shutdown_connection(), log_meta().

Attachment 0001-Add-new-API-manager_add_work_thread.patch and
0002-dbus.c-hooks-to-the-manager_add_work_thread-API.patch are two
patches that make bus_init run in thread and hence not block the
manager_loop.



Regards,
-- Chen Jie
#include stdio.h
#include unistd.h
#include syslog.h
#include sys/time.h
#include stdint.h

int main(void)
{
	int c, i;
	struct timeval tv1, tv2;

	openlog(slog, LOG_PID | LOG_PERROR, LOG_DAEMON);

	printf(Now edit /lib/systemd/system/rsyslog.service.\n
	   Add a line \ExecStartPre=/bin/sleep 30\ just after the original ExecStartPre\n
	   Run \systemctl daemon-reload  systemctl restart rsyslog.service\\n
	   Then press enter to continue? See how many time it may spend on sending a log\n);
	c = getc(stdin);

	printf(Starting the test...\n);
	for (i = 0; i  20; i++) {
		gettimeofday(tv1, NULL);
		syslog(LOG_INFO, iteration %d, i);
		gettimeofday(tv2, NULL);

		if (tv2.tv_sec - tv1.tv_sec  2) {
			printf(In %d iteration: syslog() spend more than 2 secs (%lu,%lu) - (%lu,%lu)\n,
			   i, tv1.tv_sec, tv1.tv_usec, tv2.tv_sec, tv2.tv_usec);
			break;
		}
	}
	printf(Test finished.\n);

	closelog();

	return 0;
}
From 78a56b263844b1db7fb29a7f6c0c4a4b0d07ad80 Mon Sep 17 00:00:00 2001
From: cee1 fykc...@gmail.com
Date: Wed, 14 Dec 2011 14:25:39 +0800
Subject: [PATCH 1/2] Add new API: manager_add_work_thread()

The work thread was used to finish work in a seperate thread.
It can be specified a notify function, which will be called in manager_loop,
When process watch of type WATCH_WORK_THREAD.
---
 src/manager.c |  197 -
 src/manager.h |   22 ++-
 2 files changed, 217 insertions(+), 2 deletions(-)

diff --git a/src/manager.c b/src/manager.c
index 67a..387e130 100644
--- a/src/manager.c
+++ b/src/manager.c
@@ -36,6 +36,7 @@
 #include sys/types.h
 #include sys/stat.h
 #include dirent.h
+#include pthread.h
 
 #ifdef HAVE_AUDIT
 #include libaudit.h
@@ -71,6 +72,8 @@
 #define NOTIFY_SOCKET_SYSTEM /run/systemd/notify
 #define NOTIFY_SOCKET_USER @/org/freedesktop/systemd1/notify
 
+static void wait_all_work_threads_done(Manager *m);
+
 static int manager_setup_notify(Manager *m) {
 union {
 struct sockaddr sa;
@@ -217,6 +220,27 @@ static int manager_setup_signals(Manager *m) {
 return 0;
 }
 
+static int manager_setup_work_threads(Manager *m) {
+int p[2] = { -1, -1 };
+struct epoll_event ev;
+
+if (pipe2(p, O_NONBLOCK | O_CLOEXEC)  0)
+return -errno;
+
+m-work_thread_watch.type = WATCH_WORK_THREAD;
+m-work_thread_watch.fd = p[0];
+m-work_thread_notify_fd = p[1];
+
+zero(ev);
+ev.events = EPOLLIN;
+ev.data.ptr = m-work_thread_watch;
+
+if (epoll_ctl(m-epoll_fd, EPOLL_CTL_ADD, m-work_thread_watch.fd, ev)  0)
+  

Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-12-01 Thread tang ke
nobody reply?
于 2011年11月30日 17:49, Chen Jie 写道:
 Hi all,

 I found my system boot stops at terminal login screen sometimes under
 systemd v29 + dbus 1.4.16. I tried to login, after supplied username
 and password, it blocked with a blinking cursor.

 A primary debug shows:
 * dbus-daemon was doing a busy-loop in pending_activation_failed()
 (see http://lists.freedesktop.org/archives/dbus/2011-November/014790.html)
 * There was a message Failed to activate service
 'org.freedesktop.systemd1': timed out in syslog (see the full syslog
 at 
 http://lists.freedesktop.org/archives/dbus/attachments/2025/e9c204bb/attachment-0001.obj)

 After I added more log funcs with gettimeofday generated timestamp
 (tv_sec, tv_usec), I found:
 * bus_init_api() calling dbus_bus_add_match failed with timeout -- the
 function starts at (1322641754,371276), ends at  (1322641779,396943)
 * In dbus side, I found it spends ~25s to send a log to syslog:
 timestamp: (1322641754,372197)
 bus_context_log (activation-context,
DBUS_SYSTEM_LOG_INFO, Activating
 systemd to hand-off: service name='%s' unit='%s',
service_name,
entry-systemd_service);//
 bus_activation_activate_service() @bus/activation.c
 timestamp: (1322641780,555)

 I guess there was something wrong with startup of rsyslog.service, and I 
 found:
 * (1322641752,26061): [rsyslog.service]service_enter_start_pre()
 * (1322641753,382491): systemd-kmsg-syslogd[705]: systemd-kmsg-syslogd
 stopped as pid 705
 * (1322641779,692927): [rsyslog.service]service_enter_start()

 Start rsyslog.service will stop systemd-kmsg-syslogd.service first, so
 I added more log funcs in systemctl:
 * systemctl[1216]: start_unit_one(): (1322641753,229083)-(1322641753,383118)
 * systemctl[1216]: wait_filter() (1322641753,465392)Got D-Bus request:
 org.freedesktop.systemd1.Manager.JobRemoved() on
 /org/freedesktop/systemd1
 * systemctl[1216]: wait_for_jobs(): (1322641753,468987) finished
 while (!set_isempty(s) dbus_connection_read_write_dispatch(bus,
 -1))
 systemctl stop systemd-kmsg-syslogd.service didn't much time.


 The content of rsyslog.service:
 [Unit]
 Description=System Logging Service

 [Service]
 ExecStartPre=/bin/systemctl stop systemd-kmsg-syslogd.service
 ExecStart=/usr/sbin/rsyslogd -n -c5
 Sockets=syslog.socket

 [Install]
 WantedBy=multi-user.target


 Any idea?

 BTW, the gap between systemd-kmsg-syslogd.service stopped and rsyslog
 not ready will make it lost some log messages.



 Regards,
 --- Chen Jie
 ___
 dbus mailing list
 d...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dbus



-- 
Best Regards


Tang Ke (Application develop of software department)
Tel:0086-512-52308628 Fax:0086-512-52308688
Phone:18962393077
E-mail:ta...@lemote.com  MSN:ta...@lemote.com, mumut...@gmail.com
Web: http://www.lemote.com http://www.lemote.com/
JiangSu ZhongKe Lemote Technology Co.,Ltd
MengLan Industry Park,YuShan,ChangShu City,JiangSu,China

attachment: tangk.vcf___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-12-01 Thread Chen Jie
Hi,

Thanks for the reply.

2011/11/30 Michal Schmidt mschm...@redhat.com:
 On 11/30/2011 10:49 AM, Chen Jie wrote:

 (see the full syslog at

 http://lists.freedesktop.org/archives/dbus/attachments/2025/e9c204bb/attachment-0001.obj)


 The kernel modules failing to load indicate a problem with your kernel
 installation. Please sort this problem out first before attempting to debug
 anything else.
I didn't install the modules of matched version for convenient. I've
built-in basic modules, and this problem can be easily reproduced on a
sanity kernel (just do a repeat reboot).

I added more logs, and found two paths may cause the
bus_context_log() to spend ~25s -- then Failed to add_match for
'NameOwnerChanged' in bus_init_api():
1. ExecStartPre=/bin/systemctl stop systemd-kmsg-syslogd.service
spent too much time in need_daemon_reload(), it encountered timeout at
http://cgit.freedesktop.org/systemd/tree/src/systemctl.c?id=v29#n1080

2. In http://cgit.freedesktop.org/systemd/tree/src/manager.c?id=v29#n2051,
a previous UNIT_VTABLE(u)-sigchld_event spent too much time
((1322740559, 397870)-(1322740585,528000))



 BTW, the gap between systemd-kmsg-syslogd.service stopped and rsyslog
 not ready will make it lost some log messages.


 No, the socket remains open all the time and the messages will be buffered.
Got it, I begin to remember the socket-activation mechanism. But I do
lost some messages in systemctl(ExecStartPre of rsyslog.service), and
after I adding a log_set_target(LOG_TARGET_KMSG), the messages appear
in syslog. Maybe log_parse_environment() set log target to 'console'?



Regards,
-- Chen Jie
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


[systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-11-30 Thread Chen Jie
Hi all,

I found my system boot stops at terminal login screen sometimes under
systemd v29 + dbus 1.4.16. I tried to login, after supplied username
and password, it blocked with a blinking cursor.

A primary debug shows:
* dbus-daemon was doing a busy-loop in pending_activation_failed()
(see http://lists.freedesktop.org/archives/dbus/2011-November/014790.html)
* There was a message Failed to activate service
'org.freedesktop.systemd1': timed out in syslog (see the full syslog
at 
http://lists.freedesktop.org/archives/dbus/attachments/2025/e9c204bb/attachment-0001.obj)

After I added more log funcs with gettimeofday generated timestamp
(tv_sec, tv_usec), I found:
* bus_init_api() calling dbus_bus_add_match failed with timeout -- the
function starts at (1322641754,371276), ends at  (1322641779,396943)
* In dbus side, I found it spends ~25s to send a log to syslog:
timestamp: (1322641754,372197)
bus_context_log (activation-context,
   DBUS_SYSTEM_LOG_INFO, Activating
systemd to hand-off: service name='%s' unit='%s',
   service_name,
   entry-systemd_service);//
bus_activation_activate_service() @bus/activation.c
timestamp: (1322641780,555)

I guess there was something wrong with startup of rsyslog.service, and I found:
* (1322641752,26061): [rsyslog.service]service_enter_start_pre()
* (1322641753,382491): systemd-kmsg-syslogd[705]: systemd-kmsg-syslogd
stopped as pid 705
* (1322641779,692927): [rsyslog.service]service_enter_start()

Start rsyslog.service will stop systemd-kmsg-syslogd.service first, so
I added more log funcs in systemctl:
* systemctl[1216]: start_unit_one(): (1322641753,229083)-(1322641753,383118)
* systemctl[1216]: wait_filter() (1322641753,465392)Got D-Bus request:
org.freedesktop.systemd1.Manager.JobRemoved() on
/org/freedesktop/systemd1
* systemctl[1216]: wait_for_jobs(): (1322641753,468987) finished
while (!set_isempty(s) dbus_connection_read_write_dispatch(bus,
-1))
systemctl stop systemd-kmsg-syslogd.service didn't much time.


The content of rsyslog.service:
[Unit]
Description=System Logging Service

[Service]
ExecStartPre=/bin/systemctl stop systemd-kmsg-syslogd.service
ExecStart=/usr/sbin/rsyslogd -n -c5
Sockets=syslog.socket

[Install]
WantedBy=multi-user.target


Any idea?

BTW, the gap between systemd-kmsg-syslogd.service stopped and rsyslog
not ready will make it lost some log messages.



Regards,
--- Chen Jie
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd+dbus: system boot stops at terminal login screen sometimes

2011-11-30 Thread Michal Schmidt

On 11/30/2011 10:49 AM, Chen Jie wrote:

(see the full syslog at
http://lists.freedesktop.org/archives/dbus/attachments/2025/e9c204bb/attachment-0001.obj)


The kernel modules failing to load indicate a problem with your kernel 
installation. Please sort this problem out first before attempting to 
debug anything else.



BTW, the gap between systemd-kmsg-syslogd.service stopped and rsyslog
not ready will make it lost some log messages.


No, the socket remains open all the time and the messages will be buffered.

Michal
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel