Re: [Vserver] How to use the sched_pause flag (Host hangs)
On Tuesday 28 March 2006 15:25, Herbert Poetzl wrote:
> On Tue, Mar 28, 2006 at 02:48:11PM +0200, Andreas Baetz wrote:
> > # chcontext --xid 400 --flag 0x0400 -- ps ax
> > hangs forever.
> > vps on the host shows state H for the PID, and the PID cannot be
> > killed.
> >
> > # chcontext --xid 400 --flag 0x0100 -- ps ax
> > does not unset the sched_pause flag and hangs too.
> >
> > After a few more tries, the host hangs completely and has to be
> > rebooted; nothing works except the kernel magic SysRq-b.
>
> If that _really_ happens, you should check for a kernel trace, as this
> would mean that the kernel had some issues somewhere (well, it's an
> older kernel)

I compiled a new kernel from kernel.org, patched only with
patch-2.6.16-vs2.0.2-rc14.diff.

host:~# vserver-info
Versions:
    Kernel: 2.6.16.1.060328
    VS-API: 0x00020001
    util-vserver: 0.30.209; Jan 8 2006, 12:24:41

Features:
    CC: gcc, gcc (GCC) 4.0.3 20051201 (prerelease) (Debian 4.0.2-5)
    CXX: g++, g++ (GCC) 4.0.3 20051201 (prerelease) (Debian 4.0.2-5)
    CPPFLAGS: ''
    CFLAGS: '-Wall -g -O2 -std=c99 -Wall -pedantic -W -funit-at-a-time'
    CXXFLAGS: '-g -O2 -ansi -Wall -pedantic -W -fmessage-length=0 -funit-at-a-time'
    build/host: i486-pc-linux-gnu/i486-pc-linux-gnu
    Use dietlibc: yes
    Build C++ programs: yes
    Build C99 programs: yes
    Available APIs: compat,v11,v13,fscompat,net,oldproc,olduts
    ext2fs Source: e2fsprogs
    syscall(2) invocation: alternative
    vserver(2) syscall#: 273/glibc

Paths:
    prefix: /usr
    sysconf-Directory: /etc
    cfg-Directory: /etc/vservers
    initrd-Directory: $(sysconfdir)/init.d
    pkgstate-Directory: /var/run/vservers
    vserver-Rootdir: /var/lib/vservers

I tried

# chcontext --xid 4004 --flag 0x400 -- ps ax

It immediately hung the host. While it hangs, the host cannot be pinged
from outside, the mouse doesn't work, and Ctrl-Alt-F1 etc. don't work.

Some log entries:

Mar 29 12:11:56 host kernel: SysRq : SAK
Mar 29 12:11:56 host kernel: SAK: killed process 3525 (Xorg): p->signal->session==tty->session
Mar 29 12:12:04 host kernel: SysRq : SAK
Mar 29 12:12:04 host kernel: SAK: killed process 3525 (Xorg): p->signal->session==tty->session
Mar 29 12:12:13 host kernel: SysRq : Emergency Sync
Mar 29 12:12:13 host kernel: Emergency Sync complete
Mar 29 12:12:16 host kernel: SysRq : Emergency Remount R/O
Mar 29 12:12:30 host kernel: SysRq : Terminate All Tasks
Mar 29 12:12:30 host ntpd[3224]: ntpd exiting on signal 15
Mar 29 12:12:30 host syslog-ng[2742]: SIGTERM received, terminating;
- rebooted via SysRq-b -

It seems the problem (host hang) only occurs when running ps in a paused
context. When using

# chcontext --xid 4004 --flag 0x400 -- bash

the context pauses and can be unpaused with

# vattribute --set --xid 4004 --flag ~0x400

BTW: Thanks for the tip with vattribute!

Andreas

___
Vserver mailing list
Vserver@list.linux-vserver.org
http://list.linux-vserver.org/mailman/listinfo/vserver
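The flag handling described above can be read as plain bit arithmetic. The following is a minimal sketch of that arithmetic only; it does not model how the kernel or util-vserver process flag words internally. The name SCHED_PAUSE is taken from the subject line, and the helper names are hypothetical.

```python
# Bit value used with --flag 0x0400 in this thread (the sched_pause flag).
SCHED_PAUSE = 0x0400


def set_flag(flagword, bit):
    """Return flagword with the given bit switched on."""
    return flagword | bit


def clear_flag(flagword, bit):
    """Return flagword with the given bit switched off (AND with complement)."""
    return flagword & ~bit


paused = set_flag(0, SCHED_PAUSE)

# Setting a different bit (0x0100) leaves the pause bit untouched.
still_paused = set_flag(paused, 0x0100)

# Clearing needs the complement mask, which is what "~0x400" expresses.
unpaused = clear_flag(still_paused, SCHED_PAUSE)

print(hex(paused), hex(still_paused), hex(unpaused))
```

Seen this way, passing `--flag 0x0100` would not be expected to clear bit 0x0400, whereas a mask written as `~0x400` does.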
Re: [Vserver] Host guest
On Monday 03 April 2006 11:58, Albert Shih wrote:
> Hi all,
>
> I have a strange problem with my guest.
>
> I've configured a vserver (the guest) with FC4 (the host runs FC4 too).
>
> On the guest (after "vserver name enter") everything works.
> On the host everything works too.
>
> But if I make an ssh connection to the IP address of the guest, I'm
> logged into the host.
>
> What's wrong with my install?
>
> Many thanks.

Did you uncomment ListenAddress in your host's sshd config file and
enter the correct address? This prevents the host's sshd from binding to
all addresses, including the vserver ones. You have to restart the
host's sshd after that change.

Andreas Baetz

> --
> Albert SHIH
> Universite de Paris 7 (Denis DIDEROT)
> U.F.R. de Mathematiques.
> 7 ième étage, plateau D, bureau 10
> Heure local/Local time:
> Mon Apr 3 11:56:30 CEST 2006
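As a rough way to check the ListenAddress advice, the sketch below lists the active (uncommented) ListenAddress entries in a config text. It is a simplified reader, not a full sshd_config parser: it ignores Match blocks and the optional `keyword=value` form. The function name, the sample texts, and the address 192.0.2.10 are illustrative assumptions, not values from this thread.

```python
def listen_addresses(config_text):
    """Return the arguments of all active (uncommented) ListenAddress lines."""
    found = []
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments such as "#ListenAddress 0.0.0.0".
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        # sshd_config keywords are case-insensitive.
        if len(parts) == 2 and parts[0].lower() == "listenaddress":
            found.append(parts[1].strip())
    return found


# Hypothetical host config with the directive still commented out:
# sshd then listens on all local addresses, including the guest's.
SAMPLE_DEFAULT = """\
Port 22
#ListenAddress 0.0.0.0
"""

# Hypothetical host config restricted to the host's own address.
SAMPLE_FIXED = """\
Port 22
ListenAddress 192.0.2.10
"""

print(listen_addresses(SAMPLE_DEFAULT))
print(listen_addresses(SAMPLE_FIXED))
```

An empty result for the host's config would be consistent with the symptom described: the host's sshd answers on the guest's IP address.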
[Vserver] kernel bug
Because of the CPU scheduling problems I have with 2.6.18.2 and vserver
vs2.0.2.2-rc6, I tried 2.6.18.3 and vs2.0.2.2-rc7.

The patch applied cleanly to a vanilla kernel, and the kernel compiled
ok. After booting I tried "vserver deb4 start"; that failed and I got:

kernel BUG at kernel/vserver/network.c:147!
host kernel: invalid opcode: [#2]
host kernel: PREEMPT
host kernel: CPU:0
host kernel: EIP is at unhash_nx_info+0x6e/0x90
host kernel: eax: 0100 ebx: f6c7eee0 ecx: 0001 edx: e8326000
host kernel: esi: e824ba90 edi: 0010 ebp: c17efa90 esp: e8327f64
host kernel: ds: 007b es: 007b ss: 0068
host kernel: Process chbind (pid: 4168[#0], ti=e8326000 task=e824ba90 task.ti=e8326000)
host kernel: Stack: f6c7eee0 0004 e8327f9c c012296e f6c7eee0 ff00 e8f1da40
host kernel: e8327f9c e824bb48 e824bb80 ff00 e8f1da40 e824bb48 e8327f9c e8327f9c
host kernel: 00ff 0401a8c0 e8326000 c0122a4d c01031e1 00ff 0804c81b
host kernel: Call Trace:
host kernel: Code: 04 c7 03 00 01 10 00 c7 43 04 00 02 20 00 b8 01 00 00 00 e8 e5 ed fd ff 89 e0 25 00 e0 ff ff 8b 40 08 a8 08 75 0f 83 c4 08 5b c3 0f 0b 93 00 1f 5a 43 c0 eb b8 83 c4 08 5b e9 5f 54 2c 00 eb 0d
host kernel: EIP: [c013aebe] unhash_nx_info+0x6e/0x90 SS:ESP 0068:e8327f64

I'm back to 2.6.18.2 now.

Andreas

**
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they are
addressed. If you have received this email in error please notify the
system manager.
**
Re: [Vserver] kernel bug
On Thursday 23 November 2006 18:49, Herbert Poetzl wrote:
> On Thu, Nov 23, 2006 at 02:43:13AM +0100, Herbert Poetzl wrote:
> > On Wed, Nov 22, 2006 at 01:28:42PM +0100, Andreas Baetz wrote:
> > > Because of the CPU scheduling problems I have with 2.6.18.2 and
> > > vserver vs2.0.2.2-rc6, I tried 2.6.18.3 and vs2.0.2.2-rc7.
> > >
> > > The patch applied cleanly to a vanilla kernel, and the kernel
> > > compiled ok. After booting I tried "vserver deb4 start"; that
> > > failed and I got:
> > >
> > > kernel BUG at kernel/vserver/network.c:147!
> > > host kernel: invalid opcode: [#2]
> > > host kernel: PREEMPT
> > > host kernel: CPU:0
> > > host kernel: EIP is at unhash_nx_info+0x6e/0x90
> > > host kernel: eax: 0100 ebx: f6c7eee0 ecx: 0001 edx: e8326000
> > > host kernel: esi: e824ba90 edi: 0010 ebp: c17efa90 esp: e8327f64
> > > host kernel: ds: 007b es: 007b ss: 0068
> > > host kernel: Process chbind (pid: 4168[#0], ti=e8326000 task=e824ba90 task.ti=e8326000)
> > > host kernel: Stack: f6c7eee0 0004 e8327f9c c012296e f6c7eee0 ff00 e8f1da40
> > > host kernel: e8327f9c e824bb48 e824bb80 ff00 e8f1da40 e824bb48 e8327f9c e8327f9c
> > > host kernel: 00ff 0401a8c0 e8326000 c0122a4d c01031e1 00ff 0804c81b
> > > host kernel: Call Trace:
> > > host kernel: Code: 04 c7 03 00 01 10 00 c7 43 04 00 02 20 00 b8 01 00 00 00 e8 e5 ed fd ff 89 e0 25 00 e0 ff ff 8b 40 08 a8 08 75 0f 83 c4 08 5b c3 0f 0b 93 00 1f 5a 43 c0 eb b8 83 c4 08 5b e9 5f 54 2c 00 eb 0d
> > > host kernel: EIP: [c013aebe] unhash_nx_info+0x6e/0x90 SS:ESP 0068:e8327f64
> > >
> > > I'm back to 2.6.18.2 now.
> >
> > thanks, should be fixed in the next release
>
> vs2.0.2.2-rc8 is out ...

I tried vs2.0.2.2-rc8 with 2.6.18.3. The vserver starts ok with no
errors, but when I stopped it, the whole system froze, right after

    Deconfiguring network interfaces...done.

Nothing worked besides a magic SysRq reboot.
Nothing in the syslog.
I didn't try a second time due to lack of time.

Andreas
Re: [Vserver] vserver cpu scheduling
On Thursday 23 November 2006 20:29, Herbert Poetzl wrote:
> On Tue, Nov 21, 2006 at 10:05:07AM +0100, Andreas Baetz wrote:
> > Hi,
> >
> > I updated to another kernel (2.6.18.2) and another vserver version
> > (vs2.0.2.2-rc6), and I think the hard cpu scheduling doesn't work as
> > expected.
>
> works here as expected (with 2.6.18.3-vs2.0.2.2-rc8)
>
> > What I'm trying to do is to limit the CPU cycles for xid 8004 (deb4).
> > In vserver 8004 the following command is running:
> >
> > cat /dev/zero | gzip | gzip | gzip > /dev/null
>
> hmm, strange, could you try with the following sequence
> (see links for the sources) and let me know what top
> and vtop report on that?
>
> vcmd -i 666 -BC ctx_create .flagword=^34^33^32^8 -- cpuhog

I tested with 2.6.18.2-vs2.0.2.2-rc6, because 2.6.18.3-vs2.0.2.2-rc8
froze; see my other post.

CPU scheduling doesn't seem to work.

vtop:

top - 08:30:31 up 39 min,  6 users,  load average: 0.38, 0.31, 0.20
Tasks: 160 total,   7 running, 153 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.7%us,  1.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:    904876k total,   559776k used,   345100k free,    52368k buffers
Swap:  1003960k total,        0k used,  1003960k free,   313956k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8219 root      25   0  1424  244  192 R 94.7  0.0   0:17.17 cpuhog

top:

top - 08:31:43 up 41 min,  6 users,  load average: 1.14, 0.58, 0.30
Tasks: 146 total,   3 running, 143 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.0%us,  1.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:    904876k total,   561456k used,   343420k free,    52552k buffers
Swap:  1003960k total,        0k used,  1003960k free,   314244k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3768 root      15   0 53680  11m 4680 S  2.0  1.3   1:07.79 Xorg

Maybe there is something generally wrong with my config?
Do I need "Limit the idle task" in the kernel config?
I have always used a vanilla kernel for patching.

Andreas
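The `.flagword=^34^33^32^8` argument in the vcmd line above can be expanded into a single integer. This is a minimal sketch, assuming that `^N` denotes bit number N; that reading is an assumption about the notation, and the function name is hypothetical. What each bit means is not covered here.

```python
def flagword_from_bits(spec):
    """Combine a '^N^M...' bit list into one integer (assumes ^N means bit N)."""
    value = 0
    for token in spec.split("^"):
        if token:  # skip the empty piece before the first '^'
            value |= 1 << int(token)
    return value


word = flagword_from_bits("^34^33^32^8")
print(hex(word))
```

Under that assumption, bit 8 corresponds to the value 0x0100, and bits 32 to 34 sit above the low 32 bits of the flag word.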
Re: [Vserver] kernel bug
On Sunday 26 November 2006 23:22, Herbert Poetzl wrote:
> On Fri, Nov 24, 2006 at 08:11:39AM +0100, Andreas Baetz wrote:
> > On Thursday 23 November 2006 18:49, Herbert Poetzl wrote:
> > > On Thu, Nov 23, 2006 at 02:43:13AM +0100, Herbert Poetzl wrote:
> > > > thanks, should be fixed in the next release
> > >
> > > vs2.0.2.2-rc8 is out ...
> >
> > I tried vs2.0.2.2-rc8 with 2.6.18.3. The vserver starts ok with no
> > errors, but when I stopped it, the whole system froze, right after
> > Deconfiguring network interfaces...done.
>
> hmm, on the guest or host?
> if on the guest, what does the 'Deconfiguring' do?

It's on the guest, a Debian installation.

/etc/init.d/networking:
    ..
    log_action_begin_msg "Deconfiguring network interfaces"
    if ifdown -a --exclude=lo; then
        log_action_end_msg $?
    else
        log_action_end_msg $?
    fi
    ;;
    ..

In the meantime I found out that some other scripts are executed after
the one above, so this should not be the problem.

I noticed something interesting, though:

1) I put some delays (sleep) into some of the scripts in ../rc0.d to
   find out which one is the source of the problem, and I think the bug
   shows up after a certain time rather than after a certain command.
2) I have a vserver (deb3) that doesn't crash the machine when I do
   "vserver deb3 stop". The config files on the host for both vservers
   are identical.
3) Both vservers run ok; the crash only occurs when trying to stop deb4,
   i.e. when deb4's rc0.d scripts are executed via
   /usr/lib/util-vserver/vserver.stop.

> > Nothing worked besides magic SysRq reboot.
>
> well, that is at least something ...
> would have been interesting to get a process dump
> (which should work with SYSRQ-T)

In the meantime I produced some of these crashes. The kernel always
reports something related to interrupt handling, and the reported
process was always the one being executed by the vserver's rc0.d script,
for example "sleep" while testing with the delays mentioned above.

> > Nothing in the syslog.
> > Didn't try a second time due to lack of time.
>
> okay, maybe you get around, the stack trace of all
> processes would probably tell us more ...
>
> TIA,
> Herbert

Andreas
Re: [Vserver] kernel bug
On Tuesday 28 November 2006 15:54, Herbert Poetzl wrote:
> On Tue, Nov 28, 2006 at 08:11:35AM +0100, Andreas Baetz wrote:
> > On Sunday 26 November 2006 23:22, Herbert Poetzl wrote:
> > > On Fri, Nov 24, 2006 at 08:11:39AM +0100, Andreas Baetz wrote:
> > > > On Thursday 23 November 2006 18:49, Herbert Poetzl wrote:
> > > > > On Thu, Nov 23, 2006 at 02:43:13AM +0100, Herbert Poetzl wrote:
> > > > > > thanks, should be fixed in the next release
> > > > >
> > > > > vs2.0.2.2-rc8 is out ...
> > > >
> > > > I tried vs2.0.2.2-rc8 with 2.6.18.3. The vserver starts ok with
> > > > no errors, but when I stopped it, the whole system froze, right
> > > > after Deconfiguring network interfaces...done.
> > >
> > > okay, maybe you get around, the stack trace of all
> > > processes would probably tell us more ...
> >
> > I wrote down some of the trace output by hand:
>
> hmm, the numbers of those dumps would be interesting,
> especially if you have an unstripped kernel (vmlinux)
> available, so we can figure _where_ this happens
>
> so a serial console or some other means of recording
> them would be very helpful, if not available, try with
> a photo camera ...

I did some more tests:

At console 1:

    host:~# vserver deb4 enter
    deb4:/#
    .. Then I stopped all services in deb4 ..
    deb4:/# ps ax
      PID TTY      STAT   TIME COMMAND
        1 ?        Ss     0:00 init [2]
     4999 ?        S+     0:00 login
     5023 pts/0    Ss     0:00 /bin/bash -login
     5043 pts/0    R+     0:00 ps ax

At console 2:

    host:~# vps ax|grep 8004
     4999  8004 deb4   tty3   S+    0:00 login
     5023  8004 deb4   pts/0  Ss+   0:00 /bin/bash -login
     5049     0 MAIN   tty2   S+    0:00 grep 8004

At console 1:

    deb4:/# (hit CTRL-D)
    EIP: [e2fd8894] 0xe2fd8894 SS:ESP 0068:e4711f20
    <1>Fixing recursive fault but reboot is needed!

host kernel: Oops: 0002 [#1]
host kernel: PREEMPT
host kernel: CPU:0
host kernel: EIP is at 0xe2fd8894
host kernel: eax: e2fd ebx: e2fd8930 ecx: 0001 edx: 0001
host kernel: esi: edi: e2fd8890 ebp: e4711f48 esp: e4711f20
host kernel: ds: 007b es: 007b ss: 0068
host kernel: Process vcontext (pid: 4638[#8004], ti=e471 task=e4334ab0 task.ti=e471)
host kernel: Stack: c01195e3 e2fd 0001 0001 0001
host kernel: 0001 0286 e4711f6c c011b1af 0001 e2fd8890
host kernel: e4711f9c e4334ab0 0010 c17efa90 c01224b9 c011ac30
host kernel: Call Trace:
host kernel: Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 94 88 fd e2 30 89 fd e2 02 00 00 00 00 00 00 00 2f 65 74 63 2f 76 73 65 72
host kernel: EIP: [e2fd8894] 0xe2fd8894 SS:ESP 0068:e4711f20

The same does not happen with another vserver (deb3). With deb3, I can
stop all services, and after CTRL-D there is no deb3 anymore in
vserver-stat (I think that is how it is supposed to work).

> > BUG: unable to handle kernel NULL pointer dereference at virtual
> > address 0005
> > printing eip:
> > c0104118
>
> one of those addresses is listed above, you might
> be able to get the required info with:
>
> addr2line -e vmlinux c0104118
>
> another option to identify the location is the
> code sequence dumped at the end (2 digit block)
>
> TIA,
> Herbert

Andreas
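The "code sequence dumped at the end" can also be inspected without a vmlinux image. The sketch below takes the non-zero tail of the Code: line from the oops above, reads its first two 32-bit little-endian words, and extracts runs of printable ASCII. The function name and the minimum run length of 4 are arbitrary choices for illustration.

```python
# Non-zero tail of the "Code:" line from the oops above.
CODE_TAIL = (
    "94 88 fd e2 30 89 fd e2 02 00 00 00 00 00 00 00 "
    "2f 65 74 63 2f 76 73 65 72"
)


def printable_runs(data, min_len=4):
    """Return runs of printable ASCII of at least min_len bytes."""
    runs, current = [], bytearray()
    for byte in data:
        if 0x20 <= byte < 0x7F:
            current.append(byte)
            continue
        if len(current) >= min_len:
            runs.append(current.decode("ascii"))
        current = bytearray()
    if len(current) >= min_len:
        runs.append(current.decode("ascii"))
    return runs


data = bytes.fromhex(CODE_TAIL)  # fromhex skips the spaces between bytes
first_word = int.from_bytes(data[0:4], "little")
second_word = int.from_bytes(data[4:8], "little")

print(hex(first_word), hex(second_word))
print(printable_runs(data))
```

The two words match the EIP and ebx values printed in the same oops, and the trailing bytes spell the start of a path string. That pattern would be consistent with the EIP pointing into data rather than executable code, although the dump alone does not prove it.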
Re: [Vserver] kernel bug
On Tuesday 28 November 2006 15:54, Herbert Poetzl wrote:
> On Tue, Nov 28, 2006 at 08:11:35AM +0100, Andreas Baetz wrote:
> > On Sunday 26 November 2006 23:22, Herbert Poetzl wrote:
> > > On Fri, Nov 24, 2006 at 08:11:39AM +0100, Andreas Baetz wrote:
> > > > On Thursday 23 November 2006 18:49, Herbert Poetzl wrote:
> > > > > On Thu, Nov 23, 2006 at 02:43:13AM +0100, Herbert Poetzl wrote:
> > > > > > thanks, should be fixed in the next release
> > > > >
> > > > > vs2.0.2.2-rc8 is out ...
> > > >
> > > > I tried vs2.0.2.2-rc8 with 2.6.18.3. The vserver starts ok with
> > > > no errors, but when I stopped it, the whole system froze, right
> > > > after Deconfiguring network interfaces...done.
> > >
> > > okay, maybe you get around, the stack trace of all
> > > processes would probably tell us more ...
> >
> > I wrote down some of the trace output by hand:
>
> hmm, the numbers of those dumps would be interesting,
> especially if you have an unstripped kernel (vmlinux)
> available, so we can figure _where_ this happens
>
> so a serial console or some other means of recording
> them would be very helpful, if not available, try with
> a photo camera ...

I did some more tests:

At console 1:

    host:~# vserver deb4 enter
    deb4:/#
    .. Then I stopped all services in deb4 ..
    deb4:/# ps ax
      PID TTY      STAT   TIME COMMAND
        1 ?        Ss     0:00 init [2]
     4999 ?        S+     0:00 login
     5023 pts/0    Ss     0:00 /bin/bash -login
     5043 pts/0    R+     0:00 ps ax

At console 2:

    host:~# vps ax|grep 8004
     4999  8004 deb4   tty3   S+    0:00 login
     5023  8004 deb4   pts/0  Ss+   0:00 /bin/bash -login
     5049     0 MAIN   tty2   S+    0:00 grep 8004

At console 1:

    deb4:/# (hit CTRL-D)
    EIP: [e2fd8894] 0xe2fd8894 SS:ESP 0068:e4711f20
    <1>Fixing recursive fault but reboot is needed!

host kernel: Oops: 0002 [#1]
host kernel: PREEMPT
host kernel: CPU:0
host kernel: EIP is at 0xe2fd8894
host kernel: eax: e2fd ebx: e2fd8930 ecx: 0001 edx: 0001
host kernel: esi: edi: e2fd8890 ebp: e4711f48 esp: e4711f20
host kernel: ds: 007b es: 007b ss: 0068
host kernel: Process vcontext (pid: 4638[#8004], ti=e471 task=e4334ab0 task.ti=e471)
host kernel: Stack: c01195e3 e2fd 0001 0001 0001
host kernel: 0001 0286 e4711f6c c011b1af 0001 e2fd8890
host kernel: e4711f9c e4334ab0 0010 c17efa90 c01224b9 c011ac30
host kernel: Call Trace:
host kernel: Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 94 88 fd e2 30 89 fd e2 02 00 00 00 00 00 00 00 2f 65 74 63 2f 76 73 65 72
host kernel: EIP: [e2fd8894] 0xe2fd8894 SS:ESP 0068:e4711f20

Some more info:

I copied the / of a working vserver and used it as the / of deb4.
"vserver deb4 stop" now works. It seems that something inside the / of
the old deb4 causes the system to crash once no more processes are
running with that xid.

So if a user of a vserver manages to create that condition inside the
vserver and then ends all processes in it, that user could crash the
host.

Andreas
Re: [Vserver] kernel bug
On Wednesday 06 December 2006 21:50, Herbert Poetzl wrote:
> On Wed, Dec 06, 2006 at 12:11:44PM +0100, Andreas Baetz wrote:
> > On Tuesday 28 November 2006 15:54, Herbert Poetzl wrote:
> > > On Tue, Nov 28, 2006 at 08:11:35AM +0100, Andreas Baetz wrote:
> > > > On Sunday 26 November 2006 23:22, Herbert Poetzl wrote:
> > > > > On Fri, Nov 24, 2006 at 08:11:39AM +0100, Andreas Baetz wrote:
> > > > > > On Thursday 23 November 2006 18:49, Herbert Poetzl wrote:
> > > > > > > On Thu, Nov 23, 2006 at 02:43:13AM +0100, Herbert Poetzl wrote:
> > > > > > > > thanks, should be fixed in the next release
> > > > > > >
> > > > > > > vs2.0.2.2-rc8 is out ...
> > > > > >
> > > > > > I tried vs2.0.2.2-rc8 with 2.6.18.3. The vserver starts ok
> > > > > > with no errors, but when I stopped it, the whole system
> > > > > > froze, right after
> > > > > > Deconfiguring network interfaces...done.
> > > > >
> > > > > okay, maybe you get around, the stack trace of all
> > > > > processes would probably tell us more ...
> > > >
> > > > I wrote down some of the trace output by hand:
> > >
> > > hmm, the numbers of those dumps would be interesting,
> > > especially if you have an unstripped kernel (vmlinux)
> > > available, so we can figure _where_ this happens
> > >
> > > so a serial console or some other means of recording
> > > them would be very helpful, if not available, try with
> > > a photo camera ...
> >
> > Some more info:
> >
> > I copied the / of a working vserver and used it as the / of deb4.
> > "vserver deb4 stop" now works. It seems that something inside the /
> > of the old deb4 causes the system to crash once no more processes
> > are running with that xid.
> >
> > So if a user of a vserver manages to create that condition inside
> > the vserver and then ends all processes in it, that user could crash
> > the host.
>
> yes, please try if you can reproduce that with
>
> http://vserver.13thfloor.at/Experimental/patch-2.6.18.5-vs2.1.1.3.diff
>
> if yes, we should have the necessary debugging harness
> to track that down, if it can't be recreated there,
> consider it already fixed in the next stable release ...
>
> TIA,
> Herbert

With 2.6.18.5 and patch-2.6.18.5-vs2.1.1.3.diff the bug is gone.
Also, hard CPU scheduling works again.
Great work, many thanks!

Andreas