Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-31 Thread Konstantin Khlebnikov
Hi Patrick,

It is also possible that this is a kernel stack overflow.
Try disabling the CONFIG_4KSTACKS option in your kernel config.
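
A minimal sketch of how one might check whether a kernel config still has
this option enabled. The function name config_option_state is hypothetical;
only the option name CONFIG_4KSTACKS comes from this thread. It relies on the
usual .config conventions ("CONFIG_X=y" when set, "# CONFIG_X is not set"
when disabled).

```python
import re


def config_option_state(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...),
    or None if it is explicitly unset or not mentioned at all."""
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if line == "# %s is not set" % option:
            return None
    return None


# Example with inline text; in practice the text would be read from the
# .config file used to build the kernel (path is up to the reader).
sample = "CONFIG_X86=y\nCONFIG_4KSTACKS=y\n"
print(config_option_state(sample, "CONFIG_4KSTACKS"))  # prints: y
```

If the result is "y", the kernel was built with 4 KB stacks and would need a
rebuild with the option turned off to test the stack-overflow theory.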




-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-30 Thread Ola Lundqvist
Hi Patrick

On Fri, May 30, 2008 at 09:33:35AM +0200, Patrick Schoenfeld wrote:
> Hi Ola,
>
> On Fri, May 30, 2008 at 08:10:35AM +0200, Ola Lundqvist wrote:
> > I have not got any response from upstream yet. Good to know that this
> > problem is reproducible on a newer kernel as well.
>
> not really. I would have hoped that the newer patch fixes my problem.
> :-)

Of course. :) What I meant is that it is good to know the status, even if the
information is bad in itself. ;)

> > It is a little bit strange that no-one else has seen this. On the other hand
> > few people seem to run openvz on amd64.
>
> Agreed. I'm also a little bit puzzled, because I run the patch on
> several systems and this is the only one where the problems occur.
> However it is not on amd64. The data in the first report might be wrong,
> because I probably filed it on my desktop system.

Oh I see. Good information. This makes me suspect something hardware related.
Not necessarily a fault, but maybe something in the patch that is incompatible
with this hardware.

Are you running some other system with identical hardware where you do not
have the crash? Or, if it is not identical, is it at least similar?

> The system where it runs is an
>
> Intel(R) Celeron(R) CPU 2.40GHz
>
> according to /proc/cpuinfo

Ok, yes, this is good information. I would like to know as much as you can
tell about the hardware, especially anything related to memory handling. It
can very well be a timing issue.

* Number of CPUs
* Amount of memory
* Memory type and speed
* Anything else that you can find that you think can be interesting
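
A rough sketch of how the first item in the list above could be collected
from /proc/cpuinfo. The function name summarize_cpuinfo is hypothetical, and
the sample text only reuses the model name reported earlier in this message.
Memory type and speed are not exposed in /proc/cpuinfo and would need a
separate hardware inventory tool, so they are left out here.

```python
def summarize_cpuinfo(cpuinfo_text):
    """Count logical CPUs and collect distinct model names
    from text in /proc/cpuinfo format (x86 layout assumed)."""
    count = 0
    models = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        value = value.strip()
        if key == "processor":
            count += 1
        elif key == "model name" and value not in models:
            models.append(value)
    return {"cpus": count, "models": models}


# Inline sample; on a live system the text would come from /proc/cpuinfo.
sample = "processor\t: 0\nmodel name\t: Intel(R) Celeron(R) CPU 2.40GHz\n"
print(summarize_cpuinfo(sample))
```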

> Tell me if you need more information, and sorry for the confusion.
>
> BTW. I suspected first that this is hardware related, but my hoster did
> an extensive test of the hardware and found no problem. The suspicion is
> still not gone, but I don't have anything to slap the hoster with
> currently. The panic does not really indicate something hw related.

It indicates something to do with paging, which relates to memory handling.
It does not necessarily need to be a fault, but it could be something related.

Best regards,

// Ola

> Best Regards,
>
> Patrick

-- 
 --- Inguza Technology AB --- MSc in Information Technology 
/  [EMAIL PROTECTED]Annebergsslingan 37\
|  [EMAIL PROTECTED]   654 65 KARLSTAD|
|  http://inguza.com/Mobile: +46 (0)70-332 1551 |
\  gpg/f.p.: 7090 A92B 18FE 7994 0C36 4FE4 18A1 B1CF 0FE5 3DD9  /
 ---






Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-30 Thread Konstantin Khlebnikov

Hi

Ola,
I decoded the oops: the page fault occurs at the tsk->mm dereference in the
do_page_fault function (arch/i386/mm/fault.c:353).
The current task pointer is incorrect; perhaps something bad happened in the
vcpu scheduler.
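
As a small aid for reading such logs, the sketch below pulls the function
name, the offset into the function, and the function length out of oops
lines of the form "EIP is at do_page_fault+0x5a/0x5a5". The name
parse_eip_lines is hypothetical. Mapping the offset to a source line such as
fault.c:353 additionally needs a vmlinux built with debug information, which
this sketch does not attempt.

```python
import re

# Matches e.g. "EIP is at do_page_fault+0x5a/0x5a5"
EIP_RE = re.compile(r"EIP is at (\w+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)")


def parse_eip_lines(oops_text):
    """Return a list of (function, offset, function_length) tuples,
    one per 'EIP is at' line found in the oops text."""
    results = []
    for match in EIP_RE.finditer(oops_text):
        results.append(
            (match.group(1), int(match.group(2), 16), int(match.group(3), 16))
        )
    return results


sample = "EIP is at do_page_fault+0x5a/0x5a5\n"
print(parse_eip_lines(sample))  # prints: [('do_page_fault', 90, 1445)]
```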


Patrick,
Could you try booting our precompiled rhel5-based openvz kernel?
I think it is almost Debian compatible: extract it from the rpm (with rpm2cpio)
or use alien, and generate an initrd. I can explain all the steps if needed.

http://download.openvz.org/kernel/branches/rhel5-2.6.18/current/






Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-30 Thread Patrick Schoenfeld
Hi Konstantin,

On Fri, May 30, 2008 at 07:45:19PM +0400, Konstantin Khlebnikov wrote:
> I decoded the oops: the page fault occurs at the tsk->mm dereference in the
> do_page_fault function (arch/i386/mm/fault.c:353).
> The current task pointer is incorrect; perhaps something bad happened in the
> vcpu scheduler.

thanks for your input on this issue. Do you think this could be a bug?
Or do you think this could be caused by, for example, bad memory?
I am suspicious of my memory, because the system seems to detect only
~ 350 MB of memory out of 512 MB, while lshw even says that the RAM module is
only a 256 MB module.
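
The mismatch described above can be quantified with a short sketch that
reads MemTotal from text in /proc/meminfo format and compares it with the
amount that is supposed to be installed. The function names and the sample
value are made up for illustration. MemTotal is always somewhat lower than
the installed RAM because the kernel reserves part of it, so only a large
gap is interesting.

```python
def mem_total_mb(meminfo_text):
    """Return MemTotal in MiB from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) // 1024  # value is given in kB
    return None


def memory_shortfall_mb(meminfo_text, expected_mb):
    """Return how many MiB are missing relative to expected_mb,
    or None if MemTotal could not be found."""
    total = mem_total_mb(meminfo_text)
    if total is None:
        return None
    return expected_mb - total


# Hypothetical sample roughly matching the ~350 MB mentioned above.
sample = "MemTotal:       358400 kB\nMemFree:         10240 kB\n"
print(memory_shortfall_mb(sample, 512))  # prints: 162
```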

> Patrick,
> Could you try booting our precompiled rhel5-based openvz kernel?
> I think it is almost Debian compatible: extract it from the rpm (with rpm2cpio)

I could do so for testing but I would not feel good doing so for
production use. What do you intend to test with that? Why should it be
different?

Best Regards,
Patrick






Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-30 Thread Konstantin Khlebnikov

Patrick Schoenfeld wrote:
> Hi Konstantin,
>
> On Fri, May 30, 2008 at 07:45:19PM +0400, Konstantin Khlebnikov wrote:
> > I decoded the oops: the page fault occurs at the tsk->mm dereference in the
> > do_page_fault function (arch/i386/mm/fault.c:353).
> > The current task pointer is incorrect; perhaps something bad happened in the
> > vcpu scheduler.



> thanks for your input on this issue. Do you think this could be a bug?
> Or do you think this could be caused by, for example, bad memory?
> I am suspicious of my memory, because the system seems to detect only
> ~ 350 MB of memory out of 512 MB, while lshw even says that the RAM module is
> only a 256 MB module.
  

Maybe. Test it with memtest, optionally combined with cpuburn:
http://wiki.openvz.org/Hardware_testing
  

> > Patrick,
> > Could you try booting our precompiled rhel5-based openvz kernel?
> > I think it is almost Debian compatible: extract it from the rpm (with rpm2cpio)
>
> I could do so for testing but I would not feel good doing so for
> production use. What do you intend to test with that? Why should it be
> different?
  
Those are different patches; maybe something was broken while porting to the
Debian kernel.

> Best Regards,
> Patrick






Bug#482657: Fwd: Re: Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-28 Thread Patrick Schoenfeld
I forgot to CC the bug tracker.
---BeginMessage---
Hi Ola,

On Tue, May 27, 2008 at 10:38:24PM +0200, Ola Lundqvist wrote:
> Ok, good to know that so far. Tell me if you see the problem again.

it happened sooner than expected, unfortunately.

Here is the new kernel panic log.
Have you gotten any feedback from upstream? For me this is very
critical, because currently the panics happen almost every day, blocking
the server for some time, till I reboot the system. And I'm not very
happy about having to cold reset the system, because I fear potential data
loss.

Best Regards,
Patrick
_conntrack_amanda vzethdev vznetdev simfs vzrst ip_nat vzcpt ip_conntrack tun 
vzdquota vzmon vzdev af_packet xt_tcpudp xt_length ipt_ttl xt_tcpmss ipt_TCPMSS 
iptable_mangle iptable_filter xt_multiport xt_limit ipt_tos ipt_REJECT 
ip_tables x_tables dummy i2c_i801 shpchp i2c_core xfs e100 mii thermal 
processor fan
CPU:0, VCPU: 1768697699.1819113570
EIP:0060:[c011464a]Not tainted VLI
EFLAGS: 00010046   (2.6.18+openvz #2 028stab053.5)
EIP is at do_page_fault+0x5a/0x5a5
eax: cde72000   ebx:    ecx: 007b   edx: 
esi: 00030001   edi: 0094   ebp: 0006   esp: cde72050
ds: 007b   es: 007b   ss: 0068
BUG: unable to handle kernel paging request at virtual address 7261687b
 printing eip:
c010422d
*pde = 
Recursive die() failure, output suppressed
Kernel panic - not syncing: Attempted to kill the idle task!
BUG: unable to handle kernel paging request at virtual address 7261716f
 printing eip:
c01183a3
*pde = 
Oops:  [#3]
Modules linked in: ipt_LOG xt_state iptable_nat ip_nat_tftp ip_nat_irc 
ip_nat_ftp ip_nat_amanda xt_conntrack ip_conntrack_tftp ip_conntrack_irc 
ip_conntrack_ftp ts_kmp ip_conntrack_amanda vzethdev vznetdev simfs vzrst 
ip_nat vzcpt ip_conntrack tun vzdquota vzmon vzdev af_packet xt_tcpudp 
xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle iptable_filter 
xt_multiport xt_limit ipt_tos ipt_REJECT ip_tables x_tables dummy i2c_i801 
shpchp i2c_core xfs e100 mii thermal processor fan
CPU:0, VCPU: 1768697699.1819113570
EIP:0060:[c01183a3]Not tainted VLI
EFLAGS: 00010002   (2.6.18+openvz #2 028stab053.5)
EIP is at account_system_time+0x23/0xc0
eax: 72616873   ebx: 11425ae0   ecx:    edx: 0001
esi: 11425ae0   edi:    ebp: c058bf90   esp: c058bf88
ds: 007b   es: 007b   ss: 0068
BUG: unable to handle kernel paging request at virtual address 7261687b
 printing eip:
c010422d
*pde = 
Oops:  [#4]
Modules linked in: ipt_LOG xt_state iptable_nat ip_nat_tftp ip_nat_irc 
ip_nat_ftp ip_nat_amanda xt_conntrack ip_conntrack_tftp ip_conntrack_irc 
ip_conntrack_ftp ts_kmp ip_conntrack_amanda vzethdev vznetdev simfs vzrst 
ip_nat vzcpt ip_conntrack tun vzdquota vzmon vzdev af_packet xt_tcpudp 
xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle iptable_filter 
xt_multiport xt_limit ipt_tos ipt_REJECT ip_tables x_tables dummy i2c_i801 
shpchp i2c_core xfs e100 mii thermal processor fan
CPU:0, VCPU: 1768697699.1819113570
EIP:0060:[c010422d]Not tainted VLI
EFLAGS: 00010086   (2.6.18+openvz #2 028stab053.5)
EIP is at show_registers+0x15d/0x260
eax: 11425ae0   ebx: 0002   ecx: c058b000   edx: 72616873
esi: 00010002   edi: c058bf54   ebp: c058bf88   esp: c058be9c
ds: 007b   es: 007b   ss: 0068
BUG: unable to handle kernel paging request at virtual address 7261687b
 printing eip:
c010422d
*pde = 
Recursive die() failure, output suppressed
Kernel panic - not syncing: Fatal exception in interrupt
---End Message---
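
One observation on the log above, offered as a sketch rather than a
diagnosis: several of the bogus values decode to printable ASCII when read
as little-endian 32-bit words, which would be consistent with the memory
behind the current task pointer having been overwritten with text. The
helper name as_ascii_le is hypothetical; the values are taken from the eax
register and the VCPU field in the log.

```python
def as_ascii_le(value):
    """Interpret a 32-bit value as four little-endian bytes and return
    them as text if all four are printable ASCII, else None."""
    raw = value.to_bytes(4, "little")
    if all(0x20 <= byte < 0x7F for byte in raw):
        return raw.decode("ascii")
    return None


print(as_ascii_le(0x72616873))   # eax in oops #3          -> shar
print(as_ascii_le(1768697699))   # first half of VCPU id   -> c/li
print(as_ascii_le(1819113570))   # second half of VCPU id  -> bxml
print(as_ascii_le(0xC058BF88))   # a plausible kernel stack pointer -> None
```

The faulting addresses 7261687b and 7261716f are this same eax value plus
small offsets (0x8 and 0x8fc), as one would expect from field accesses
through a pointer that actually holds string data.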


Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-27 Thread Ola Lundqvist

Hi Patrick

Ok, good to know that so far. Tell me if you see the problem again.

Best regards,

// Ola

Quoting Patrick Schoenfeld [EMAIL PROTECTED]:

> Hi,
>
> On Sun, May 25, 2008 at 02:09:44PM +0200, Ola Lundqvist wrote:
> > It would be good to know if you have the same problem with the kernel
> > version in unstable. The reason is that that version has a lot
> > of bugs fixed. It will apply fine to the kernel in stable (as far as I
> > have tested at least).
>
> I built the kernel with the latest unstable patch applied and it builds
> fine. However, seeing whether the error happens again needs some time,
> because it occurs sporadically and not in a regular fashion.
>
> Best Regards,
> Patrick






--
 --- Ola Lundqvist systemkonsult --- M Sc in IT Engineering 
/  [EMAIL PROTECTED]   Annebergsslingan 37\
|  [EMAIL PROTECTED]   654 65 KARLSTAD|
|  http://opalsys.net/   Mobile: +46 (0)70-332 1551 |
\  gpg/f.p.: 7090 A92B 18FE 7994 0C36 4FE4 18A1 B1CF 0FE5 3DD9  /
 ---






Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-25 Thread Ola Lundqvist
Hi Patrick

It would be good to know if you have the same problem with the kernel
version in unstable. The reason is that that version has a lot
of bugs fixed. It will apply fine to the kernel in stable (as far as I
have tested at least).

Best regards,

// Ola

On Sat, May 24, 2008 at 11:40:03AM +0200, Patrick Schoenfeld wrote:
> Package: kernel-patch-openvz
> Version: 028.18.1+etch6
> Severity: critical
>
> Hi,
>
> the OpenVZ kernel patch causes random kernel panics on a production
> server. Unfortunately I couldn't capture the whole output of the panic
> over the serial console (because screen does not seem to capture
> output while the session is not attached :(). The file that I will attach
> in a few minutes is all I got.
>
> I set this to severity serious because it totally breaks my system (it
> panics and is unresponsive after that, till I reset it) and also is
> responsible for potential data loss (no syncing happens, kernel needs to
> be reset without flushing buffers).
>
> There is another issue with IPv6 being enabled and the OpenVZ patches.
> It then panics on reboot, causing the reboot to fail. But that is just
> an unrelated side note (the current .config has IPv6 disabled).
>
> The issue did not happen with a stock kernel.
>
> Any ideas?
>
> Best Regards,
> Patrick
>
> -- System Information:
> Debian Release: lenny/sid
>   APT prefers unstable
>   APT policy: (500, 'unstable'), (500, 'stable')
> Architecture: i386 (i686)
>
> Kernel: Linux 2.6.24-1-686 (SMP w/1 CPU core)
> Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
> Shell: /bin/sh linked to /bin/dash

-- 
 - Ola Lundqvist ---
/  [EMAIL PROTECTED] Annebergsslingan 37  \
|  [EMAIL PROTECTED]  654 65 KARLSTAD  |
|  http://inguza.com/  +46 (0)70-332 1551   |
\  gpg/f.p.: 7090 A92B 18FE 7994 0C36  4FE4 18A1 B1CF 0FE5 3DD9 /
 ---






Bug#482657: kernel-patch-openvz causes a kernel panic

2008-05-24 Thread Patrick Schoenfeld
Package: kernel-patch-openvz
Version: 028.18.1+etch6
Severity: critical

Hi,

the OpenVZ kernel patch causes random kernel panics on a production
server. Unfortunately I couldn't capture the whole output of the panic
over the serial console (because screen does not seem to capture
output while the session is not attached :(). The file that I will attach
in a few minutes is all I got.

I set this to severity serious because it totally breaks my system (it
panics and is unresponsive after that, till I reset it) and also is
responsible for potential data loss (no syncing happens, kernel needs to
be reset without flushing buffers).

There is another issue with IPv6 being enabled and the OpenVZ patches.
It then panics on reboot, causing the reboot to fail. But that is just
an unrelated side note (the current .config has IPv6 disabled).

The issue did not happen with a stock kernel.

Any ideas?

Best Regards,
Patrick

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.24-1-686 (SMP w/1 CPU core)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash


