Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-17 Thread Ralf Hildebrandt
* Randy.Dunlap <[EMAIL PROTECTED]>:

> >Is it normal that the kernel with debugging enabled is not larger than
> >the normal kernel?
> >-
> 
> No, it should be much larger.  Recheck the .config file
> for CONFIG_DEBUG_INFO=y.  Maybe you need to do 'make clean'
> first.

CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_HIGHMEM is not set
CONFIG_DEBUG_INFO=y
# CONFIG_FRAME_POINTER is not set
CONFIG_EARLY_PRINTK=y

I built that using "make-kpkg":

make-kpkg clean
CONCURRENCY_LEVEL=4 MAKEFLAGS="CC=gcc-3.4" make-kpkg --revision=20050217 kernel_image
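
A quick way to double-check both points from the build tree is sketched below. It assumes a POSIX shell with grep and readelf available; the function names config_enabled and has_debug_sections are made up for illustration. Note that the debug information ends up in the uncompressed vmlinux in the build tree. Whether the size comparison above was made on vmlinux or on the compressed boot image is not stated in this thread, so treat the second check as a hint rather than a diagnosis.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any kernel tooling.

# Print "yes" if OPTION is set to "y" in the given config file, else "no".
# Usage: config_enabled <config-file> <OPTION>
config_enabled() {
    if grep -q "^$2=y\$" "$1"; then
        echo yes
    else
        echo no
    fi
}

# Succeed if the ELF file contains a .debug_info section.
# Usage: has_debug_sections <path-to-vmlinux>
has_debug_sections() {
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Example (paths are assumptions about the local build tree):
#   config_enabled .config CONFIG_DEBUG_INFO
#   has_debug_sections vmlinux && echo "vmlinux carries debug info"
```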

-- 
Ralf Hildebrandt (i.A. des IT-Zentrum)  [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin            Tel.  +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax.  +49 (0)30-450 570-962
IT-Zentrum Standort CBF                 send no mail to [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-17 Thread Randy.Dunlap
Ralf Hildebrandt wrote:
> * Ralf Hildebrandt <[EMAIL PROTECTED]>:
>
> > > The best way to do that is to ensure that the kernel was built with
> > > CONFIG_DEBUG_INFO, note the offending EIP value, then do
> > >
> > > # gdb vmlinux
> > > (gdb) l *0xc0<whatever>
> >
> > I'm rebuilding the ac12 kernel which crashed on me after just one day
> > and will reboot it today.
>
> Is it normal that the kernel with debugging enabled is not larger than
> the normal kernel?
> -

No, it should be much larger.  Recheck the .config file
for CONFIG_DEBUG_INFO=y.  Maybe you need to do 'make clean'
first.

--
~Randy


Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-17 Thread Ralf Hildebrandt
* Ralf Hildebrandt <[EMAIL PROTECTED]>:

> > The best way to do that is to ensure that the kernel was built with
> > CONFIG_DEBUG_INFO, note the offending EIP value, then do
> > 
> > # gdb vmlinux
> > (gdb) l *0xc0<whatever>
> 
> I'm rebuilding the ac12 kernel which crashed on me after just one day
> and will reboot it today.

Is it normal that the kernel with debugging enabled is not larger than
the normal kernel?


Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-17 Thread Ralf Hildebrandt
* Andrew Morton <[EMAIL PROTECTED]>:

> There have been a handful of reports - there's surely a race in there.
> 
> Unfortunately I've yet to see a report from which we can identify the
> offending line in the very large journal_commit_transaction() function.

:(

> 
> The best way to do that is to ensure that the kernel was built with
> CONFIG_DEBUG_INFO, note the offending EIP value, then do
> 
> # gdb vmlinux
> (gdb) l *0xc0<whatever>

I'm rebuilding the ac12 kernel which crashed on me after just one day
and will reboot it today.

-- 
Ralf Hildebrandt (i.A. des IT-Zentrum)  [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin            Tel.  +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax.  +49 (0)30-450 570-962
IT-Zentrum Standort CBF                 send no mail to [EMAIL PROTECTED]


Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-16 Thread Andrew Morton
Dale Blount <[EMAIL PROTECTED]> wrote:
>
> This looks very similar (at least to me) to an OOPS I posted with 2.6.9
> on 12/03/2004.
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110210705504716&w=2

There have been a handful of reports - there's surely a race in there.

Unfortunately I've yet to see a report from which we can identify the
offending line in the very large journal_commit_transaction() function.

The best way to do that is to ensure that the kernel was built with
CONFIG_DEBUG_INFO, note the offending EIP value, then do

# gdb vmlinux
(gdb) l *0xc0<whatever>
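
The same lookup can also be run non-interactively, which is handy when the address is copied from a photographed oops. The following is a minimal sketch, assuming a gdb that supports the -batch and -ex options; the function name list_eip and the example address 0xc0123456 are placeholders, not values from this thread.

```shell
#!/bin/sh
# Hypothetical wrapper around the gdb session shown above.
# Usage: list_eip <path-to-vmlinux> <hex-address>
list_eip() {
    # Rough sanity check only: the address must start with 0x and a hex digit.
    case "$2" in
        0x[0-9a-fA-F]*) ;;
        *) echo "expected a hex address such as 0xc0123456" >&2; return 1 ;;
    esac
    # "list *ADDR" prints the source lines around the given code address;
    # this needs a vmlinux built with CONFIG_DEBUG_INFO=y.
    gdb -batch -ex "list *$2" "$1"
}

# Example (placeholder address, substitute the EIP from the oops):
#   list_eip vmlinux 0xc0123456
```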


Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-16 Thread Ralf Hildebrandt
* Dale Blount <[EMAIL PROTECTED]>:

> This looks very similar (at least to me) to an OOPS I posted with 2.6.9
> on 12/03/2004.
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110210705504716&w=2

Could be.

> My system is also a dual Xeon using SMP and Hyperthreading
> (/proc/cpuinfo shows 4 cpus).

Same system here.

> Mine, like Ralf's, is also a mail server running postfix using ext3 for
> the spool directory.

Same here.

> I've actually hit this bug (assuming it's the same) with 2.6.10 also.  I
> had to power cycle remotely and unfortunately didn't have the serial
> console logging enabled when it happened with 2.6.10.  I upgraded from
> 2.4.23 to 2.6.8.1 and crashed within a week, and continued to crash at
> least monthly after that.  It had been running 2.4.23 for 200+ days with
> no problems.
> 
> Hope this helps trace it back.

Me too


-- 
Ralf Hildebrandt (i.A. des IT-Zentrum)  [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin            Tel.  +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax.  +49 (0)30-450 570-962
IT-Zentrum Standort CBF                 send no mail to [EMAIL PROTECTED]


Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-16 Thread Dale Blount
On Wed, 2005-02-16 at 21:04 +0100, Ralf Hildebrandt wrote:
> * Jan Kara <[EMAIL PROTECTED]>:
> 
> >   I guess the system is SMP...
> 
> Indeed it is. Dual Xeon with SMP.
> 

This looks very similar (at least to me) to an OOPS I posted with 2.6.9
on 12/03/2004.
http://marc.theaimsgroup.com/?l=linux-kernel&m=110210705504716&w=2

My system is also a dual Xeon using SMP and Hyperthreading
(/proc/cpuinfo shows 4 cpus).
Mine, like Ralf's, is also a mail server running postfix using ext3 for
the spool directory.

> > but it seems similar to several other oopses I've seen reported
> > recently. Is this the first time you hit this bug?
> 
> It's actually the second time. The first time it hit the SAME box but
> with kernel-2.6.10 (vanilla) after 30 days of uptime. Nobody had a
> camera at hand, so I couldn't take a photo.
> 

I've actually hit this bug (assuming it's the same) with 2.6.10 also.  I
had to power cycle remotely and unfortunately didn't have the serial
console logging enabled when it happened with 2.6.10.  I upgraded from
2.4.23 to 2.6.8.1 and crashed within a week, and continued to crash at
least monthly after that.  It had been running 2.4.23 for 200+ days with
no problems.

Hope this helps trace it back.

Dale



Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-16 Thread Ralf Hildebrandt
* Jan Kara <[EMAIL PROTECTED]>:

>   I guess the system is SMP...

Indeed it is. Dual Xeon with SMP.

>   Sadly a few lines in the beginning of the
> report are missing (probably scrolled off the screen)

Yes, this sucks. I rebooted with vesafb active, now I do have 50 lines :)

> but it seems similar to several other oopses I've seen reported
> recently. Is this the first time you hit this bug?

It's actually the second time. The first time it hit the SAME box but
with kernel-2.6.10 (vanilla) after 30 days of uptime. Nobody had a
camera at hand, so I couldn't take a photo.

Any suggestions? I'm open to suggestions. One difference between the
2.6.10 and 2.6.10-ac12 was that 2.6.10 has no in-kernel irq
balancing, while in 2.6.10-ac12 I activated that.

-- 
Ralf Hildebrandt (i.A. des IT-Zentrum)  [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin            Tel.  +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax.  +49 (0)30-450 570-962
IT-Zentrum Standort CBF                 send no mail to [EMAIL PROTECTED]


Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)

2005-02-16 Thread Jan Kara
  Hello,

> Today our mailserver froze after just one day of uptime. I was able to
> capture the Oops on the screen using my digital camera:
> 
> http://www.stahl.bau.tu-bs.de/~hildeb/bugreport/
> 
> Keywords: EIP is at journal_commit_transaction, process kjournald
  I guess the system is SMP... Sadly a few lines in the beginning of the
report are missing (probably scrolled off the screen) but it seems
similar to several other oopses I've seen reported recently. Is this
the first time you hit this bug?

> # mount
> /dev/cciss/c0d0p6 on / type ext3 (rw,errors=remount-ro)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw)
> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> tmpfs on /dev/shm type tmpfs (rw)
> /dev/cciss/c0d0p5 on /boot type ext3 (rw)
> /dev/shm on /var/amavis type tmpfs (rw,noatime,size=200m,mode=770,uid=104,gid=108)

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs

