panic: integer divide fault on 6.1

2006-09-16 Thread Joao Barros

Hi all,

I was contacted by Chris who had the same problem.
Since I didn't follow up on my problem he asked me if it was solved
and indeed it was solved by Søren and the fix has been committed to
CURRENT:
http://lists.freebsd.org/pipermail/cvs-all/2006-September/188353.html

I'm hoping this gets MFC'ed in time for 6.2

My conversation with Søren is below:

-- Forwarded message --
From: Joao Barros [EMAIL PROTECTED]
Date: Sep 14, 2006 8:04 PM
Subject: Re: Fwd: panic: integer divide fault on 6.1
To: Søren Schmidt [EMAIL PROTECTED]


On 9/13/06, Joao Barros [EMAIL PROTECTED] wrote:

On 9/13/06, Søren Schmidt [EMAIL PROTECTED] wrote:
 Joao Barros wrote:
  -- Forwarded message --
  From: Sam Leffler [EMAIL PROTECTED]
  Date: Sep 10, 2006 5:16 PM
  Subject: Re: panic: integer divide fault on 6.1
  To: Joao Barros [EMAIL PROTECTED]
  Cc: freebsd-current@freebsd.org, freebsd-stable@freebsd.org, Kris
  Kennaway [EMAIL PROTECTED]
 
 
  Joao Barros wrote:
  On 9/9/06, Kris Kennaway [EMAIL PROTECTED] wrote:
  On Sat, Sep 09, 2006 at 09:02:35PM +0100, Joao Barros wrote:
   On 9/9/06, Max Laier [EMAIL PROTECTED] wrote:
   
   Can you try to get a dump, trace, or at least figure out which function
   the IP is referring to?
   
  
   Well, the problem only occurs when I boot from the disk and the
   installed kernel doesn't have debug support.
   Does 'set dumpdev=' work from the boot loader? I tried some
   combinations with no success.
 
  No.
 
   I can try and install a 6-STABLE snapshot if there's no way of
  getting
   the info needed.
 
  You can either try to install a new kernel with DDB support, or follow
  the instruction pointer method in the developers handbook chapter on
  kernel debugging.
 
  I copied a CURRENT kernel from a 200608 snapshot and the problem also
  occurs thus I'm adding [EMAIL PROTECTED]
  My current laptop doesn't have a serial port so I'm copying this by
  hand:
 
  Fatal trap 18: integer divide fault while in kernel mode
  cpuid = 0; apic id = 00
  instruction pointer     = 0x20:0xc08a1fb7
  stack pointer           = 0x28:0xc0c20b14
  frame pointer           = 0x28:0xc0c20b9c
  code segment            = base 0x0, limit 0xf, type 0x1b
                          = DPL 0, pres 1, def32 1, gran 1
  processor eflags        = interrupt enabled, resume, IOPL = 0
  current process         = 0 (swapper)
  [thread pid 0 tid 0 ]
  Stopped at      __qdivrem+0x3b: divl    %ecx,%eax
 
  db> bt
  Tracing pid 0 tid 0 td 0xc0a0c818
  __qdivrem(37fdfa0,0,0,0,0,...) at __qdivrem+0x3b
  __udivdi3(37fdfa0,0,0,0) at __udivdi3+0x16
  ata_raid_promise_read_meta(c37a5000,c09f4a80,1,8086,c37a5000,...) at
  ata_raid_promise_read_meta+0x9b
  ata_raid_read_metadata(c37a5000,c37a5000,c0c20c70,c06b58a4,c37a5000,...)
  at ata_raid_metadata+0x2be
  ata_raid_subdisk_attach(c37a5000) at ata_raid_subdisk_attach+0x33
  device_attach(c37a5000,c37a5180,c37a5000,c36885c0,0,...) at
  device_attach+0x58
  device_probe_and_attach(c37a5200,c37a5200,c08ec9a9,0,c37a5180,...) at
  bus_generic_attach+0x16
  ad_attach(c37a5200) at ad_attach+0x2c8
  device_attach(c37a5200,c095f2d0,c37a5200,0,c368d800,...) at
  device_attach+0x58
  device_probe_and_attach(c37a5200) at device_probe_and_atach+0xe0
  bus_generic_attach(c3659080,c3659080,,0,c37a5200,...) at
  bus_generic_attach+0x16
  ata_identify(c3659080) at ata_identify+0x1c8
  ata_boot_attach(0xc0a11d80,0,c09212e7,47,...) at ata_boot_attach+0x3e
  run_interrupt_drive_config_hooks(0,c1ec00,c1e000,0,c0451065,...) at
  run_interrupt_drive_config_hooks+0x43
  mi_startup() at mi_startup+0x96
  begin() at begin+0x2c
 
  This board has a Promise SATA raid controller and it is disabled in
  the BIOS. I even tried disabling it through a jumper but it still
  stops.
 
 
  In sys/dev/ata/ata-raid.h the PROMISE_LBA macro does an unchecked
  calculation that apparently can divide by zero.  Soren would likely
  understand the root cause of this problem but until then you can patch
  the driver to work around the problem.
 
 Sam
 
 
  Hi Søren,
 
  I don't know if you bumped into this thread but this should definitely
  be fixed.
  Do you want me to open a PR?
 
 Hmm, the problem seems to be that the geometry obtained from the
 disk is all zeros, which we cannot handle.
 It's most likely because your BIOS put invalid or no current geometry
 info in the disk's parameters page.

 If you are up to a little debugging, you could try this patch:

 diff -u -r1.189.2.4 ata-disk.c
 --- ata-disk.c  4 Apr 2006 16:07:42 -   1.189.2.4
 +++ ata-disk.c  13 Sep 2006 06:18:59 -
 @@ -97,7 +97,8 @@
      }
      device_set_ivars(dev, adp);

 -    if (atadev->param.atavalid & ATA_FLAG_54_58) {
 +    if ((atadev->param.atavalid & ATA_FLAG_54_58) &&
 +        atadev->param.current_heads && atadev->param.current_sectors) {
          adp->heads = atadev->param.current_heads;
          adp->sectors = atadev->param.current_sectors;
          adp->total_secs = (u_int32_t
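
[Editor's sketch] The idea behind the patch above can be shown outside the kernel: the "current" geometry is used only when the validity flag is set and both values are non-zero. Everything below is a hypothetical stand-in (the struct layouts, the FLAG_54_58 value, the pick_geometry name and the fallback to the default geometry are assumptions for illustration), not the actual ata(4) code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for ATA_FLAG_54_58; the numeric value here is illustrative only. */
#define FLAG_54_58 0x0001

/* Hypothetical, trimmed-down view of the identify data a disk reports. */
struct ident_params {
    uint16_t atavalid;         /* validity flags */
    uint16_t heads;            /* default geometry */
    uint16_t sectors;
    uint16_t current_heads;    /* "current" geometry, may be left zero */
    uint16_t current_sectors;
};

/* Hypothetical per-disk geometry record. */
struct disk_geom {
    uint16_t heads;
    uint16_t sectors;
};

/*
 * Trust the current geometry only if it is flagged valid AND both fields
 * are non-zero; otherwise fall back to the default geometry (assumed
 * fallback, chosen for this sketch).
 */
static void
pick_geometry(const struct ident_params *p, struct disk_geom *g)
{
    assert(p != NULL && g != NULL);

    if ((p->atavalid & FLAG_54_58) &&
        p->current_heads && p->current_sectors) {
        g->heads = p->current_heads;
        g->sectors = p->current_sectors;
    } else {
        g->heads = p->heads;
        g->sectors = p->sectors;
    }
}
```

With the flag set but the current fields zeroed, the sketch returns the default geometry instead of handing zeros to later divisions.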

Re: panic: integer divide fault on 6.1

2006-09-16 Thread Søren Schmidt

Joao Barros wrote:

Hi all,

I was contacted by Chris who had the same problem.
Since I didn't follow up on my problem he asked me if it was solved
and indeed it was solved by Søren and the fix has been committed to
CURRENT:
http://lists.freebsd.org/pipermail/cvs-all/2006-September/188353.html

I'm hoping this gets MFC'ed in time for 6.2

It will be, I'm collecting fixes in -current that I'll get MFC'd soon...

-Søren

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: panic: integer divide fault on 6.1

2006-09-10 Thread Joao Barros

On 9/9/06, Kris Kennaway [EMAIL PROTECTED] wrote:

On Sat, Sep 09, 2006 at 09:02:35PM +0100, Joao Barros wrote:
 On 9/9/06, Max Laier [EMAIL PROTECTED] wrote:
 
 Can you try to get a dump, trace, or at least figure out which function
 the IP is referring to?
 

 Well, the problem only occurs when I boot from the disk and the
 installed kernel doesn't have debug support.
 Does 'set dumpdev=' work from the boot loader? I tried some
 combinations with no success.

No.

 I can try and install a 6-STABLE snapshot if there's no way of getting
 the info needed.

You can either try to install a new kernel with DDB support, or follow
the instruction pointer method in the developers handbook chapter on
kernel debugging.


I copied a CURRENT kernel from a 200608 snapshot and the problem also
occurs thus I'm adding [EMAIL PROTECTED]
My current laptop doesn't have a serial port so I'm copying this by hand:

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xc08a1fb7
stack pointer           = 0x28:0xc0c20b14
frame pointer           = 0x28:0xc0c20b9c
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (swapper)
[thread pid 0 tid 0 ]
Stopped at      __qdivrem+0x3b: divl    %ecx,%eax

db> bt
Tracing pid 0 tid 0 td 0xc0a0c818
__qdivrem(37fdfa0,0,0,0,0,...) at __qdivrem+0x3b
__udivdi3(37fdfa0,0,0,0) at __udivdi3+0x16
ata_raid_promise_read_meta(c37a5000,c09f4a80,1,8086,c37a5000,...) at
ata_raid_promise_read_meta+0x9b
ata_raid_read_metadata(c37a5000,c37a5000,c0c20c70,c06b58a4,c37a5000,...)
at ata_raid_metadata+0x2be
ata_raid_subdisk_attach(c37a5000) at ata_raid_subdisk_attach+0x33
device_attach(c37a5000,c37a5180,c37a5000,c36885c0,0,...) at device_attach+0x58
device_probe_and_attach(c37a5200,c37a5200,c08ec9a9,0,c37a5180,...) at
bus_generic_attach+0x16
ad_attach(c37a5200) at ad_attach+0x2c8
device_attach(c37a5200,c095f2d0,c37a5200,0,c368d800,...) at device_attach+0x58
device_probe_and_attach(c37a5200) at device_probe_and_atach+0xe0
bus_generic_attach(c3659080,c3659080,,0,c37a5200,...) at
bus_generic_attach+0x16
ata_identify(c3659080) at ata_identify+0x1c8
ata_boot_attach(0xc0a11d80,0,c09212e7,47,...) at ata_boot_attach+0x3e
run_interrupt_drive_config_hooks(0,c1ec00,c1e000,0,c0451065,...) at
run_interrupt_drive_config_hooks+0x43
mi_startup() at mi_startup+0x96
begin() at begin+0x2c

This board has a Promise SATA raid controller and it is disabled in
the BIOS. I even tried disabling it through a jumper but it still
stops.

--
Joao Barros


Re: panic: integer divide fault on 6.1

2006-09-10 Thread Sam Leffler
Joao Barros wrote:
 On 9/9/06, Kris Kennaway [EMAIL PROTECTED] wrote:
 On Sat, Sep 09, 2006 at 09:02:35PM +0100, Joao Barros wrote:
  On 9/9/06, Max Laier [EMAIL PROTECTED] wrote:
  
  Can you try to get a dump, trace, or at least figure out which function
  the IP is referring to?
  
 
  Well, the problem only occurs when I boot from the disk and the
  installed kernel doesn't have debug support.
  Does 'set dumpdev=' work from the boot loader? I tried some
  combinations with no success.

 No.

  I can try and install a 6-STABLE snapshot if there's no way of getting
  the info needed.

 You can either try to install a new kernel with DDB support, or follow
 the instruction pointer method in the developers handbook chapter on
 kernel debugging.
 
 I copied a CURRENT kernel from a 200608 snapshot and the problem also
 occurs thus I'm adding [EMAIL PROTECTED]
 My current laptop doesn't have a serial port so I'm copying this by hand:
 
 Fatal trap 18: integer divide fault while in kernel mode
 cpuid = 0; apic id = 00
 instruction pointer     = 0x20:0xc08a1fb7
 stack pointer           = 0x28:0xc0c20b14
 frame pointer           = 0x28:0xc0c20b9c
 code segment            = base 0x0, limit 0xf, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = interrupt enabled, resume, IOPL = 0
 current process         = 0 (swapper)
 [thread pid 0 tid 0 ]
 Stopped at      __qdivrem+0x3b: divl    %ecx,%eax
 
 db> bt
 Tracing pid 0 tid 0 td 0xc0a0c818
 __qdivrem(37fdfa0,0,0,0,0,...) at __qdivrem+0x3b
 __udivdi3(37fdfa0,0,0,0) at __udivdi3+0x16
 ata_raid_promise_read_meta(c37a5000,c09f4a80,1,8086,c37a5000,...) at
 ata_raid_promise_read_meta+0x9b
 ata_raid_read_metadata(c37a5000,c37a5000,c0c20c70,c06b58a4,c37a5000,...)
 at ata_raid_metadata+0x2be
 ata_raid_subdisk_attach(c37a5000) at ata_raid_subdisk_attach+0x33
 device_attach(c37a5000,c37a5180,c37a5000,c36885c0,0,...) at
 device_attach+0x58
 device_probe_and_attach(c37a5200,c37a5200,c08ec9a9,0,c37a5180,...) at
 bus_generic_attach+0x16
 ad_attach(c37a5200) at ad_attach+0x2c8
 device_attach(c37a5200,c095f2d0,c37a5200,0,c368d800,...) at
 device_attach+0x58
 device_probe_and_attach(c37a5200) at device_probe_and_atach+0xe0
 bus_generic_attach(c3659080,c3659080,,0,c37a5200,...) at
 bus_generic_attach+0x16
 ata_identify(c3659080) at ata_identify+0x1c8
 ata_boot_attach(0xc0a11d80,0,c09212e7,47,...) at ata_boot_attach+0x3e
 run_interrupt_drive_config_hooks(0,c1ec00,c1e000,0,c0451065,...) at
 run_interrupt_drive_config_hooks+0x43
 mi_startup() at mi_startup+0x96
 begin() at begin+0x2c
 
 This board has a Promise SATA raid controller and it is disabled in
 the BIOS. I even tried disabling it through a jumper but it still
 stops.
 

In sys/dev/ata/ata-raid.h the PROMISE_LBA macro does an unchecked
calculation that apparently can divide by zero.  Soren would likely
understand the root cause of this problem but until then you can patch
the driver to work around the problem.

Sam
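
[Editor's sketch] To illustrate the kind of workaround described above, here is a geometry-based LBA calculation with the divisor checked first. This is not the PROMISE_LBA macro from ata-raid.h; the struct, the function name and the exact formula (round down to a cylinder boundary, then step back one track) are assumptions made for illustration. The backtrace's __udivdi3(37fdfa0,0,0,0) is consistent with a non-zero dividend being divided by a zero divisor, which is the case the check rejects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry record; not the driver's real softc. */
struct geom {
    uint64_t total_secs;
    uint32_t heads;
    uint32_t sectors;
};

/*
 * Round total_secs down to a whole number of cylinders, then step back
 * one track. Returns 0 and stores the result in *lba on success, or -1
 * when the geometry is unusable (zero heads or sectors would make the
 * divisor zero; a disk smaller than one cylinder would underflow).
 */
static int
metadata_lba(const struct geom *g, uint64_t *lba)
{
    uint64_t per_cyl;

    assert(g != NULL && lba != NULL);

    per_cyl = (uint64_t)g->heads * g->sectors;
    if (per_cyl == 0 || g->total_secs < per_cyl)
        return (-1);

    *lba = (g->total_secs / per_cyl) * per_cyl - g->sectors;
    return (0);
}
```

A caller that gets -1 back can skip the metadata probe for that disk rather than trapping during boot.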



Re: panic: integer divide fault on 6.1

2006-09-10 Thread Joao Barros

On 9/10/06, Sam Leffler [EMAIL PROTECTED] wrote:

Joao Barros wrote:
 On 9/9/06, Kris Kennaway [EMAIL PROTECTED] wrote:
 On Sat, Sep 09, 2006 at 09:02:35PM +0100, Joao Barros wrote:
  On 9/9/06, Max Laier [EMAIL PROTECTED] wrote:
  
  Can you try to get a dump, trace, or at least figure out which function
  the IP is referring to?
  
 
  Well, the problem only occurs when I boot from the disk and the
  installed kernel doesn't have debug support.
  Does 'set dumpdev=' work from the boot loader? I tried some
  combinations with no success.

 No.

  I can try and install a 6-STABLE snapshot if there's no way of getting
  the info needed.

 You can either try to install a new kernel with DDB support, or follow
 the instruction pointer method in the developers handbook chapter on
 kernel debugging.

 I copied a CURRENT kernel from a 200608 snapshot and the problem also
 occurs thus I'm adding [EMAIL PROTECTED]
 My current laptop doesn't have a serial port so I'm copying this by hand:

 Fatal trap 18: integer divide fault while in kernel mode
 cpuid = 0; apic id = 00
 instruction pointer     = 0x20:0xc08a1fb7
 stack pointer           = 0x28:0xc0c20b14
 frame pointer           = 0x28:0xc0c20b9c
 code segment            = base 0x0, limit 0xf, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = interrupt enabled, resume, IOPL = 0
 current process         = 0 (swapper)
 [thread pid 0 tid 0 ]
 Stopped at      __qdivrem+0x3b: divl    %ecx,%eax

 db> bt
 Tracing pid 0 tid 0 td 0xc0a0c818
 __qdivrem(37fdfa0,0,0,0,0,...) at __qdivrem+0x3b
 __udivdi3(37fdfa0,0,0,0) at __udivdi3+0x16
 ata_raid_promise_read_meta(c37a5000,c09f4a80,1,8086,c37a5000,...) at
 ata_raid_promise_read_meta+0x9b
 ata_raid_read_metadata(c37a5000,c37a5000,c0c20c70,c06b58a4,c37a5000,...)
 at ata_raid_metadata+0x2be
 ata_raid_subdisk_attach(c37a5000) at ata_raid_subdisk_attach+0x33
 device_attach(c37a5000,c37a5180,c37a5000,c36885c0,0,...) at
 device_attach+0x58
 device_probe_and_attach(c37a5200,c37a5200,c08ec9a9,0,c37a5180,...) at
 bus_generic_attach+0x16
 ad_attach(c37a5200) at ad_attach+0x2c8
 device_attach(c37a5200,c095f2d0,c37a5200,0,c368d800,...) at
 device_attach+0x58
 device_probe_and_attach(c37a5200) at device_probe_and_atach+0xe0
 bus_generic_attach(c3659080,c3659080,,0,c37a5200,...) at
 bus_generic_attach+0x16
 ata_identify(c3659080) at ata_identify+0x1c8
 ata_boot_attach(0xc0a11d80,0,c09212e7,47,...) at ata_boot_attach+0x3e
 run_interrupt_drive_config_hooks(0,c1ec00,c1e000,0,c0451065,...) at
 run_interrupt_drive_config_hooks+0x43
 mi_startup() at mi_startup+0x96
 begin() at begin+0x2c

 This board has a Promise SATA raid controller and it is disabled in
 the BIOS. I even tried disabling it through a jumper but it still
 stops.


In sys/dev/ata/ata-raid.h the PROMISE_LBA macro does an unchecked
calculation that apparently can divide by zero.  Soren would likely
understand the root cause of this problem but until then you can patch
the driver to work around the problem.

Sam




Thanks for narrowing it down!

--
Joao Barros


panic: integer divide fault on 6.1

2006-09-09 Thread Joao Barros

Hi,

I just installed 6.1 on my new (with old parts) machine and when
booting for the first time after installation I got this panic:

ad0: 19130MB SAMSUNG SV2001H QN200-03 at ata0-master UDMA100


Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xc0853017
stack pointer           = 0x28:0xc0c20b28
frame pointer           = 0x28:0xc0c20bb0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (swapper)
trap number             = 18
panic: integer divide fault
cpuid = 0
Uptime: 1s
Cannot dump. No dump device defined

The system is a single Xeon with HTT enabled and the HDD used is
somewhat old. I can try installing on another one. The swapper
process somehow points me to the HDD.
Of course any clues are most welcome.

--
Joao Barros


Re: panic: integer divide fault on 6.1

2006-09-09 Thread Joao Barros

On 9/9/06, Joao Barros [EMAIL PROTECTED] wrote:

Hi,

I just installed 6.1 on my new (with old parts) machine and when
booting for the first time after installation I got this panic:

ad0: 19130MB SAMSUNG SV2001H QN200-03 at ata0-master UDMA100


Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xc0853017
stack pointer           = 0x28:0xc0c20b28
frame pointer           = 0x28:0xc0c20bb0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (swapper)
trap number             = 18
panic: integer divide fault
cpuid = 0
Uptime: 1s
Cannot dump. No dump device defined

The system is a single Xeon with HTT enabled and the HDD used is
somewhat old. I can try  installing on another one. The swapper
process somehow points me to the HDD.
Of course any clues are most welcome.



I tried disabling the SATA controller, just leaving the PATA part
enabled with no success.
I even tried disabling HTT and installing on another IDE disk with a
UP kernel rather than an SMP one (I'm trying to eliminate variables).
The odd thing is that everything runs fine during the installation
booting from a CD.

Does anyone have an Asus NCCH-DL motherboard successfully booting from
an IDE disk?

--
Joao Barros


Re: panic: integer divide fault on 6.1

2006-09-09 Thread Max Laier
On Saturday 09 September 2006 13:56, Joao Barros wrote:
 Hi,

 I just installed 6.1 on my new (with old parts) machine and when
 booting for the first time after installation I got this panic:

 ad0: 19130MB SAMSUNG SV2001H QN200-03 at ata0-master UDMA100


 Fatal trap 18: integer divide fault while in kernel mode
 cpuid = 0; apic id = 00
 instruction pointer   = 0x20:0xc0853017
 stack pointer = 0x28:0xc0c20b28
 frame pointer = 0x28:0xc0c20bb0
 code segment  = base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, def32 1, gran 1
 processor eflags  = interrupt enabled, resume, IOPL = 0
 current process   = 0 (swapper)
 trap number   = 18
 panic: integer divide fault
 cpuid = 0
 Uptime: 1s
 Cannot dump. No dump device defined

Can you try to get a dump, trace, or at least figure out which function
the IP is referring to?

 The system is a single Xeon with HTT enabled and the HDD used is
 somewhat old. I can try  installing on another one. The swapper
 process somehow points me to the HDD.
 Of course any clues are most welcome.

-- 
/\  Best regards,  | [EMAIL PROTECTED]
\ /  Max Laier  | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | [EMAIL PROTECTED]
/ \  ASCII Ribbon Campaign  | Against HTML Mail and News




Re: panic: integer divide fault on 6.1

2006-09-09 Thread Joao Barros

On 9/9/06, Max Laier [EMAIL PROTECTED] wrote:


Can you try to get a dump, trace, or at least figure out which function
the IP is referring to?



Well, the problem only occurs when I boot from the disk and the
installed kernel doesn't have debug support.
Does 'set dumpdev=' work from the boot loader? I tried some
combinations with no success.
I can try and install a 6-STABLE snapshot if there's no way of getting
the info needed.



--
Joao Barros


Re: panic: integer divide fault on 6.1

2006-09-09 Thread Kris Kennaway
On Sat, Sep 09, 2006 at 09:02:35PM +0100, Joao Barros wrote:
 On 9/9/06, Max Laier [EMAIL PROTECTED] wrote:
 
 Can you try to get a dump, trace, or at least figure out which function
 the IP is referring to?
 
 
 Well, the problem only occurs when I boot from the disk and the
 installed kernel doesn't have debug support.
 Does 'set dumpdev=' work from the boot loader? I tried some
 combinations with no success.

No.

 I can try and install a 6-STABLE snapshot if there's no way of getting
 the info needed.

You can either try to install a new kernel with DDB support, or follow
the instruction pointer method in the developers handbook chapter on
kernel debugging.

Kris
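
[Editor's sketch] The instruction pointer method boils down to finding the last kernel symbol whose address is not greater than the faulting IP, typically by scanning an address-sorted symbol list (for example the output of nm -n on the kernel). The sketch below shows that lookup in C. The table is made up for illustration: only the __qdivrem entry is derived from this thread (IP 0xc08a1fb7 reported as __qdivrem+0x3b, hence 0xc08a1f7c); the neighbouring names and addresses are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ksym {
    uint32_t addr;
    const char *name;
};

/* Address-sorted symbol table; only __qdivrem comes from the thread. */
static const struct ksym symtab[] = {
    { 0xc08a1f00u, "hypothetical_prev_func" },
    { 0xc08a1f7cu, "__qdivrem" },
    { 0xc08a2200u, "hypothetical_next_func" },
};

/*
 * Return the last symbol whose address is <= ip and store the distance
 * from its start in *offset, or return NULL if ip lies below the table.
 */
static const struct ksym *
resolve_ip(uint32_t ip, uint32_t *offset)
{
    const struct ksym *best = NULL;
    size_t i;

    assert(offset != NULL);

    for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
        if (symtab[i].addr > ip)
            break;
        best = &symtab[i];
    }
    if (best != NULL)
        *offset = ip - best->addr;
    return (best);
}
```

Fed the IP from the second panic report, 0xc08a1fb7, the lookup lands on __qdivrem with offset 0x3b, matching the "Stopped at" line that DDB printed.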

