Re: RELENG_7 ata panic on atacontrol attach
Dmitry Morozovsky wrote:
> On Thu, 2 Apr 2009, Alexander Motin wrote:
>
> AM> Dmitry Morozovsky wrote:
> AM> > On Thu, 2 Apr 2009, Alexander Motin wrote:
> AM> >
> AM> > AM> > ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
> AM> > AM> > ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> AM> > AM> > ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> AM> > AM> > ad: ad14 already exists; skipping it
> AM> > AM> > ad: ad14 already exists; skipping it
> AM> > AM> >
> AM> > AM> > Fatal trap 12: page fault while in kernel mode
> AM> > AM>
> AM> > AM> It looks like the crash I have already fixed on CURRENT:
> AM> > AM> http://svn.freebsd.org/changeset/base/188464
> AM> >
> AM> > Seems to be. Would you please ask re@ for MFC approval?
> AM>
> AM> This is not actually a fix for the original problem, but it may help to avoid
> AM> a system crash. Can you confirm that it helps you? I haven't tested it on
> AM> STABLE yet; I am doing it now. If it helps, I will ask r...@.
>
> Well, partially. The machine survived a dozen detach-remove-insert-attach
> cycles (which it definitely could not before).

Merged.

> However, it still panicked on hot-remove-insert (could not dump):
>
> Some other hot reinserts finished successfully.

It is probably an ATA code problem. I have reworked that part in HEAD. Well, at least now it is significantly better than before, if one does not forget to detach the ata channel before reinserting the device.

--
Alexander Motin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
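The order of operations recommended in this message (detach the channel, swap the disk, attach the channel again) could be wrapped in a small script. This is only a sketch, not something posted in the thread: the `swap_disk` function and the `ATACONTROL` and `SWAP_PROMPT` variables are names made up for illustration, and by default the script only prints the `atacontrol` commands instead of running them.

```sh
#!/bin/sh
# Sketch only. ATACONTROL defaults to a dry run that prints each command;
# set ATACONTROL=atacontrol (as root, on the affected host) to really run them.
ATACONTROL="${ATACONTROL:-echo atacontrol}"

# swap_disk CHANNEL: detach the channel, wait for the swap, attach it again.
swap_disk() {
    channel="$1"
    [ -n "$channel" ] || { echo "usage: swap_disk ataN" >&2; return 2; }

    $ATACONTROL detach "$channel" || return 1

    # SWAP_PROMPT is a hypothetical switch: when set, wait for the operator
    # to finish the physical swap before the channel is attached again.
    if [ -n "${SWAP_PROMPT:-}" ]; then
        printf 'Swap the disk on %s, then press Enter: ' "$channel" >&2
        read -r _answer
    fi

    $ATACONTROL attach "$channel"
}
```

Calling `swap_disk ata7` in the default dry-run mode prints the detach and attach commands for the channel discussed in this thread, in that order.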
Re: RELENG_7 ata panic on atacontrol attach
On Fri, 3 Apr 2009, Alexander Motin wrote:

AM> > AM> This is not actually a fix for the original problem, but it may help to avoid
AM> > AM> a system crash. Can you confirm that it helps you? I haven't tested it on
AM> > AM> STABLE yet; I am doing it now. If it helps, I will ask r...@.
AM> >
AM> > Well, partially. The machine survived a dozen detach-remove-insert-attach
AM> > cycles (which it definitely could not before).
AM>
AM> Merged.

Thank you!

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: RELENG_7 ata panic on atacontrol attach
On Wed, 1 Apr 2009, Alexander Motin wrote:

AM> > AM> > atapci3: <nVidia nForce MCP55 SATA300 controller> port
AM> > AM> > 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f mem
AM> > AM> > 0xefcb3000-0xefcb3fff irq 23 at device 5.2 on pci0
AM> > AM> >
AM> > AM> > atacontrol detach ata7
AM> > AM> > - insert ATA disk (ad14)
AM> > AM> > atacontrol attach ata7
AM> > AM> >
AM> > AM> > panics with Fatal trap 12: page fault while in kernel mode
AM> > AM>
AM> > AM> Any kernel verbose messages before it?
AM> >
AM> > Nope. Just
AM> >
AM> > ata7: [ITHREAD]
AM> >
AM> > Fatal trap 12: page fault while in kernel mode
AM> >
AM> > and approx 15 seconds of wait between ata channel detection and the panic.
AM>
AM> Are you sure that you have verbose messages enabled during boot or via
AM> sysctl? It looks a bit too quiet.

Ah, you're right, on this server I turned off boot_verbose. Will recheck.

AM> RELENG_7 branch ATA maintenance is a bit difficult for me now due to big
AM> differences from HEAD. It makes me wish to sync them, as there is still
AM> a lot of time before 7.x EOL. Maybe I will do it after the 7.2 release
AM> process has finished. Now is probably not the best time to do it.

Fully understood. The only thing I wish to add is that RELENG_7 ata is much more picky about reinitializations (e.g., disk swaps) than RELENG_6.

Thank you!

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
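The question about verbose messages refers to two switches: the `boot_verbose` loader variable, which this message mentions having turned off, and a runtime sysctl. A minimal sketch, assuming the sysctl in question is `debug.bootverbose` (the thread does not name it); `SYSCTL` and `verbose_on` are illustrative names, and the default is a dry run that only prints the command.

```sh
#!/bin/sh
# Persistent setting (config fragment for /boot/loader.conf):
#   boot_verbose="YES"

# Runtime setting. Dry run by default; set SYSCTL=sysctl to apply it as root.
SYSCTL="${SYSCTL:-echo sysctl}"

verbose_on() {
    # Assumption: debug.bootverbose is the sysctl meant in the thread.
    $SYSCTL debug.bootverbose=1
}
```

With verbose output enabled before `atacontrol attach`, the channel reset and device probe lines appear on the console ahead of any panic message.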
Re: RELENG_7 ata panic on atacontrol attach
On Thu, 2 Apr 2009, Dmitry Morozovsky wrote:

DM> AM> > AM> > atapci3: <nVidia nForce MCP55 SATA300 controller> port
DM> AM> > AM> > 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f mem
DM> AM> > AM> > 0xefcb3000-0xefcb3fff irq 23 at device 5.2 on pci0
DM> AM> > AM> >
DM> AM> > AM> > atacontrol detach ata7
DM> AM> > AM> > - insert ATA disk (ad14)
DM> AM> > AM> > atacontrol attach ata7
DM> AM> > AM> >
DM> AM> > AM> > panics with Fatal trap 12: page fault while in kernel mode
DM> AM> > AM>
DM> AM> > AM> Any kernel verbose messages before it?

got it:

ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=50 ostat1=00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7: [MPSAFE]
ata7: [ITHREAD]
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad14: 715404MB <Seagate ST3750330AS SD04> at ata7-master SATA300
ad14: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad14
GEOM_LABEL: Label for provider ad14a is ufs/moose09.
GEOM_LABEL: Label ufs/moose09 removed.
[-- MARK -- Thu Apr 2 17:00:00 2009]
GEOM_LABEL: Label for provider ad14a is ufs/moose09.
GEOM_LABEL: Label ufs/moose09 removed.
ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=50 ostat1=00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7: ata7: [MPSAFE]
CONNECT requested
ata7: [ITHREAD]
ata7: CONNECTED
ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=50 ostat1=00
ata7: DISCONNECT requested
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: DISCONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: DISCONNECT requested
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: DISCONNECT requested
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: DISCONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad: ad14 already exists; skipping it
ad: ad14 already exists; skipping it

Fatal trap 12: page fault while in kernel mode

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: RELENG_7 ata panic on atacontrol attach
Dmitry Morozovsky wrote:
> On Thu, 2 Apr 2009, Dmitry Morozovsky wrote:
>
> DM> AM> > AM> > atapci3: <nVidia nForce MCP55 SATA300 controller> port
> DM> AM> > AM> > 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f mem
> DM> AM> > AM> > 0xefcb3000-0xefcb3fff irq 23 at device 5.2 on pci0
> DM> AM> > AM> >
> DM> AM> > AM> > atacontrol detach ata7
> DM> AM> > AM> > - insert ATA disk (ad14)
> DM> AM> > AM> > atacontrol attach ata7
> DM> AM> > AM> >
> DM> AM> > AM> > panics with Fatal trap 12: page fault while in kernel mode
> DM> AM> > AM>
> DM> AM> > AM> Any kernel verbose messages before it?
>
> got it:
>
> ata7: SATA connect time=0ms
> ata7: reset tp1 mask=01 ostat0=50 ostat1=00
> ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
> ata7: [MPSAFE]
> ata7: [ITHREAD]
> ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> ad14: 715404MB <Seagate ST3750330AS SD04> at ata7-master SATA300
> ad14: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth queue
> GEOM: new disk ad14
> GEOM_LABEL: Label for provider ad14a is ufs/moose09.
> GEOM_LABEL: Label ufs/moose09 removed.
> [-- MARK -- Thu Apr 2 17:00:00 2009]
> GEOM_LABEL: Label for provider ad14a is ufs/moose09.
> GEOM_LABEL: Label ufs/moose09 removed.
> ata7: SATA connect time=0ms
> ata7: reset tp1 mask=01 ostat0=50 ostat1=00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x01 lsb=0x00 msb=0x00
> ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
> ata7: ata7: [MPSAFE]
> CONNECT requested
> ata7: [ITHREAD]
> ata7: CONNECTED
> ata7: SATA connect time=0ms
> ata7: reset tp1 mask=01 ostat0=50 ostat1=00
> ata7: DISCONNECT requested
> ata7: CONNECT requested
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: DISCONNECT requested
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: CONNECT requested
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: DISCONNECT requested
> ata7: CONNECT requested
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: DISCONNECT requested
> ata7: CONNECT requested
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: DISCONNECT requested
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: CONNECT requested
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
> ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00

This looks like a race between the PATA reset sequence and SATA hot plug events. For AHCI it is handled by masking interrupts during reset. How it is expected to be done here, I am not sure. You can just disable hot plug by commenting out the respective part of the ata_sata_phy_check_events() function.

> ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
> ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> ad: ad14 already exists; skipping it
> ad: ad14 already exists; skipping it
>
> Fatal trap 12: page fault while in kernel mode

It looks like the crash I have already fixed on CURRENT:
http://svn.freebsd.org/changeset/base/188464

--
Alexander Motin
Re: RELENG_7 ata panic on atacontrol attach
On Thu, 2 Apr 2009, Alexander Motin wrote:

AM> > ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
AM> > ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
AM> > ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
AM> > ad: ad14 already exists; skipping it
AM> > ad: ad14 already exists; skipping it
AM> >
AM> > Fatal trap 12: page fault while in kernel mode
AM>
AM> It looks like the crash I have already fixed on CURRENT:
AM> http://svn.freebsd.org/changeset/base/188464

Seems to be. Would you please ask re@ for MFC approval?

Thanks!

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: RELENG_7 ata panic on atacontrol attach
Dmitry Morozovsky wrote:
> On Thu, 2 Apr 2009, Alexander Motin wrote:
>
> AM> > ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
> AM> > ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> AM> > ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> AM> > ad: ad14 already exists; skipping it
> AM> > ad: ad14 already exists; skipping it
> AM> >
> AM> > Fatal trap 12: page fault while in kernel mode
> AM>
> AM> It looks like the crash I have already fixed on CURRENT:
> AM> http://svn.freebsd.org/changeset/base/188464
>
> Seems to be. Would you please ask re@ for MFC approval?

This is not actually a fix for the original problem, but it may help to avoid a system crash. Can you confirm that it helps you? I haven't tested it on STABLE yet; I am doing it now. If it helps, I will ask r...@.

--
Alexander Motin
Re: RELENG_7 ata panic on atacontrol attach
On Thu, 2 Apr 2009, Alexander Motin wrote:

AM> Dmitry Morozovsky wrote:
AM> > On Thu, 2 Apr 2009, Alexander Motin wrote:
AM> >
AM> > AM> > ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
AM> > AM> > ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
AM> > AM> > ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
AM> > AM> > ad: ad14 already exists; skipping it
AM> > AM> > ad: ad14 already exists; skipping it
AM> > AM> >
AM> > AM> > Fatal trap 12: page fault while in kernel mode
AM> > AM>
AM> > AM> It looks like the crash I have already fixed on CURRENT:
AM> > AM> http://svn.freebsd.org/changeset/base/188464
AM> >
AM> > Seems to be. Would you please ask re@ for MFC approval?
AM>
AM> This is not actually a fix for the original problem, but it may help to avoid
AM> a system crash. Can you confirm that it helps you? I haven't tested it on
AM> STABLE yet; I am doing it now. If it helps, I will ask r...@.

Well, partially. The machine survived a dozen detach-remove-insert-attach cycles (which it definitely could not before).

However, it still panicked on hot-remove-insert (could not dump):

ata7: DISCONNECT requested
ata7: DISCONNECTED
GEOM_LABEL: Label ufs/moose09 removed.
ata7: CONNECT requested
ata7: CONNECTED
ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=80 ostat1=00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
[a bunch of it]
ata7: DISCONNECT requested
ata7: ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: DISCONNECT requested
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: DISCONNECT requested
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: DISCONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: DISCONNECT requested
ata7: CONNECT requested
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x80 err=0x00 lsb=0x00 msb=0x00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad14: 715404MB <Seagate ST3750330AS SD04> at ata7-master SATA300
ad14: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad14
ata7: DISCONNECTED
ata7: CONNECTED
ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=50 ostat1=00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad14: 715404MB <Seagate ST3750330AS SD04> at ata7-master SATA300
ad14: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad14
ata7: DISCONNECTED
ata7: CONNECTED
ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=50 ostat1=00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad14: 715404MB <Seagate ST3750330AS SD04> at ata7-master SATA300
ad14: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad14
ata7: DISCONNECTED
ata7: CONNECTED
ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=50 ostat1=00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad14: 715404MB <Seagate ST3750330AS SD04> at ata7-master SATA300
ad14: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad14
ata7: DISCONNECTED
ata7: CONNECTED
ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=50 ostat1=00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad14: 715404MB <Seagate ST3750330AS SD04> at ata7-master SATA300
ad14: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad14
ata7: DISCONNECTED
ata7: CONNECTED
ata7: SATA connect time=0ms
ata7: reset tp1 mask=01 ostat0=50 ostat1=00
ata7: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata7: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata7-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad14: 715404MB <Seagate ST3750330AS SD04> at ata7-master SATA300
ad14: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad14
GEOM_LABEL: Label for provider ad14a is ufs/moose09.

Fatal
Re: RELENG_7 ata panic on atacontrol attach
On Thu, 2 Apr 2009, Dmitry Morozovsky wrote:

DM> Some other hot reinserts finished successfully.

Well, a couple of minutes after the hot-reinsert the machine panicked again (but now it seems related to ZFS):

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x28
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0836964
stack pointer           = 0x28:0xfa466bd4
frame pointer           = 0x28:0xfa466be4
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 35 (arc_reclaim_thread)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 11m8s
Physical memory: 2039 MB
Dumping 300 MB: 285 269 253 237 221 205 189 173 157 141 125 109 93 77 61 45 29 13

(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc0533227 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc0533535 in panic (fmt=Variable fmt is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc06cfce3 in trap_fatal (frame=0xfa466b94, eva=40) at /usr/src/sys/i386/i386/trap.c:939
#4  0xc06cff40 in trap_pfault (frame=0xfa466b94, usermode=0, eva=40) at /usr/src/sys/i386/i386/trap.c:852
#5  0xc06d08e6 in trap (frame=0xfa466b94) at /usr/src/sys/i386/i386/trap.c:530
#6  0xc06b5b5b in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#7  0xc0836964 in avl_destroy_nodes (tree=0xc9a2d540, cookie=0xfa466bf4) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common/avl/avl.c:933
#8  0xc088b873 in mze_destroy (zap=Variable zap is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:175
#9  0xc088bbf3 in zap_evict (db=0xcca1cdac, vzap=0xc9a2d500) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:472
#10 0xc084c818 in dbuf_evict_user (db=0xcca1cdac) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:224
#11 0xc084d655 in dbuf_clear (db=0xcca1cdac) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1281
#12 0xc084d762 in dbuf_evict (db=0xcca1cdac) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:237
#13 0xc084db76 in dbuf_do_evict (private=0xc90d5cbc) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1458
#14 0xc0847517 in arc_do_user_evicts () at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1321
#15 0xc084a584 in arc_reclaim_thread (dummy=0x0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1551
#16 0xc050d877 in fork_exit (callout=0xc084a380 <arc_reclaim_thread>, arg=0x0, frame=0xfa466d38) at /usr/src/sys/kern/kern_fork.c:810
#17 0xc06b5bd0 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: RELENG_7 ata panic on atacontrol attach
Dmitry Morozovsky wrote:
> On Wed, 1 Apr 2009, Alexander Motin wrote:
>
> AM> Dmitry Morozovsky wrote:
> AM> > Hi there colleagues,
> AM> >
> AM> > atapci3: <nVidia nForce MCP55 SATA300 controller> port
> AM> > 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f mem
> AM> > 0xefcb3000-0xefcb3fff irq 23 at device 5.2 on pci0
> AM> >
> AM> > atacontrol detach ata7
> AM> > - insert ATA disk (ad14)
> AM> > atacontrol attach ata7
> AM> >
> AM> > panics with Fatal trap 12: page fault while in kernel mode
> AM>
> AM> Any kernel verbose messages before it?
>
> Nope. Just
>
> ata7: [ITHREAD]
>
> Fatal trap 12: page fault while in kernel mode
>
> and approx 15 seconds of wait between ata channel detection and the panic.

Are you sure that you have verbose messages enabled during boot or via sysctl? It looks a bit too quiet.

RELENG_7 branch ATA maintenance is a bit difficult for me now due to big differences from HEAD. It makes me wish to sync them, as there is still a lot of time before 7.x EOL. Maybe I will do it after the 7.2 release process has finished. Now is probably not the best time to do it.

--
Alexander Motin
RELENG_7 ata panic on atacontrol attach
Hi there colleagues,

atapci3: <nVidia nForce MCP55 SATA300 controller> port
0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f mem
0xefcb3000-0xefcb3fff irq 23 at device 5.2 on pci0

atacontrol detach ata7
- insert ATA disk (ad14)
atacontrol attach ata7

panics with Fatal trap 12: page fault while in kernel mode

(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc0533227 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc0533535 in panic (fmt=Variable fmt is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc06cfca3 in trap_fatal (frame=0xfcb0aa7c, eva=40) at /usr/src/sys/i386/i386/trap.c:939
#4  0xc06cff00 in trap_pfault (frame=0xfcb0aa7c, usermode=0, eva=40) at /usr/src/sys/i386/i386/trap.c:852
#5  0xc06d08a6 in trap (frame=0xfcb0aa7c) at /usr/src/sys/i386/i386/trap.c:530
#6  0xc06b5b1b in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#7  0xc055b69c in device_attach (dev=0xcc58e480) at /usr/src/sys/kern/subr_bus.c:279
#8  0xc055c96d in device_probe_and_attach (dev=0xcc58e480) at /usr/src/sys/kern/subr_bus.c:2366
#9  0xc055ca59 in bus_generic_attach (dev=0xc5167100) at /usr/src/sys/kern/subr_bus.c:2905
#10 0xc04796f0 in ata_identify (dev=0xc5167100) at /usr/src/sys/dev/ata/ata-all.c:723
#11 0xc0479fe4 in ata_attach (dev=0xc5167100) at /usr/src/sys/dev/ata/ata-all.c:150
#12 0xc047a93a in ata_ioctl (dev=0xc510e200, cmd=2147770627, data=0xd2f6cb80 \a, flag=3, td=0xcd32c690) at /usr/src/sys/dev/ata/ata-all.c:387
#13 0xc04f6497 in giant_ioctl (dev=0xc510e200, cmd=2147770627, data=0xd2f6cb80 \a, fflag=3, td=0xcd32c690) at /usr/src/sys/kern/kern_conf.c:398
#14 0xc04d5f87 in devfs_ioctl_f (fp=0xcce31e8c, com=2147770627, data=0xd2f6cb80, cred=0xc6897700, td=0xcd32c690) at /usr/src/sys/fs/devfs/devfs_vnops.c:602
#15 0xc056d4e5 in kern_ioctl (td=0xcd32c690, fd=3, com=2147770627, data=0xd2f6cb80 \a) at file.h:269
#16 0xc056d63f in ioctl (td=0xcd32c690, uap=0xfcb0acfc) at /usr/src/sys/kern/sys_generic.c:570
#17 0xc06d0248 in syscall (frame=0xfcb0ad38) at /usr/src/sys/i386/i386/trap.c:1090
#18 0xc06b5b80 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#19 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) up 7
#7  0xc055b69c in device_attach (dev=0xcc58e480) at /usr/src/sys/kern/subr_bus.c:279
279             if (dev->sysctl_tree != NULL)

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: RELENG_7 ata panic on atacontrol attach
Dmitry Morozovsky wrote:
> Hi there colleagues,
>
> atapci3: <nVidia nForce MCP55 SATA300 controller> port
> 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f mem
> 0xefcb3000-0xefcb3fff irq 23 at device 5.2 on pci0
>
> atacontrol detach ata7
> - insert ATA disk (ad14)
> atacontrol attach ata7
>
> panics with Fatal trap 12: page fault while in kernel mode

Any kernel verbose messages before it?

--
Alexander Motin
Re: RELENG_7 ata panic on atacontrol attach
On Wed, 1 Apr 2009, Alexander Motin wrote:

AM> Dmitry Morozovsky wrote:
AM> > Hi there colleagues,
AM> >
AM> > atapci3: <nVidia nForce MCP55 SATA300 controller> port
AM> > 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f mem
AM> > 0xefcb3000-0xefcb3fff irq 23 at device 5.2 on pci0
AM> >
AM> > atacontrol detach ata7
AM> > - insert ATA disk (ad14)
AM> > atacontrol attach ata7
AM> >
AM> > panics with Fatal trap 12: page fault while in kernel mode
AM>
AM> Any kernel verbose messages before it?

Nope. Just

ata7: [ITHREAD]

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x28
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc055b69c
stack pointer           = 0x28:0xfcb0aabc
frame pointer           = 0x28:0xfcb0aaf8
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 3725 (atacontrol)
trap number             = 12
panic: page fault

and approx 15 seconds of wait between ata channel detection and the panic.

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: RELENG_7 ata panic on atacontrol attach
On Wed, 1 Apr 2009, Dmitry Morozovsky wrote:

DM> AM> > atapci3: <nVidia nForce MCP55 SATA300 controller> port
DM> AM> > 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f mem
DM> AM> > 0xefcb3000-0xefcb3fff irq 23 at device 5.2 on pci0
DM> AM> >
DM> AM> > atacontrol detach ata7
DM> AM> > - insert ATA disk (ad14)
DM> AM> > atacontrol attach ata7
DM> AM> >
DM> AM> > panics with Fatal trap 12: page fault while in kernel mode
DM> AM>
DM> AM> Any kernel verbose messages before it?
DM>
DM> Nope. Just
DM>
DM> ata7: [ITHREAD]
DM>
DM> Fatal trap 12: page fault while in kernel mode
DM> cpuid = 0; apic id = 00
DM> fault virtual address   = 0x28
DM> fault code              = supervisor read, page not present
DM> instruction pointer     = 0x20:0xc055b69c
DM> stack pointer           = 0x28:0xfcb0aabc
DM> frame pointer           = 0x28:0xfcb0aaf8
DM> code segment            = base 0x0, limit 0xf, type 0x1b
DM>                         = DPL 0, pres 1, def32 1, gran 1
DM> processor eflags        = interrupt enabled, resume, IOPL = 0
DM> current process         = 3725 (atacontrol)
DM> trap number             = 12
DM> panic: page fault
DM>
DM> and approx 15 seconds of wait between ata channel detection and the panic.

What I possibly missed is that the panic is not guaranteed and seems to depend on the parameters of the inserted disk: e.g. this particular machine usually panics with a WD320, and does not panic with a Seagate 7200.11/750G.

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: 6.1 ata panic if dma enabled
Rong-En Fan [EMAIL PROTECTED] wrote:
> The ata controller and ad0 are:
>
> atapci0: <VIA 82C686B UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0
> atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xffa0
> ata0: <ATA channel 0> on atapci0
> atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0x1f0
> atapci0: Reserved 0x1 bytes for rid 0x14 type 4 at 0x3f6
> ata0: reset tp1 mask=03 ostat0=50 ostat1=00
> ata0: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> ata0: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
> ata0: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
> ata0: [MPSAFE]
> ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
> ad0: setting PIO4 on 82C686B chip
> ad0: setting UDMA100 on 82C686B chip
> ad0: 38166MB <Seagate ST340016A 3.10> at ata0-master UDMA100
> ad0: 78165360 sectors [19158C/16H/255S] 16 sectors/interrupt 1 depth queue
>
> I'm pretty sure this HD is capable of UDMA100 (according to the
> specification on the Seagate website).
>
> The console messages are:
>
> /dev/ad0s1e: clean, 823031 free (447 frags, 102823 blocks, 0.0% fragmentation)
> ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
> ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
> ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=131647
> ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=131647
> ad0: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=131647
> g_vfs_done():ad0s1a[WRITE(offset=67371008, length=16384)]error = 5

I had a similar problem the last time I connected my HDD with a DMA33 cable, while trying to install 6.0-PRERELEASE:

ad0: 190782MB <Seagate ST3200826A 3.03> at ata0-master UDMA100
Trying to mount root from ufs:/dev/ad0s1a
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA = 12623
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA = 12623
ad0: FAILURE - READ_DMA status = 51<READY, DSC, ERROR> error = 84<ICRC, ABORTED> LBA = 12623
g_vfs_done():ad0s1a[READ(offset = 6430720, length = 4096)] error = 5
vnode_pager_getpages: I/O read error
vm_fault: pager read_error, pid 1 (swapper)
init died (signal 6, exit 0)
panic: Going nowhere without my init !
Uptime: 2s
Cannot dump, No dump device defined

I also tried disabling DMA to make that disk work, but later I found the problem was caused by the DMA33 cable. The disk worked fine after I replaced the cable.

Regards,
loader
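The ICRC warnings quoted in this message are logged once per retried request, so a saved console log can be summarised per device before deciding whether the cable is the suspect. A small sketch, not something posted in the thread: `icrc_summary` is an illustrative name, and it assumes the log lines have the `adN: WARNING - ... UDMA ICRC error ...` shape shown above.

```sh
#!/bin/sh
# icrc_summary: read kernel messages on stdin and print "<device> <count>"
# for every device that logged at least one "UDMA ICRC error" line.
icrc_summary() {
    awk '/UDMA ICRC error/ {
             dev = $1
             sub(/:$/, "", dev)      # "ad0:" -> "ad0"
             count[dev]++
         }
         END { for (dev in count) print dev, count[dev] }' | sort
}
```

On a live system the input could come from `dmesg | icrc_summary`; the final FAILURE line is not counted because it does not contain the "UDMA ICRC error" text.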
6.1 ata panic if dma enabled
Hi,

Recently, we upgraded a 4.11 box to 6.1-BETA2 by reinstalling and running newfs on everything. After that, we found that if hw.ata.ata_dma=1 at boot, the box panics as soon as it starts fsck -p. It happens only if ad0 is set to UDMA66 or above. My current workaround is to set hw.ata.ata_dma=0 in loader.conf and manually turn DMA on, with ad0 at UDMA33 and the rest (ad4~ad7) at UDMA100.

In the days of 4.x, there was something wrong with DMA on ad0 as well, but it would fall back to PIO4 automatically without problems.

We have tried 1) changing the cable, 2) moving from the primary ATA controller to the secondary one, and 3) upgrading to RELENG_6 as of March 11, but none of these helped. There is no option in the BIOS to turn off DMA for the onboard ATA controller.

The ATA controller and ad0 are:

atapci0: VIA 82C686B UDMA100 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xffa0
ata0: ATA channel 0 on atapci0
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0x1f0
atapci0: Reserved 0x1 bytes for rid 0x14 type 4 at 0x3f6
ata0: reset tp1 mask=03 ostat0=50 ostat1=00
ata0: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata0: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
ata0: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata0: [MPSAFE]
ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
ad0: setting PIO4 on 82C686B chip
ad0: setting UDMA100 on 82C686B chip
ad0: 38166MB Seagate ST340016A 3.10 at ata0-master UDMA100
ad0: 78165360 sectors [19158C/16H/255S] 16 sectors/interrupt 1 depth queue

I'm pretty sure this HD is capable of UDMA100 (according to the specification on the Seagate website).
The console messages are:

/dev/ad0s1e: clean, 823031 free (447 frags, 102823 blocks, 0.0% fragmentation)
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=131647
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=131647
ad0: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=131647
g_vfs_done():ad0s1a[WRITE(offset=67371008, length=16384)]error = 5
[...]
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x24
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc04eef95
stack pointer = 0x28:0xe4c714f0
frame pointer = 0x28:0xe4c71500
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 127 (cp)
[thread pid 127 tid 100028 ]
Stopped at turnstile_broadcast+0x9: movl 0x24(%eax),%eax
db> bt
Tracing pid 127 tid 100028 td 0xc474e000
turnstile_broadcast(0) at turnstile_broadcast+0x9
_mtx_unlock_sleep(c068aa60,0,0,0) at _mtx_unlock_sleep+0x6c
softdep_sync_metadata(c4958880) at softdep_sync_metadata+0x7d4
ffs_syncvnode(c4958880,1) at ffs_syncvnode+0x43d
ffs_truncate(c4958880,200,0,880,c4695d00,c474e000) at ffs_truncate+0x77e
ufs_direnter(c4958880,c49de880,e4c7192c,e4c71bd0,0) at ufs_direnter+0x85d
ufs_makeinode(81a4,c4958880,e4c71bbc,e4c71bd0) at ufs_makeinode+0x30f
ufs_create(e4c71a84) at ufs_create+0x37
VOP_CREATE_APV(c0670ec0,e4c71a84) at VOP_CREATE_APV+0x3c
VOP_CREATE(c4958880,e4c71bbc,e4c71bd0,e4c71ae0) at VOP_CREATE+0x34
vn_open_cred(e4c71ba8,e4c71cc4,1a4,c4695d00,4) at vn_open_cred+0x20c
vn_open(e4c71ba8,e4c71cc4,1a4,4) at vn_open+0x29
kern_open(c474e000,804c1c8,0,602,21b6) at kern_open+0xd4
open(c474e000,e4c71cf0) at open+0x22
syscall(3b,3b,3b,8060100,bfbfeec4) at syscall+0x337
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (5, FreeBSD ELF32, open), eip = 0x28137ccf, esp = 0xbfbfec7c, ebp = 0xbfbfecc8 ---
db> call doadump
Cannot dump. No dump device defined.

The full dmesg (with boot_verbose) is available at http://www.rafan.org/FreeBSD/ata/20060316-dmesg+db.txt

I did an alltrace in ddb: http://www.rafan.org/FreeBSD/ata/20060311-dball.txt

Regards,
Rong-En Fan
Re: 6.1 ata panic if dma enabled
Rong-En Fan wrote:

Hi, Recently, we upgraded a 4.11 box to 6.1-BETA2 by reinstalling and running newfs on everything. After that, we found that if hw.ata.ata_dma=1 at boot, the box panics as soon as it starts fsck -p. It happens only if ad0 is set to UDMA66 or above. My current workaround is to set hw.ata.ata_dma=0 in loader.conf and manually turn DMA on, with ad0 at UDMA33 and the rest (ad4~ad7) at UDMA100. In the days of 4.x, there was something wrong with DMA on ad0 as well, but it would fall back to PIO4 automatically without problems. We have tried 1) changing the cable, 2) moving from the primary ATA controller to the secondary one, and 3) upgrading to RELENG_6 as of March 11, but none of these helped. There is no option in the BIOS to turn off DMA for the onboard ATA controller.

Please review the release notes from the 6.1-BETA2 announcement. Fixes went into 6.1 shortly after BETA2 was released, and are in BETA3 and BETA4.

Scott
Re: 6.1 ata panic if dma enabled
On 3/16/06, Scott Long [EMAIL PROTECTED] wrote:

Rong-En Fan wrote:

Hi, Recently, we upgraded a 4.11 box to 6.1-BETA2 by reinstalling and running newfs on everything. After that, we found that if hw.ata.ata_dma=1 at boot, the box panics as soon as it starts fsck -p. It happens only if ad0 is set to UDMA66 or above. My current workaround is to set hw.ata.ata_dma=0 in loader.conf and manually turn DMA on, with ad0 at UDMA33 and the rest (ad4~ad7) at UDMA100. In the days of 4.x, there was something wrong with DMA on ad0 as well, but it would fall back to PIO4 automatically without problems. We have tried 1) changing the cable, 2) moving from the primary ATA controller to the secondary one, and 3) upgrading to RELENG_6 as of March 11, but none of these helped. There is no option in the BIOS to turn off DMA for the onboard ATA controller.

Please review the release notes from the 6.1-BETA2 announcement. Fixes went into 6.1 shortly after BETA2 was released, and are in BETA3 and BETA4.

I upgraded to today's RELENG_6 and the result is the same. I'm not quite sure whether this is a hardware problem. However, why can't ata fall back to PIO4 on a DMA write error, just like 4.x does?
ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
ad0: setting PIO4 on 82C686B chip
ad0: setting UDMA100 on 82C686B chip
ad0: 38166MB Seagate ST340016A 3.10 at ata0-master UDMA100
ad0: 78165360 sectors [19158C/16H/255S] 16 sectors/interrupt 1 depth queue
/dev/ad0s1d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1d: clean, 624587 free (28411 frags, 74522 blocks, 1.9% fragmentation)
/dev/ad0s1e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1e: clean, 826458 free (466 frags, 103249 blocks, 0.0% fragmentation)
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=191
g_vfs_done():ad0s1a[WRITE(offset=65536, length=2048)]error = 5
mount: /dev/ad0s1a: Input/output error
Mounting root filesystem rw failed, startup aborted
Boot interrupted
Enter root password, or ^D to go multi-user

Then I just continued, and finally it panicked:

kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x24
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc045
stack pointer = 0x28:0xe4cfb4f0
frame pointer = 0x28:0xe4cfb500
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 168 (cp)
[thread pid 168 tid 100044 ]
Stopped at turnstile_broadcast+0x9: movl 0x24(%eax),%eax
db> bt
Tracing pid 168 tid 100044 td 0xc48de180
turnstile_broadcast(0) at turnstile_broadcast+0x9
_mtx_unlock_sleep(c068aca0,0,0,0) at _mtx_unlock_sleep+0x6c
softdep_sync_metadata(c495d660) at softdep_sync_metadata+0x7d4
ffs_syncvnode(c495d660,1) at ffs_syncvnode+0x43d
ffs_truncate(c495d660,200,0,880,c4695d00,c48de180) at ffs_truncate+0x77e
ufs_direnter(c495d660,c49e1880,e4cfb92c,e4cfbbd0,0) at ufs_direnter+0x85d
ufs_makeinode(81a4,c495d660,e4cfbbbc,e4cfbbd0) at ufs_makeinode+0x30f
ufs_create(e4cfba84) at ufs_create+0x37
VOP_CREATE_APV(c0671100,e4cfba84) at VOP_CREATE_APV+0x3c
VOP_CREATE(c495d660,e4cfbbbc,e4cfbbd0,e4cfbae0) at VOP_CREATE+0x34
vn_open_cred(e4cfbba8,e4cfbcc4,1a4,c4695d00,4) at vn_open_cred+0x20c
vn_open(e4cfbba8,e4cfbcc4,1a4,4) at vn_open+0x29
kern_open(c48de180,804c1c8,0,602,21b6) at kern_open+0xd4
open(c48de180,e4cfbcf0) at open+0x22
syscall(3b,3b,3b,8060100,bfbfeec4) at syscall+0x337
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (5, FreeBSD ELF32, open), eip = 0x28137ccf, esp = 0xbfbfec7c, ebp = 0xbfbfecc8 ---

Regards,
Rong-En Fan
ata panic
Hi,

I was trying out a recent RELENG_6 on a VIA mini-ITX board with a built-in CF reader. If a CF card is present, the box panics at boot (tried with 2 separate boards and different CFs just in case it was hardware). This is with a RELENG_6 from March 7th; with the flash in, I get a panic at bootup.

ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding enabled, default to accept, logging limited to 9100 packets/entry by default
lo0: bpf attached
ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
ad0: setting PIO4 on 8237 chip
ad0: setting UDMA100 on 8237 chip
ad0: 38166MB Seagate ST340014A 8.54 at ata0-master UDMA100
ad0: 78165360 sectors [77545C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad0
ad0: VIA check1 failed
ad0: Adaptec check1 failed
ad0: LSI (v3) check1 failed
ad0: LSI (v2) check1 failed
ad0: FreeBSD check1 failed
ata1-master: pio=PIO4 wdma=UNSUPPORTED udma=UNSUPPORTED cable=40 wire
ad2: setting PIO4 on 8237 chip
ad2: 244MB SanDisk SDCFB-256 Rev 0.00 at ata1-master PIO4

Fatal trap 18: integer divide fault while in kernel mode
instruction pointer = 0x20:0xc0699c37
stack pointer = 0x28:0xc0c20b78
frame pointer = 0x28:0xc0c20c14
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (swapper)
trap number = 18
panic: integer divide fault
KDB: stack backtrace:
panic(c06c8ad5,c06f119b,0,0,f) at 0xc0511f23 = panic+0x103
trap_fatal(0,0,0,0,c07333c0) at 0xc06910a5 = trap_fatal+0x225
trap(8,28,28,1,0) at 0xc06915d7 = trap+0x20f
calltrap() at 0xc068081a = calltrap+0x5
--- trap 0x12, eip = 0xc0699c37, esp = 0xc0c20b78, ebp = 0xc0c20c14 ---
__qdivrem(7a2b0,0,0,0,0) at 0xc0699c37 = __qdivrem+0x3b
__udivdi3(7a2b0,0,0,0) at 0xc069a0de = __udivdi3+0x16
ad_attach(c3343c80,c3343c80,c32bd800,0,c0c20d24) at 0xc04684af = ad_attach+0x44f
device_attach(c3343c80) at 0xc0526c8e = device_attach+0x1be
bus_generic_attach(c3217000,c3217000,,2,c3343c80) at 0xc05278a6 = bus_generic_attach+0x12
ata_identify(c3217000,c3147bd0,c0c20d6c,c0523d00,0) at 0xc045906d = ata_identify+0xcd
ata_boot_attach(0,c07206b0,c0c20d88,c04e34c7,0) at 0xc04591e5 = ata_boot_attach+0x4d
run_interrupt_driven_config_hooks(0,c31459e8,c1ec00,c1e000,c25000) at 0xc0523d00 = run_interrupt_driven_config_hooks+0x1c
mi_startup() at 0xc04e34c7 = mi_startup+0xb3
begin() at 0xc0433e55 = begin+0x2c
Uptime: 3s
Cannot dump. No dump device defined.
Automatic reboot in 15 seconds - press a key on the console to abort

Below is a dmesg with the CF slot empty. It's been a while since I tried, but I am pretty sure it used to work.

Mar 6 17:03:54 ps9996 kernel: Copyright (c) 1992-2006 The FreeBSD Project.
Mar 6 17:03:54 ps9996 kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Mar 6 17:03:54 ps9996 kernel: The Regents of the University of California. All rights reserved.
Mar 6 17:03:54 ps9996 kernel: FreeBSD 6.1-PRERELEASE #0: Sat Mar 4 07:20:49 EST 2006
Mar 6 17:03:54 ps9996 kernel: [EMAIL PROTECTED]:/usr/obj/usr/src/sys/gas
Mar 6 17:03:54 ps9996 kernel: Timecounter i8254 frequency 1193182 Hz quality 0
Mar 6 17:03:54 ps9996 kernel: CPU: VIA C3 Nehemiah+RNG+ACE (796.77-MHz 686-class CPU)
Mar 6 17:03:54 ps9996 kernel: Origin = CentaurHauls Id = 0x698 Stepping = 8
Mar 6 17:03:54 ps9996 kernel: Features=0x381b03f<FPU,VME,DE,PSE,TSC,MSR,MTRR,PGE,CMOV,PAT,MMX,FXSR,SSE>
Mar 6 17:03:54 ps9996 kernel: real memory = 517865472 (493 MB)
Mar 6 17:03:54 ps9996 kernel: avail memory = 497393664 (474 MB)
Mar 6 17:03:54 ps9996 kernel: npx0: [FAST]
Mar 6 17:03:54 ps9996 kernel: npx0: math processor on motherboard
Mar 6 17:03:54 ps9996 kernel: npx0: INT 16 interface
Mar 6 17:03:54 ps9996 kernel: acpi0: CM400 AWRDACPI on motherboard
Mar 6 17:03:54 ps9996 kernel: acpi0: Power Button (fixed)
Mar 6 17:03:54 ps9996 kernel: Timecounter ACPI-fast frequency 3579545 Hz quality 1000
Mar 6 17:03:54 ps9996 kernel: acpi_timer0: 24-bit timer at 3.579545MHz port 0x408-0x40b on acpi0
Mar 6 17:03:54 ps9996 kernel: cpu0: ACPI CPU on acpi0
Mar 6 17:03:54 ps9996 kernel: acpi_button0: Power Button on acpi0
Mar 6 17:03:54 ps9996 kernel: acpi_button1: Sleep Button on acpi0
Mar 6 17:03:54 ps9996 kernel: pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
Mar 6 17:03:54 ps9996 kernel: pci0: ACPI PCI bus on pcib0
Mar 6 17:03:54 ps9996 kernel: agp0: VIA PM800/PN800/PM880/PN880 host to PCI bridge mem 0xf800-0xf9ff at device 0.0 on pci 0
Mar 6 17:03:54 ps9996 kernel: pcib1: PCI-PCI bridge at device 1.0 on pci0
Mar 6 17:03:54 ps9996 kernel: pci1: PCI bus on pcib1
Mar 6 17:03:54 ps9996 kernel: pci1: display, VGA at device 0.0 (no driver attached)
Mar 6 17:03:54 ps9996 kernel: puc0: US Robotics (3Com) 3CP5609 PCI 16550 Modem port 0xe400-0xe407 irq 12 at device 8.0 on pci0
Mar 6 17:03:54
Re: ata panic
At 11:38 PM 13/03/2006, Mike Tancsa wrote:

Hi, I was trying out a recent RELENG_6 on a VIA mini ITX board with built in CF reader. If a CF is present, the box panics at boot (tried with 2 separate boards and different CFs just in case it was hardware). This is with a RELENG_6 from March 7th with the flash in I get a panic at bootup.

Just updated the source to the latest RELENG_6 in case the changes fixed it, but no dice:

GEOM: new disk ad0
ad0: VIA check1 failed
ad0: Adaptec check1 failed
ad0: LSI (v3) check1 failed
ad0: LSI (v2) check1 failed
ad0: FreeBSD check1 failed
ata1-master: pio=PIO4 wdma=UNSUPPORTED udma=UNSUPPORTED cable=40 wire
ad2: setting PIO4 on 8237 chip
ad2: 244MB SanDisk SDCFB-256 Rev 0.00 at ata1-master PIO4

Fatal trap 18: integer divide fault while in kernel mode
instruction pointer = 0x20:0xc069c01b
stack pointer = 0x28:0xc0c20b78
frame pointer = 0x28:0xc0c20c14
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (swapper)
trap number = 18
panic: integer divide fault
KDB: stack backtrace:
panic(c06caf91,c06f370c,0,0,f) at 0xc0512343 = panic+0x103
trap_fatal(0,0,0,0,c07359a0) at 0xc0693485 = trap_fatal+0x225
trap(8,28,28,1,0) at 0xc06939b7 = trap+0x20f
calltrap() at 0xc0682b6a = calltrap+0x5
--- trap 0x12, eip = 0xc069c01b, esp = 0xc0c20b78, ebp = 0xc0c20c14 ---
__qdivrem(7a2b0,0,0,0,0) at 0xc069c01b = __qdivrem+0x3b
__udivdi3(7a2b0,0,0,0) at 0xc069c4c2 = __udivdi3+0x16
ad_attach(c3199400,c3199400,c32a5000,0,c0c20d24) at 0xc046870b = ad_attach+0x44f
device_attach(c3199400) at 0xc05270ae = device_attach+0x1be
bus_generic_attach(c3228380,c3228380,,2,c3199400) at 0xc0527cc6 = bus_generic_attach+0x12
ata_identify(c3228380,0,c0c20d6c,c0524120,0) at 0xc04592c9 = ata_identify+0xcd
ata_boot_attach(0,c0722c90,c0c20d88,c04e37f7,0) at 0xc0459441 = ata_boot_attach+0x4d
run_interrupt_driven_config_hooks(0,c31459f0,c1ec00,c1e000,c25000) at 0xc0524120 = run_interrupt_driven_config_hooks+0x1c
mi_startup() at 0xc04e37f7 = mi_startup+0xb3
begin() at 0xc0434095 = begin+0x2c
Uptime: 3s
Cannot dump. No dump device defined.
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.
Re: ata panic
Mike Tancsa wrote:

At 11:38 PM 13/03/2006, Mike Tancsa wrote:

Hi, I was trying out a recent RELENG_6 on a VIA mini ITX board with built in CF reader. If a CF is present, the box panics at boot (tried with 2 separate boards and different CFs just in case it was hardware). This is with a RELENG_6 from March 7th with the flash in I get a panic at bootup.

Just updated the source to the latest RELENG_6 in case the changes fixed it, but no dice

Hmm, that's not the intended behavior :) Thanks for the report, I'll look into this ASAP!

-Søren

GEOM: new disk ad0
ad0: VIA check1 failed
ad0: Adaptec check1 failed
ad0: LSI (v3) check1 failed
ad0: LSI (v2) check1 failed
ad0: FreeBSD check1 failed
ata1-master: pio=PIO4 wdma=UNSUPPORTED udma=UNSUPPORTED cable=40 wire
ad2: setting PIO4 on 8237 chip
ad2: 244MB SanDisk SDCFB-256 Rev 0.00 at ata1-master PIO4

Fatal trap 18: integer divide fault while in kernel mode
instruction pointer = 0x20:0xc069c01b
stack pointer = 0x28:0xc0c20b78
frame pointer = 0x28:0xc0c20c14
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (swapper)
trap number = 18
panic: integer divide fault
KDB: stack backtrace:
panic(c06caf91,c06f370c,0,0,f) at 0xc0512343 = panic+0x103
trap_fatal(0,0,0,0,c07359a0) at 0xc0693485 = trap_fatal+0x225
trap(8,28,28,1,0) at 0xc06939b7 = trap+0x20f
calltrap() at 0xc0682b6a = calltrap+0x5
--- trap 0x12, eip = 0xc069c01b, esp = 0xc0c20b78, ebp = 0xc0c20c14 ---
__qdivrem(7a2b0,0,0,0,0) at 0xc069c01b = __qdivrem+0x3b
__udivdi3(7a2b0,0,0,0) at 0xc069c4c2 = __udivdi3+0x16
ad_attach(c3199400,c3199400,c32a5000,0,c0c20d24) at 0xc046870b = ad_attach+0x44f
device_attach(c3199400) at 0xc05270ae = device_attach+0x1be
bus_generic_attach(c3228380,c3228380,,2,c3199400) at 0xc0527cc6 = bus_generic_attach+0x12
ata_identify(c3228380,0,c0c20d6c,c0524120,0) at 0xc04592c9 = ata_identify+0xcd
ata_boot_attach(0,c0722c90,c0c20d88,c04e37f7,0) at 0xc0459441 = ata_boot_attach+0x4d
run_interrupt_driven_config_hooks(0,c31459f0,c1ec00,c1e000,c25000) at 0xc0524120 = run_interrupt_driven_config_hooks+0x1c
mi_startup() at 0xc04e37f7 = mi_startup+0xb3
begin() at 0xc0434095 = begin+0x2c
Uptime: 3s
Cannot dump. No dump device defined.
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.
Re: ata panic
At 01:37 AM 14/03/2006, Søren Schmidt wrote:

Mike Tancsa wrote:

At 11:38 PM 13/03/2006, Mike Tancsa wrote:

Hi, I was trying out a recent RELENG_6 on a VIA mini ITX board with built in CF reader. If a CF is present, the box panics at boot (tried with 2 separate boards and different CFs just in case it was hardware). This is with a RELENG_6 from March 7th with the flash in I get a panic at bootup.

Just updated the source to the latest RELENG_6 in case the changes fixed it, but no dice

Hmm, thats not the intended behavior :) Thanks for the report, I'll look into this ASAP!

Thanks! I also just confirmed that a kernel from Feb 1 boots up OK, but not with boot -v. E.g., here is a regular boot from the Feb 1 kernel:

WARNING: WITNESS option enabled, expect reduced performance.
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: VIA C3 Nehemiah+RNG+ACE (796.77-MHz 686-class CPU)
atapci0: VIA 6420 SATA150 controller port 0xeb00-0xeb07,0xe000-0xe003,0xe100-0xe107,0xe200-0xe203,0xe300-0xe30f,0xd400-0xd4ff irq 10 at device 15.0 on pci0
ata2: ATA channel 0 on atapci0
ata3: ATA channel 1 on atapci0
atapci1: VIA 8237 UDMA133 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe500-0xe50f at device 15.1 on pci0
ata0: ATA channel 0 on atapci1
ata1: ATA channel 1 on atapci1
Timecounters tick every 1.000 msec
Fast IPsec: Initialized Security Association Processing.
ad0: 38166MB Seagate ST340014A 8.54 at ata0-master UDMA100
ad2: 244MB SanDisk SDCFB-256 Rev 0.00 at ata1-master PIO4
Trying to mount root from ufs:/dev/ad0s1a

whereas boot -v gives:

ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
ad0: setting PIO4 on 8237 chip
ad0: setting UDMA100 on 8237 chip
ad0: 38166MB Seagate ST340014A 8.54 at ata0-master UDMA100
ad0: 78165360 sectors [77545C/16H/63S] 16 sectors/interrupt 1 depth queue
ata1-master: pio=PIO4 wdma=UNSUPPORTED udma=UNSUPPORTED cable=40 wire
ad2: setting PIO4 on 8237 chip
ad2: 244MB SanDisk SDCFB-256 Rev 0.00 at ata1-master PIO4

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xc06d7637
stack pointer = 0x28:0xc0c20b64
frame pointer = 0x28:0xc0c20bec
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (swapper)
[thread pid 0 tid 0 ]
Stopped at __qdivrem+0x3b: divl %ecx,%eax
db>
db> bt
Tracing pid 0 tid 0 td 0xc078dac0
__qdivrem(7a2b0,0,0,0,0) at __qdivrem+0x3b
__udivdi3(7a2b0,0,0,0) at __udivdi3+0x16
ad_describe(c339f700,c339f700,c32df3a0,c33bd800,c31a3600) at ad_describe+0x1b3
ad_attach(c339f700) at ad_attach+0x1e7
device_attach(c339f700,c0c20d28,c339f700,0,c33bd800) at device_attach+0x58
device_probe_and_attach(c339f700) at device_probe_and_attach+0xe0
bus_generic_attach(c328b280,c328b280,,2,c339f700) at bus_generic_attach+0x16
ata_identify(c328b280) at ata_identify+0x1c8
ata_boot_attach(0) at ata_boot_attach+0x3e
run_interrupt_driven_config_hooks(0,c1ec00,c1e000,0,c043b215) at run_interrupt_driven_config_hooks+0x18
mi_startup() at mi_startup+0x96
begin() at begin+0x2c
db>
Tracing pid 0 tid 0 td 0xc078dac0
__qdivrem(7a2b0,0,0,0,0) at __qdivrem+0x3b
__udivdi3(7a2b0,0,0,0) at __udivdi3+0x16
ad_describe(c339f700,c339f700,c32df3a0,c33bd800,c31a3600) at ad_describe+0x1b3
ad_attach(c339f700) at ad_attach+0x1e7
device_attach(c339f700,c0c20d28,c339f700,0,c33bd800) at device_attach+0x58
device_probe_and_attach(c339f700) at device_probe_and_attach+0xe0
bus_generic_attach(c328b280,c328b280,,2,c339f700) at bus_generic_attach+0x16
ata_identify(c328b280) at ata_identify+0x1c8
ata_boot_attach(0) at ata_boot_attach+0x3e
run_interrupt_driven_config_hooks(0,c1ec00,c1e000,0,c043b215) at run_interrupt_driven_config_hooks+0x18
mi_startup() at mi_startup+0x96
begin() at begin+0x2c
db>
Re: patch: fix ata panic with Thinkpad CD and DVD drives
On 2005.02.27 11:56:06 -0800, Nate Lawson wrote:

If you've been having memory modified after free panics on -current and have a Thinkpad, the attached patch should fix things for you. A quick check of RELENG_5 indicates that the bug is probably there also but I haven't tested for it there.

I don't think it's in RELENG_5. The commit that (re)introduced the problem hasn't been MFC'ed as far as I know, but I can't test it at the moment. It should also be noted that ATA MKIII also fixes this problem; at least it did for me.

-- 
Simon L. Nielsen
Re: patch: fix ata panic with Thinkpad CD and DVD drives
Søren Schmidt wrote:

Nate Lawson wrote:

If you've been having memory modified after free panics on -current and have a Thinkpad, the attached patch should fix things for you. A quick check of RELENG_5 indicates that the bug is probably there also but I haven't tested for it there.

The bug is triggered by timeouts in the ata_getparam() probe path. The ata_timeout() fires and ata_end_transaction() is called to get the status. However, it continues down into ata_pio_read() even though there is no data available since we had a timeout, not read completion. ata_pio_read() reads 512 bytes of probably bogus data. The important problem is that it also advances donecount. On subsequent timeouts (note there are 4 below), donecount advances into unallocated memory and so subsequent ata_pio_read() calls overwrite 512 bytes of someone else's memory.

The fix is to exit immediately if ATA_R_TIMEOUT is set after reading the status in ata_end_transaction(). It shouldn't go into ata_pio_read() if there was a timeout. The patch does this. However, it only handles PIO timeouts since I wasn't sure the best way to proceed for unwinding DMA state and the like for the other cases. This is enough to fix the overwrite and subsequent panic on my systems. I've run heavy IO stress and DVD accesses for a while and no further panics.

While looking into this, I found another potential problem. In one reinjection case, donecount wasn't reset to 0. The patch for ata-queue.c does this and I think it's necessary but don't hit this case in testing so I can't be sure. Finally, there's one whitespace nit that helps with clarity.

These are similar bugs to one found back in August that had the same effect. Here's the closest reference I could find in the mail archives for this: http://lists.freebsd.org/mailman/htdig/freebsd-current/2004-August/033033.html

Just a note from here, these bugs are fixed in ATA mkIII so you could just have gleaned the solution from there (or maybe you did :))

Nope, but I'm glad you can corroborate these fixes are correct.

-- 
Nate
Re: patch: fix ata panic with Thinkpad CD and DVD drives
Nate Lawson wrote: If you've been having memory modified after free panics on -current and have a Thinkpad, the attached patch should fix things for you. A quick check of RELENG_5 indicates that the bug is probably there also but I haven't tested for it there. The bug is triggered by timeouts in the ata_getparam() probe path. The ata_timeout() fires and ata_end_transaction() is called to get the status. However, it continues down into ata_pio_read() even though there is no data available since we had a timeout, not read completion. ata_pio_read() reads 512 bytes of probably bogus data. The important problem is that it also advances donecount. On subsequent timeouts (note there are 4 below), donecount advances into unallocated memory and so subsequent ata_pio_read() calls overwrite 512 bytes of someone else's memory. The fix is to exit immediately if ATA_R_TIMEOUT is set after reading the status in ata_end_transaction(). It shouldn't go into ata_pio_read() if there was a timeout. The patch does this. However, it only handles PIO timeouts since I wasn't sure the best way to proceed for unwinding DMA state and the like for the other cases. This is enough to fix the overwrite and subsequent panic on my systems. I've run heavy IO stress and DVD accesses for a while and no further panics. While looking into this, I found another potential problem. In one reinjection case, donecount wasn't reset to 0. The patch for ata-queue.c does this and I think it's necessary but don't hit this case in testing so I can't be sure. Finally, there's one whitespace nit that helps with clarity. These are similar bugs to one found back in August that had the same effect. 
Here's the closest reference I could find in the mail archives for this: http://lists.freebsd.org/mailman/htdig/freebsd-current/2004-August/033033.html

Just a note from here, these bugs are fixed in ATA mkIII so you could just have gleaned the solution from there (or maybe you did :))

-- 
-Søren
Re: patch: fix ata panic with Thinkpad CD and DVD drives
Nate Lawson wrote: Søren Schmidt wrote: Nate Lawson wrote: If you've been having memory modified after free panics on -current and have a Thinkpad, the attached patch should fix things for you. A quick check of RELENG_5 indicates that the bug is probably there also but I haven't tested for it there. The bug is triggered by timeouts in the ata_getparam() probe path. The ata_timeout() fires and ata_end_transaction() is called to get the status. However, it continues down into ata_pio_read() even though there is no data available since we had a timeout, not read completion.ata_pio_read() reads 512 bytes of probably bogus data. The important problem is that it also advances donecount. On subsequent timeouts (note there are 4 below), donecount advances into unallocated memory and so subsequent ata_pio_read() calls overwrite 512 bytes of someone else's memory. The fix is to exit immediately if ATA_R_TIMEOUT is set after reading the status in ata_end_transaction(). It shouldn't go into ata_pio_read() if there was a timeout. The patch does this. However, it only handles PIO timeouts since I wasn't sure the best way to proceed for unwinding DMA state and the like for the other cases. This is enough to fix the overwrite and subsequent panic on my systems. I've run heavy IO stress and DVD accesses for a while and no further panics. While looking into this, I found another potential problem. In one reinjection case, donecount wasn't reset to 0. The patch for ata-queue.c does this and I think it's necessary but don't hit this case in testing so I can't be sure. Finally, there's one whitespace nit that helps with clarity. These are similar bugs to one found back in August that had the same effect. 
Here's the closest reference I could find in the mail archives for this: http://lists.freebsd.org/mailman/htdig/freebsd-current/2004-August/033033.html

Just a note from here, these bugs are fixed in ATA mkIII so you could just have gleaned the solution from there (or maybe you did :))

Nope, but I'm glad you can corroborate these fixes are correct.

Actually I can't; I haven't looked at what was committed, since I had already fixed these problems in the mkIII patches floating around. Anyhow, it's in there and the committer has to deal with it until/if I commit mkIII to -current. I'm out of the loop until then...

-- 
-Søren
Re: patch: fix ata panic with Thinkpad CD and DVD drives
Nate Lawson wrote:
> If you've been having memory modified after free panics on -current and have a Thinkpad, the attached patch should fix things for you. A quick check of RELENG_5 indicates that the bug is probably there also but I haven't tested for it there.
>
> The bug is triggered by timeouts in the ata_getparam() probe path. The ata_timeout() fires and ata_end_transaction() is called to get the status. However, it continues down into ata_pio_read() even though there is no data available since we had a timeout, not read completion. ata_pio_read() reads 512 bytes of probably bogus data. The important problem is that it also advances donecount. On subsequent timeouts (note there are 4 below), donecount advances into unallocated memory and so subsequent ata_pio_read() calls overwrite 512 bytes of someone else's memory.
>
> The fix is to exit immediately if ATA_R_TIMEOUT is set after reading the status in ata_end_transaction(). It shouldn't go into ata_pio_read() if there was a timeout. The patch does this. However, it only handles PIO timeouts since I wasn't sure the best way to proceed for unwinding DMA state and the like for the other cases. This is enough to fix the overwrite and subsequent panic on my systems. I've run heavy IO stress and DVD accesses for a while and no further panics.
>
> While looking into this, I found another potential problem. In one reinjection case, donecount wasn't reset to 0. The patch for ata-queue.c does this and I think it's necessary but don't hit this case in testing so I can't be sure. Finally, there's one whitespace nit that helps with clarity.
>
> These are similar bugs to one found back in August that had the same effect. Here's the closest reference I could find in the mail archives for this:
> http://lists.freebsd.org/mailman/htdig/freebsd-current/2004-August/033033.html

Just a note from here, these bugs are fixed in ATA mkIII so you could just have gleaned the solution from there (or maybe you did :))

--
-Søren
Re: patch: fix ata panic with Thinkpad CD and DVD drives
Nate Lawson wrote:
> Søren Schmidt wrote:
> > Nate Lawson wrote:
> > > If you've been having memory modified after free panics on -current and have a Thinkpad, the attached patch should fix things for you. A quick check of RELENG_5 indicates that the bug is probably there also but I haven't tested for it there.
> > >
> > > The bug is triggered by timeouts in the ata_getparam() probe path. The ata_timeout() fires and ata_end_transaction() is called to get the status. However, it continues down into ata_pio_read() even though there is no data available since we had a timeout, not read completion. ata_pio_read() reads 512 bytes of probably bogus data. The important problem is that it also advances donecount. On subsequent timeouts (note there are 4 below), donecount advances into unallocated memory and so subsequent ata_pio_read() calls overwrite 512 bytes of someone else's memory.
> > >
> > > The fix is to exit immediately if ATA_R_TIMEOUT is set after reading the status in ata_end_transaction(). It shouldn't go into ata_pio_read() if there was a timeout. The patch does this. However, it only handles PIO timeouts since I wasn't sure the best way to proceed for unwinding DMA state and the like for the other cases. This is enough to fix the overwrite and subsequent panic on my systems. I've run heavy IO stress and DVD accesses for a while and no further panics.
> > >
> > > While looking into this, I found another potential problem. In one reinjection case, donecount wasn't reset to 0. The patch for ata-queue.c does this and I think it's necessary but don't hit this case in testing so I can't be sure. Finally, there's one whitespace nit that helps with clarity.
> > >
> > > These are similar bugs to one found back in August that had the same effect. Here's the closest reference I could find in the mail archives for this:
> > > http://lists.freebsd.org/mailman/htdig/freebsd-current/2004-August/033033.html
> >
> > Just a note from here, these bugs are fixed in ATA mkIII so you could just have gleaned the solution from there (or maybe you did :))
>
> Nope, but I'm glad you can corroborate these fixes are correct.

Actually I can't, I haven't looked at what was committed since I already did fix these problems in the mkIII patches floating around. Anyhow, it's in there and the committer has to deal with it until/if I commit mkIII to -current; I'm out of the loop until then...

--
-Søren
patch: fix ata panic with Thinkpad CD and DVD drives
If you've been having memory modified after free panics on -current and have a Thinkpad, the attached patch should fix things for you. A quick check of RELENG_5 indicates that the bug is probably there also but I haven't tested for it there.

The bug is triggered by timeouts in the ata_getparam() probe path. The ata_timeout() fires and ata_end_transaction() is called to get the status. However, it continues down into ata_pio_read() even though there is no data available since we had a timeout, not read completion. ata_pio_read() reads 512 bytes of probably bogus data. The important problem is that it also advances donecount. On subsequent timeouts (note there are 4 below), donecount advances into unallocated memory and so subsequent ata_pio_read() calls overwrite 512 bytes of someone else's memory.

The fix is to exit immediately if ATA_R_TIMEOUT is set after reading the status in ata_end_transaction(). It shouldn't go into ata_pio_read() if there was a timeout. The patch does this. However, it only handles PIO timeouts since I wasn't sure the best way to proceed for unwinding DMA state and the like for the other cases. This is enough to fix the overwrite and subsequent panic on my systems. I've run heavy IO stress and DVD accesses for a while and no further panics.

While looking into this, I found another potential problem. In one reinjection case, donecount wasn't reset to 0. The patch for ata-queue.c does this and I think it's necessary but don't hit this case in testing so I can't be sure. Finally, there's one whitespace nit that helps with clarity.

These are similar bugs to one found back in August that had the same effect. Here's the closest reference I could find in the mail archives for this:
http://lists.freebsd.org/mailman/htdig/freebsd-current/2004-August/033033.html

Please fix this before 5.4-R, thanks.

Here is the hardware in question. This bug is triggered by various CD, DVD, CDRW, etc. drives shipped with Thinkpads.

atapci0: <Intel ICH3 UDMA100 controller> port 0x1860-0x186f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
ad0: 19077MB <IC25N020ATMR04-0/MO1OAD4A> [41344/15/63] at ata0-master UDMA100
ata1-slave: FAILURE - ATAPI_IDENTIFY timed out
ata1-slave: FAILURE - ATAPI_IDENTIFY timed out
ata1-slave: FAILURE - ATAPI_IDENTIFY timed out
ata1-slave: FAILURE - ATAPI_IDENTIFY timed out
acd0: FAILURE - SETFEATURES SET TRANSFER MODE timed out
acd0: DVDROM <HL-DT-STDVD-ROM GDR8081N/0012> at ata1-master UDMA33

--
Nate

Index: sys/dev/ata/ata-lowlevel.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ata/ata-lowlevel.c,v
retrieving revision 1.51
diff -u -r1.51 ata-lowlevel.c
--- sys/dev/ata/ata-lowlevel.c	24 Dec 2004 13:38:25 -0000	1.51
+++ sys/dev/ata/ata-lowlevel.c	27 Feb 2005 19:23:09 -0000
@@ -297,6 +297,9 @@
 
     /* ATA PIO data transfer and control commands */
     default:
+        /* XXX Doesn't handle the non-PIO case. */
+        if (request->flags & ATA_R_TIMEOUT)
+            return ATA_OP_FINISHED;
 
         /* on control commands read back registers to the request struct */
         if (request->flags & ATA_R_CONTROL) {
@@ -321,7 +324,7 @@
                 ata_pio_read(request, request->transfersize);
 
             /* update how far we've gotten */
-                request->donecount += request->transfersize;
+            request->donecount += request->transfersize;
 
             /* do we need a scoop more ? */
             if (request->bytecount > request->donecount) {
Index: sys/dev/ata/ata-queue.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ata/ata-queue.c,v
retrieving revision 1.41
diff -u -r1.41 ata-queue.c
--- sys/dev/ata/ata-queue.c	8 Dec 2004 11:16:33 -0000	1.41
+++ sys/dev/ata/ata-queue.c	27 Feb 2005 19:22:16 -0000
@@ -249,6 +249,7 @@
             request->device->param){
         request->flags &= ~(ATA_R_TIMEOUT | ATA_R_DEBUG);
         request->flags |= (ATA_R_IMMEDIATE | ATA_R_REQUEUE);
+        request->donecount = 0;
         ATA_DEBUG_RQ(request, "completed reinject");
         ata_queue_request(request);
         return;
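
Editorial sketch, not part of the original message: the failure mode Nate describes can be modelled in a few lines of user-space C. Everything here is an assumption made for illustration. `struct sim_request`, `SIM_R_TIMEOUT`, `sim_pio_read()`, `sim_end_transaction()` and `sim_probe()` are hypothetical stand-ins, not the real `struct ata_request` or the driver's functions, and the model records bytes that would land outside the buffer instead of actually writing them.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE   512u  /* bytes moved by one PIO transfer */
#define SIM_R_TIMEOUT 0x1u  /* hypothetical stand-in for ATA_R_TIMEOUT */

/* Hypothetical, stripped-down request; NOT the real struct ata_request. */
struct sim_request {
    unsigned flags;
    size_t bytecount;     /* size of the caller's buffer */
    size_t transfersize;  /* bytes moved per transfer */
    size_t donecount;     /* bytes accounted as transferred so far */
    size_t overrun;       /* bytes that would have landed past the buffer */
};

/* Stand-in for ata_pio_read(): counts out-of-bounds bytes, writes nothing. */
static void sim_pio_read(struct sim_request *req)
{
    size_t end = req->donecount + req->transfersize;

    if (end > req->bytecount) {
        size_t start = req->donecount > req->bytecount ?
            req->donecount : req->bytecount;
        req->overrun += end - start;
    }
}

/* Stand-in for the PIO branch of ata_end_transaction(); returns 1 when done. */
static int sim_end_transaction(struct sim_request *req, int guard)
{
    /* The idea of the patch: on timeout no data arrived, so return early. */
    if (guard && (req->flags & SIM_R_TIMEOUT))
        return 1;

    sim_pio_read(req);
    req->donecount += req->transfersize;
    return req->donecount >= req->bytecount;
}

/* Replay `timeouts` timed-out attempts against one 512-byte request. */
struct sim_request sim_probe(int timeouts, int guard)
{
    struct sim_request req = { SIM_R_TIMEOUT, SECTOR_SIZE, SECTOR_SIZE, 0, 0 };

    for (int i = 0; i < timeouts; i++)
        (void)sim_end_transaction(&req, guard);

    assert(req.donecount % SECTOR_SIZE == 0);
    return req;
}
```

Calling `sim_probe(4, 0)` mirrors the four ATAPI_IDENTIFY timeouts in the log above: without the guard, donecount ends at 2048 for a 512-byte buffer, so three of the four transfers fall entirely outside it. With `sim_probe(4, 1)` donecount stays at 0 and nothing is touched. The model keeps donecount across attempts, which is the behaviour the message attributes to the reinjection path; the ata-queue.c hunk addresses that separately by resetting donecount to 0.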
Re: patch: fix ata panic with Thinkpad CD and DVD drives
On 2005.02.27 11:56:06 -0800, Nate Lawson wrote:
> If you've been having memory modified after free panics on -current and have a Thinkpad, the attached patch should fix things for you. A quick check of RELENG_5 indicates that the bug is probably there also but I haven't tested for it there.

I don't think it's in RELENG_5. The commit that (re)introduced the problem hasn't been MFC'ed as far as I know, but I can't test it at the moment.

It should also be noted that ATA MKIII also fixes this problem; at least it did for me.

--
Simon L. Nielsen