Re: splassert: vwakeup: and friends
On Sat, 26 Dec 2009, frantisek holop wrote:
> ok. no more splassert, but everything remains the same.
> i just came back from another hard lock up after trying to copy files
> from one external usb disk to another. at some random point the
> copying just dies and then first the processes doing anything with
> disks and then gradually the whole system locks up.
> nothing in the logs, no dmesg, nothing.

Set ddb.console=1 in /etc/sysctl.conf and break into the debugger with
Ctrl-Alt-Esc once the lockup happens. Run trace and ps there, then
continue, break into the debugger again, and so on. This could give you
some insight into what's going on. It's not exactly a scientific
solution, but it is easy and quick as a first attempt.

Regards,
David
Re: splassert: vwakeup: and friends
On Dec 24 20:02:51, frantisek holop wrote:
> hmm, on Wed, Dec 23, 2009 at 02:05:57PM +0100, frantisek holop said that
> > background: i am trying to run an e2fsck on a 120G ext2 partition
> > on that usb external disk and it just stops reading from the disk
> > at random positions. for example it gets up to 30% of pass 1, then
> > it just stops. no read errors in dmesg, nothing just idling.
> > eventually the disk spins down. top says it's in 'biowait'.
> > (it could also be a bug in e2fsck)
>
> i have upgraded to the latest snapshot, and indeed the splassert
> goes away. (but e2fsck still does not finish the disk, it stops
> reading at random positions of pass 1 (30%, 11%)

make sure the disk itself is OK before blaming anything higher up.
(e.g., read the entire disk with dd)

> and sits there doing nothing.)

nothing meaning which process state, really?
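The "read the entire disk" check above would normally be a one-liner along the lines of `dd if=/dev/rsd0c of=/dev/null bs=1m` (the device name is an assumption taken from the sd0 attach line in the first message; the disk under test may be a different one). The same idea as a minimal Python sketch, for anyone who wants the byte offset at which a read fails or stalls. The function name and chunk size are illustrative, not from the thread:

```python
def read_whole_device(path, chunk_size=1024 * 1024):
    """Read `path` sequentially until EOF.

    Returns (bytes_read, error): error is None when EOF was reached
    cleanly, otherwise the OSError that stopped the read (including a
    failure to open the path at all).
    """
    total = 0
    try:
        # buffering=0 gives unbuffered raw reads, so `total` tracks what
        # was actually fetched from the device.
        with open(path, "rb", buffering=0) as dev:
            while True:
                chunk = dev.read(chunk_size)
                if not chunk:          # empty read means end of device
                    return total, None
                total += len(chunk)
    except OSError as exc:
        return total, exc
```

Calling `read_whole_device("/dev/rsd0c")` as root would exercise the same path as the dd command. If the call never returns and the process sits in biowait, the problem is below the filesystem and e2fsck is probably not to blame.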
Re: splassert: vwakeup: and friends
hmm, on Fri, Dec 25, 2009 at 07:44:45PM +0100, Jan Stary said that
> make sure the disk itself is OK before blaming anything higher up.
> (e.g., read the entire disk with dd)

openbsd's fsck goes through with no problems.

eventually i managed to get e2fsck through as well: i made a script
to touch a file every 9 seconds so the disk doesn't spin down and
for some reason e2fsck finished. but i am not convinced this is the
actual reason, e2fsck was never idle for 10s in the first place, it
was reading through the disk after all. and if there were problems
spinning up the disk, i am sure dmesg or some other layer of the
system would have told me.

i am far beyond the point of pointing fingers (disk, OS, etc),
i just want it to work deterministically, so my mails to misc
don't look like dali paintings. if it has errors, i should see
them being reported. if not, it should work. i am somewhere
in between.

> > and sits there doing nothing.)
>
> nothing meaning which process state, really?

sleep/biowait

-f
--
whatever you are, be a good one. -- abraham lincoln.
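The keep-alive script mentioned above was not posted. A minimal sketch of what it might look like follows; the function name, the marker path, and the `rounds` parameter are all hypothetical, and only the 9-second interval comes from the message:

```python
import time
from pathlib import Path


def keep_disk_awake(marker, interval=9.0, rounds=None):
    """Touch `marker` every `interval` seconds.

    rounds=None loops until interrupted; an integer stops after that
    many touches. Returns the number of touches performed.
    """
    done = 0
    while rounds is None or done < rounds:
        Path(marker).touch()       # creates the file or updates its mtime
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)   # no sleep after the final touch
    return done
```

Something like `keep_disk_awake("/mnt/usb/.keepalive")` would be started before e2fsck and interrupted afterwards. Whether a timestamp update reaches the disk often enough to prevent spin-down depends on the filesystem and mount options, so this is a workaround to experiment with, not a diagnosis.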
Re: splassert: vwakeup: and friends
hmm, on Thu, Dec 24, 2009 at 08:02:51PM +0100, frantisek holop said that
> hmm, on Wed, Dec 23, 2009 at 02:05:57PM +0100, frantisek holop said that
> > background: i am trying to run an e2fsck on a 120G ext2 partition
> > on that usb external disk and it just stops reading from the disk
> > at random positions. for example it gets up to 30% of pass 1, then
> > it just stops. no read errors in dmesg, nothing just idling.
> > eventually the disk spins down. top says it's in 'biowait'.
> > (it could also be a bug in e2fsck)
>
> i have upgraded to the latest snapshot, and indeed the splassert
> goes away. (but e2fsck still does not finish the disk, it stops
> reading at random positions of pass 1 (30%, 11%) and sits there
> doing nothing.)

ok. no more splassert, but everything remains the same.
i just came back from another hard lock up after trying to copy files
from one external usb disk to another. at some random point the
copying just dies and then first the processes doing anything with
disks and then gradually the whole system locks up.
nothing in the logs, no dmesg, nothing.

i have just copied the same files over without any problem using
my parents' notebook. so it's either the bios/hw/usb ports
or openbsd's scsi/usb layer.

could someone help me please to create the most verbose kernel
possible (scsi+usb) and combined with some remote syslog hopefully
some of the logs will be readable?

-f
--
artificial intelligence: the other guy's opinion.
Re: splassert: vwakeup: and friends
hmm, on Wed, Dec 23, 2009 at 02:05:57PM +0100, frantisek holop said that
> background: i am trying to run an e2fsck on a 120G ext2 partition
> on that usb external disk and it just stops reading from the disk
> at random positions. for example it gets up to 30% of pass 1, then
> it just stops. no read errors in dmesg, nothing just idling.
> eventually the disk spins down. top says it's in 'biowait'.
> (it could also be a bug in e2fsck)

i have upgraded to the latest snapshot, and indeed the splassert
goes away. (but e2fsck still does not finish the disk, it stops
reading at random positions of pass 1 (30%, 11%) and sits there
doing nothing.)

-f
--
save a tree. eat a beaver.
Re: splassert: vwakeup: and friends
can you tell me what version of src/sys/scsi/sd.c you are running?

cheers,
dlg

On 23/12/2009, at 12:37 PM, frantisek holop wrote:
> hi there,
>
> i was having difficulties reproducing this (as expected probably)
> but i managed to get one trace:
>
> splassert: biodone: want 80 have 0
> Starting stack trace...
> splassert_check(50,0,d074ff87,0) at splassert_check+0x46
> splassert_check(50,d074ff87,d9655d2c,d340fe00) at splassert_check+0x46
> biodone(d9ab12fc,d08c9b30,d16e1e00,d16e1e00) at biodone+0x20
> sd_kill_buffers(d340fe00,,5,d16fb080) at sd_kill_buffers+0x33
> sdactivate(d340fe00,1,4,d1450b00) at sdactivate+0x27
> config_deactivate(d340fe00,1,0,1,0) at config_deactivate+0x39
> scsi_activate_target(d3647800,1,1,2,0) at scsi_activate_target+0x2f
> scsi_activate_bus(d3647800,1,d9655dfc,d067e46c) at scsi_activate_bus+0x29
> scsi_activate(d3647800,,,1,d16e1e00) at scsi_activate+0x65
> scsibusactivate(d3647800,1,d9655e4c,d067dd0f) at scsibusactivate+0x15
> config_deactivate(d3647800,d16e1e00,d9655e8c,d067e5db,0) at config_deactivate+0x39
> umass_activate(d340f000,1,d9655ebc,d340f000) at umass_activate+0x3e
> config_deactivate(d340f000,d340f014,10,d067e54b) at config_deactivate+0x39
> config_detach(d340f000,1,d9655f0c,d067eac4,d1374780) at config_detach+0x23b
> usb_disconnect_port(d138c918,d1450a80,10) at usb_disconnect_port+0x65
> uhub_explore(d1374780,d067cba4,d9655f8c,d067cc59,0) at uhub_explore+0x205
> usb_discover(d1374800,1a4,d0200928,d5aca580,d5aca6e0) at usb_discover+0x36
> usb_event_thread(d1374800) at usb_event_thread+0x91
> Bad frame pointer: 0xd0a28e78
> End of stack trace.
>
> but i am afraid this is being caused by a dying disk...
>
> -f
> --
> last week i couldn't even spell engineer, now i are one.
Re: splassert: vwakeup: and friends
hmm, on Wed, Dec 23, 2009 at 08:50:01PM +1000, David Gwynne said that
> can you tell me what version of src/sys/scsi/sd.c you are running?

the snapshot being used is from Dec 4, so i'd guess
it is Revision 1.169

background: i am trying to run an e2fsck on a 120G ext2 partition
on that usb external disk and it just stops reading from the disk
at random positions. for example it gets up to 30% of pass 1, then
it just stops. no read errors in dmesg, nothing just idling.
eventually the disk spins down. top says it's in 'biowait'.
(it could also be a bug in e2fsck)

this is a go-between disk between a windows machine and openbsd.
windows seems to have no problems with it so far..

another strange thing is, that while the partition isn't clean
(i can never finish the fsck), it is being mounted by mount from
hotplugd when i connect it..

-f
--
i couldn't repair your brakes so i made your horn louder.
Re: splassert: vwakeup: and friends
hi there,

i was having difficulties reproducing this (as expected probably)
but i managed to get one trace:

splassert: biodone: want 80 have 0
Starting stack trace...
splassert_check(50,0,d074ff87,0) at splassert_check+0x46
splassert_check(50,d074ff87,d9655d2c,d340fe00) at splassert_check+0x46
biodone(d9ab12fc,d08c9b30,d16e1e00,d16e1e00) at biodone+0x20
sd_kill_buffers(d340fe00,,5,d16fb080) at sd_kill_buffers+0x33
sdactivate(d340fe00,1,4,d1450b00) at sdactivate+0x27
config_deactivate(d340fe00,1,0,1,0) at config_deactivate+0x39
scsi_activate_target(d3647800,1,1,2,0) at scsi_activate_target+0x2f
scsi_activate_bus(d3647800,1,d9655dfc,d067e46c) at scsi_activate_bus+0x29
scsi_activate(d3647800,,,1,d16e1e00) at scsi_activate+0x65
scsibusactivate(d3647800,1,d9655e4c,d067dd0f) at scsibusactivate+0x15
config_deactivate(d3647800,d16e1e00,d9655e8c,d067e5db,0) at config_deactivate+0x39
umass_activate(d340f000,1,d9655ebc,d340f000) at umass_activate+0x3e
config_deactivate(d340f000,d340f014,10,d067e54b) at config_deactivate+0x39
config_detach(d340f000,1,d9655f0c,d067eac4,d1374780) at config_detach+0x23b
usb_disconnect_port(d138c918,d1450a80,10) at usb_disconnect_port+0x65
uhub_explore(d1374780,d067cba4,d9655f8c,d067cc59,0) at uhub_explore+0x205
usb_discover(d1374800,1a4,d0200928,d5aca580,d5aca6e0) at usb_discover+0x36
usb_event_thread(d1374800) at usb_event_thread+0x91
Bad frame pointer: 0xd0a28e78
End of stack trace.

but i am afraid this is being caused by a dying disk...

-f
--
last week i couldn't even spell engineer, now i are one.
splassert: vwakeup: and friends
hi there,

i am having difficulties copying from one external usb device
to the other. the copying stops at a certain point and the target
device stops responding.

/var/log/messages:

Dec 14 23:14:07 amaaq /bsd: umass0 at uhub0
Dec 14 23:14:07 amaaq /bsd: port 2 configuration 1 interface 0 Seagate USB Mass Storage rev 2.00/0.02 addr 2
Dec 14 23:14:07 amaaq /bsd: umass0: using SCSI over Bulk-Only
Dec 14 23:14:07 amaaq /bsd: scsibus1 at umass0: 2 targets, initiator 0
Dec 14 23:14:07 amaaq /bsd: sd0 at scsibus1 targ 1 lun 0: ST916082, 1A, SCSI0 0/direct fixed
Dec 14 23:14:07 amaaq /bsd: sd0: 152627MB, 512 bytes/sec, 312581808 sec total
Dec 15 02:45:27 amaaq /bsd: umass1 at uhub0
Dec 15 02:45:27 amaaq /bsd: port 4 configuration 1 interface 0 SanDisk Corporation U3 Cruzer Micro rev 2.00/0.10 addr 3
Dec 15 02:45:27 amaaq /bsd: umass1: using SCSI over Bulk-Only
Dec 15 02:45:27 amaaq /bsd: scsibus2 at umass1: 2 targets, initiator 0
Dec 15 02:45:27 amaaq /bsd: sd1 at scsibus2 targ 1 lun 0: SanDisk, U3 Cruzer Micro, 4.05 SCSI2 0/direct removable
Dec 15 02:45:27 amaaq /bsd: sd1: 3919MB, 512 bytes/sec, 8027789 sec total
...
Dec 15 02:54:42 amaaq /bsd: e: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
...
Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: sd1 detached
Dec 15 02:54:48 amaaq /bsd: scsibus2 detached
Dec 15 02:54:48 amaaq /bsd: umass1 detached

at which point i removed the device.

what are these the symptoms of?
are my disks dying?

OpenBSD 4.6-current (GENERIC) #447: Fri Dec 4 22:50:41 MST 2009
    dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Celeron(R) M processor 900MHz (GenuineIntel 686-class) 631 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,TM,SBF
real mem = 527527936 (503MB)
avail mem = 502505472 (479MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 05/16/08, BIOS32 rev. 0 @ 0xf0010, SMBIOS rev. 2.5 @ 0xf06e0 (37 entries)
bios0: vendor American Megatrends Inc. version 1101 date 05/16/2008
bios0: ASUSTeK Computer INC. 701
acpi0 at bios0: rev 0
acpi0: tables DSDT FACP APIC OEMB MCFG
acpi0: wakeup devices P0P3(S4) P0P4(S4) P0P5(S4) P0P6(S4) P0P7(S4) MC97(S4) USB1(S3) USB2(S3) USB3(S3) USB4(S3) EUSB(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 70MHz
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 4 (P0P3)
acpiprt2 at acpi0: bus -1 (P0P5)
acpiprt3 at acpi0: bus 1 (P0P6)
acpiec0 at acpi0
acpicpu0 at acpi0: C3, C2
acpitz0 at acpi0: critical temperature 90 degC
acpibat0 at acpi0: BAT0 model 701 serial type LION oem ASUS
acpiac0 at acpi0: AC unit offline
acpiasus0 at acpi0
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
acpibtn2 at acpi0: PWRB
acpivideo0 at acpi0: VGA_
acpivout0 at acpivideo0: CRTD
acpivout1 at acpivideo0: TVOD
acpivout2 at acpivideo0: LCDD
bios0: ROM list: 0xc/0xf800!
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel 82915GM Host rev 0x04
vga1 at pci0 dev 2 function 0 Intel 82915GM Video rev 0x04
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0xd000, size 0x1000
inteldrm0 at vga1: apic 1 int 16 (irq 11)
drm0 at inteldrm0
Intel 82915GM Video rev 0x04 at pci0 dev 2 function 1 not configured
azalia0 at pci0 dev 27 function 0 Intel 82801FB HD Audio rev 0x04: apic 1 int 16 (irq 11)
azalia0: codecs: Realtek ALC662
audio0 at azalia0
ppb0 at pci0 dev 28 function 0 Intel 82801FB PCIE rev 0x04: apic 1 int 16 (irq 11)
pci1 at ppb0 bus 3
ppb1 at pci0 dev 28 function 2 Intel 82801FB PCIE rev 0x04: apic 1 int
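With this many repeated kernel messages it can help to tally them before reading anything into them. The sketch below groups the splassert lines from a /var/log/messages excerpt by function and by wanted/actual level. The regular expression is derived only from the lines shown in this message, and the function name is illustrative:

```python
import re
from collections import Counter

# Matches e.g. "splassert: biodone: want 80 have 0"
SPLASSERT = re.compile(r"splassert: (\w+): want (\d+) have (\d+)")


def tally_splassert(lines):
    """Count splassert messages per (function, wanted, have) triple."""
    counts = Counter()
    for line in lines:
        match = SPLASSERT.search(line)
        if match:
            func, want, have = match.groups()
            counts[(func, int(want), int(have))] += 1
    return counts
```

Feeding it `open("/var/log/messages")` and printing `counts.most_common()` shows at a glance whether every assertion comes from the same two functions at the same level, as it appears to here. Incidentally, 80 decimal is 0x50, which matches the first argument of `splassert_check` in the stack trace posted later in the thread, assuming those arguments are printed in hex.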
Re: splassert: vwakeup: and friends
On Tue, Dec 15, 2009 at 03:22:56AM +0100, frantisek holop wrote:
> hi there,
>
> i am having difficulties copying from one external usb device
> to the other. the copying stops at a certain point and the target
> device stops responding.
>
> /var/log/messages:
>
> Dec 14 23:14:07 amaaq /bsd: umass0 at uhub0
> Dec 14 23:14:07 amaaq /bsd: port 2 configuration 1 interface 0 Seagate USB Mass Storage rev 2.00/0.02 addr 2
> Dec 14 23:14:07 amaaq /bsd: umass0: using SCSI over Bulk-Only
> Dec 14 23:14:07 amaaq /bsd: scsibus1 at umass0: 2 targets, initiator 0
> Dec 14 23:14:07 amaaq /bsd: sd0 at scsibus1 targ 1 lun 0: ST916082, 1A, SCSI0 0/direct fixed
> Dec 14 23:14:07 amaaq /bsd: sd0: 152627MB, 512 bytes/sec, 312581808 sec total
> Dec 15 02:45:27 amaaq /bsd: umass1 at uhub0
> Dec 15 02:45:27 amaaq /bsd: port 4 configuration 1 interface 0 SanDisk Corporation U3 Cruzer Micro rev 2.00/0.10 addr 3
> Dec 15 02:45:27 amaaq /bsd: umass1: using SCSI over Bulk-Only
> Dec 15 02:45:27 amaaq /bsd: scsibus2 at umass1: 2 targets, initiator 0
> Dec 15 02:45:27 amaaq /bsd: sd1 at scsibus2 targ 1 lun 0: SanDisk, U3 Cruzer Micro, 4.05 SCSI2 0/direct removable
> Dec 15 02:45:27 amaaq /bsd: sd1: 3919MB, 512 bytes/sec, 8027789 sec total
> ...
> Dec 15 02:54:42 amaaq /bsd: e: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
> ...
> Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
> Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have 0
> Dec 15 02:54:48 amaaq /bsd: sd1 detached
> Dec 15 02:54:48 amaaq /bsd: scsibus2 detached
> Dec 15 02:54:48 amaaq /bsd: umass1 detached
>
> at which point i removed the device.
>
> what are these the symptoms of?
> are my disks dying?
>
> OpenBSD 4.6-current (GENERIC) #447: Fri Dec 4 22:50:41 MST 2009
>     dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
> cpu0: Intel(R) Celeron(R) M processor 900MHz (GenuineIntel 686-class) 631 MHz
> cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,TM,SBF
> real mem = 527527936 (503MB)
> avail mem = 502505472 (479MB)
> mainbus0 at root
> bios0 at mainbus0: AT/286+ BIOS, date 05/16/08, BIOS32 rev. 0 @ 0xf0010, SMBIOS rev. 2.5 @ 0xf06e0 (37 entries)
> bios0: vendor American Megatrends Inc. version 1101 date 05/16/2008
> bios0: ASUSTeK Computer INC. 701
> acpi0 at bios0: rev 0
> acpi0: tables DSDT FACP APIC OEMB MCFG
> acpi0: wakeup devices P0P3(S4) P0P4(S4) P0P5(S4) P0P6(S4) P0P7(S4) MC97(S4) USB1(S3) USB2(S3) USB3(S3) USB4(S3) EUSB(S3)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: apic clock running at 70MHz
> ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus 4 (P0P3)
> acpiprt2 at acpi0: bus -1 (P0P5)
> acpiprt3 at acpi0: bus 1 (P0P6)
> acpiec0 at acpi0
> acpicpu0 at acpi0: C3, C2
> acpitz0 at acpi0: critical temperature 90 degC
> acpibat0 at acpi0: BAT0 model 701 serial type LION oem ASUS
> acpiac0 at acpi0: AC unit offline
> acpiasus0 at acpi0
> acpibtn0 at acpi0: LID_
> acpibtn1 at acpi0: SLPB
> acpibtn2 at acpi0: PWRB
> acpivideo0 at acpi0: VGA_
> acpivout0 at acpivideo0: CRTD
> acpivout1 at acpivideo0: TVOD
> acpivout2 at acpivideo0: LCDD
> bios0: ROM list: 0xc/0xf800!
> pci0 at mainbus0 bus 0: configuration mode 1 (bios)
> pchb0 at pci0 dev 0 function 0 Intel 82915GM Host rev 0x04
> vga1 at pci0 dev 2 function 0 Intel 82915GM Video rev 0x04
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> intagp0 at vga1
> agp0 at intagp0: aperture at 0xd000, size 0x1000
> inteldrm0 at vga1: apic 1 int 16 (irq 11)
> drm0 at inteldrm0
> Intel 82915GM Video rev 0x04 at pci0 dev 2 function 1 not configured
> azalia0 at pci0 dev 27 function 0 Intel 82801FB HD Audio rev 0x04: apic 1 int 16 (irq 11)
> azalia0: codecs: Realtek ALC662
> audio0 at azalia0