Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
Hi there,

I'm scratching my head. I've just migrated to a Supermicro chassis and at the same time gone from FreeBSD 9.0 to 9.1-RELEASE. The machine in question is running a ZFS mirror configuration on two ada devices (with an 8 GB gmirror carved out for swap). Since doing so I've been having strange dropouts on the drives; they just disappear from the bus like so:

(ada2:ahcich2:0:0:0): removing device entry
(aprobe0:ahcich2:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich2:0:0:0): CAM status: ATA Status Error
(aprobe0:ahcich2:0:0:0): ATA status: d1 (BSY DRDY SERV ERR), error: 04 (ABRT )
(aprobe0:ahcich2:0:0:0): RES: d1 04 ff ff ff ff ff ff ff ff ff
(aprobe0:ahcich2:0:0:0): Error 5, Retries exhausted
(aprobe0:ahcich2:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich2:0:0:0): CAM status: ATA Status Error
(aprobe0:ahcich2:0:0:0): ATA status: d1 (BSY DRDY SERV ERR), error: 04 (ABRT )
(aprobe0:ahcich2:0:0:0): RES: d1 04 ff ff ff ff ff ff ff ff ff
(aprobe0:ahcich2:0:0:0): Error 5, Retries exhausted

At first I thought it was a failing drive: one of the drives did this, and I limped along on a single drive for a week until I could get someone up to the rack to plug a third drive in. We resilvered the zpool onto the new device and ran with the failed drive still plugged in (but not responding to a reset on the ada bus with camcontrol) for a week or so. Then the new drive dropped out in exactly the same way, followed in short order by the remaining original drive!

After rebooting the machine, and observing all three drives probing and available, I resilvered the gmirror and zpool again on the two devices that I thought were reliable, but before the resilvering was completed the new drive dropped out again.

I'm scratching my head now. I can't imagine that it's a wiring problem, as they are all on individual SATA buses and individually cabled. SMART isn't reporting any drive issues either. :/

So, I'm wondering, is it a driver issue with 9.1-RELEASE? If I upgrade to 9-RELENG, would I expect that to resolve the problem? (Have there been any ada bus issues reported since last December?)

The hardware in question is:

ahci0: Intel Cougar Point AHCI SATA controller port 0xf050-0xf057,0xf040-0xf043,0xf030-0xf037,0xf020-0xf023,0xf000-0xf01f mem 0xdfb02000-0xdfb027ff irq 19 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ahcich0: AHCI channel at channel 0 on ahci0
ahcich1: AHCI channel at channel 1 on ahci0
ahcich2: AHCI channel at channel 2 on ahci0
ahcich3: AHCI channel at channel 3 on ahci0
ahcich4: AHCI channel at channel 4 on ahci0
ahcich5: AHCI channel at channel 5 on ahci0
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: WDC WD1000FYPS-01ZKB0 02.01B01 ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad4
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: WDC WD1000FYPS-01ZKB0 02.01B01 ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad6
ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
ada2: WDC WD1000FYPS-01ZKB0 02.01B01 ATA-8 SATA 2.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad8

Any ideas would be greatly welcomed.

Thanks,
Joe

___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
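For anyone reading the log in this message by hand: the value after "ATA status:" is a bit field, and d1 breaks down into exactly the four flags the kernel prints next to it. The following is only a minimal sketch in POSIX sh, assuming the conventional ATA status bit layout (0x80 down to 0x01); the function name decode_ata_status is made up for illustration, and the labels for bits that do not appear in the log (DF, DRQ, CORR, IDX) are my assumption of what the kernel would print.

```sh
#!/bin/sh
# Decode an ATA status byte given as two hex digits (e.g. "d1").
# decode_ata_status is a hypothetical helper, not a FreeBSD tool.
decode_ata_status() {
    status=$(( 0x$1 ))          # hex digits -> integer
    flags=""
    # value:name pairs, most significant bit first
    for pair in 128:BSY 64:DRDY 32:DF 16:SERV 8:DRQ 4:CORR 2:IDX 1:ERR; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $(( status & bit )) -ne 0 ]; then
            flags="${flags:+$flags }$name"
        fi
    done
    printf '%s\n' "$flags"
}

decode_ata_status d1    # prints: BSY DRDY SERV ERR
```

The same approach would apply to the error byte (04, shown as ABRT), with a different name table.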
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
What chassis is this?

- Original Message - From: Dr Josef Karthauser j...@karthauser.co.uk To: freebsd...@freebsd.org Cc: freebsd-stable@freebsd.org Sent: Thursday, July 18, 2013 8:29 AM Subject: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

[...] I've just migrated to a super micro chassis and at the same time gone from FreeBSD 9.0 to 9.1-RELEASE. [...]
Re: experience with 9.2-PRERELEASE
Is the SSD running at 6Gbps? If so, have you tried limiting the speed to 3Gbps?

Regards
Steve

- Original Message - From: John Reynolds john...@reynoldsnet.org To: sta...@freebsd.org Sent: Thursday, July 18, 2013 1:45 AM Subject: experience with 9.2-PRERELEASE

Hello all,

I have some feedback on the recently prepared 9.2-PRERELEASE snapshot. I've been trying like crazy to get the 9.x series installed on a brand new workstation I'm building. It consists of a brand new ASRock motherboard with the Z87 chipset and a Haswell i7-4770K processor.

I tried at first to install 9.1-RELEASE. It worked (after I learned a bit about the new installer; I have used FreeBSD forever but have been out of touch with this list and other goings-on for about a year due to workload), but upon booting I was getting timeout errors to the SSD that I used as the primary hard drive:

ahcich0: Timeout on slot xx port 0
ahcich0: is 0 cs ss e000 rs e000 tfd 40 serr cmd 0004df17

So then I figured I would try a more recent snapshot, hoping that something had been spotted and fixed already. I got the 9.2-PRERELEASE amd64 snapshot and tried to install it. However, I couldn't even get past the first screen of the install because of these messages:

ugen0.2: (Unknown) at usbus0 (disconnected)
uhub_reattach_port: could not allocate new device

and the keyboard was non-functional. It just sat there spewing these errors about once per second.

So, even though I'm having some sort of ahci timeout issue with this hardware and 9.1-RELEASE, it certainly appears that something has zig-zagged in the USB stack in this 9.2-PRERELEASE snapshot, because I can't even get through the install program. :( Same hardware, same everything. I also tried the 9.2-PRERELEASE i386 snapshot for giggles. Same result.

The final blow to my sanity was when I rolled back and tried to install 8.4-RELEASE, and sysinstall couldn't make the disk devices after I hit 'W' in the disk label editor to commit my changes.

So, I'm wondering, generically: are people having problems with SSDs in FreeBSD? On this hardware I have not tried using a traditional SATA disk yet. But I just thought I would report that 9.2-PRERELEASE can't even get through the install on this hardware, where 9.1 could.

-Jr
Re: [SOLVED] Re: Shutdown problem with an USB memory stick as ZFS cache device
On 17/07/2013 17:29, Julian H. Stacey wrote:

Maurizio Vairani wrote:

On 17/07/2013 11:50, Ronald Klop wrote:

On Wed, 17 Jul 2013 10:27:09 +0200, Maurizio Vairani maurizio.vair...@cloverinformatica.it wrote:

Hi all,

on a Compaq Presario laptop I have just installed the latest stable:

# uname -a
FreeBSD presario 9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #0: Tue Jul 16 16:32:39 CEST 2013 root@presario:/usr/obj/usr/src/sys/GENERIC amd64

To speed up compilation I have added a SanDisk memory stick to the pool, tank0, as a cache device with the command:

# zpool add tank0 cache /dev/da0

But when I shut down the laptop the process halts at this screen: http://www.dump-it.fr/freebsd-screen-shot/2f9169f18c7c77e52e873580f9c2d4bf.jpg.html and I need to press the power button for more than 4 seconds to switch off the laptop. The problem is always reproducible.

Does sysctl hw.usb.no_shutdown_wait=1 help?

Ronald.

Thank you Ronald, it works! In /boot/loader.conf I added the line

hw.usb.no_shutdown_wait=1

Maurizio

I wonder (from ignorance, as I don't use ZFS yet) whether that merely masks the symptom or cures the fault? Presumably one should use a ZFS command to disassociate whatever might have the cache open (in case something might need to be written out from the cache, if it was a writeable cache)?

I too had a USB shutdown problem (non-ZFS, now solved). Several people made useful comments on shutdown scripts etc., so I'm cross-referencing: http://lists.freebsd.org/pipermail/freebsd-mobile/2013-July/012803.html

Cheers, Julian

Probably it masks the symptom. Andriy Gapon hypothesizes a bug in the ZFS clean up code: http://lists.freebsd.org/pipermail/freebsd-fs/2013-July/017857.html

Surely one can use a startup script with the command:

zpool add tank0 cache /dev/da0

and a shutdown script with:

zpool remove tank0 /dev/da0

but this masks the symptom too. I prefer Ronald's solution because:

- is simpler: it adds only one line (hw.usb.no_shutdown_wait=1) to one file (/boot/loader.conf).
- is faster: the zpool add/remove commands take time, and "hw.usb.no_shutdown_wait=1" in /boot/loader.conf speeds up the shutdown process.
- is cleaner: the zpool add/remove command pair will fill up the tank0 pool history.

Regards
Maurizio
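The workaround settled on in this thread is a single line in /boot/loader.conf. As a rough sketch only, the helper below adds such a line without duplicating it if it is already there; ensure_line is a hypothetical name of my own, and the tunable and file path are the ones quoted in the messages above.

```sh
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# ensure_line is an illustrative helper, not an existing FreeBSD command.
ensure_line() {
    line=$1
    file=$2
    # -x: match whole lines, -F: fixed string, -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage on the laptop from the thread (as root):
#   sysctl hw.usb.no_shutdown_wait=1        # try it on the running system
#   ensure_line 'hw.usb.no_shutdown_wait=1' /boot/loader.conf
```

Running the helper twice leaves a single copy of the line, so it is safe to call from a provisioning script.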
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
Hi,

On 18 Jul 2013, at 08:29, Dr Josef Karthauser wrote:

[...] Since doing so I've been having strange drop outs on the drives; they just disappear from the bus [...]

Me too (over a long period, with various hardware).

There is a general problem with energy-saving drives: controllers don't understand them. Typically the drive decides to go into some power-saving mode, the controller wants to do some operation, the drive takes too long to come ready, and the controller decides the drive has gone away.

You have to persuade the controller to wait longer for the drive to come ready, and/or persuade the drive to stay awake. This isn't necessarily easy, e.g. the controller's ready wait may not be programmable.

(Or avoid such drives like the plague, life's too short.)

--
Bob Bishop
r...@gid.co.uk
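One way to gather evidence for or against the "drive went to sleep and the controller gave up" theory is to count how often each device disappears, and on which channel. The sketch below is only that: it assumes the kernel log uses the "removing device entry" wording shown in the original report, and drops_per_channel is a name invented here.

```sh
#!/bin/sh
# Tally "removing device entry" events per AHCI channel and device.
# Reads kernel log text on stdin; drops_per_channel is a hypothetical helper.
drops_per_channel() {
    # "(ada2:ahcich2:0:0:0): removing device entry" -> "ahcich2 ada2"
    sed -n 's/^(\([a-z]*[0-9]*\):\([a-z]*[0-9]*\):[0-9:]*): removing device entry.*/\2 \1/p' |
        sort | uniq -c | awk '{ print $2, $3, "drops=" $1 }'
}

# Usage (assumption: the messages are still in the kernel message buffer):
#   dmesg | drops_per_channel
```

If the drops follow a drive when it is moved to another port, the drive is the likelier suspect; if they stay with the port, the cabling, backplane or controller deserves a closer look.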
Seeing data corruption with g_multipath utility
Hi Steven,

Read/write errors are recorded when the active path of a geom_multipath device is pulled while I/O is running on a dataset created in the pool. The I/O on the dataset is generated with dd.

FreeBSD version: 9.0
Patches imported from stable 9: r229303, r234916

zpool status:

  pool: poola
 state: ONLINE
  scan: none requested
config:

        NAME                    STATE     READ WRITE CKSUM
        poola                   ONLINE       0     0     0
          mirror-0              ONLINE       0     0     0
            multipath/newdisk4  ONLINE       0     0     0
            multipath/newdisk2  ONLINE       0     0     0

errors: No known data errors

gmultipath status:

Name                 Status   Components
multipath/newdisk2   OPTIMAL  da7 (ACTIVE)
                              da2 (PASSIVE)
multipath/newdisk1   OPTIMAL  da6 (ACTIVE)
                              da1 (PASSIVE)
multipath/newdisk4   OPTIMAL  da3 (ACTIVE)
                              da4 (PASSIVE)
multipath/newdisk    OPTIMAL  da0 (ACTIVE)
                              da5 (PASSIVE)
multipath/newdisk3   OPTIMAL  da8 (ACTIVE)
                              da9 (PASSIVE)

zpool status after pulling the active path of the g_multipath device:

  pool: mypool1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
  scan: resilvered 27.2M in 0h0m with 0 errors on Thu Jul 4 19:47:44 2013
config:

        NAME                    STATE     READ WRITE CKSUM
        mypool1                 ONLINE       0     0     0
          mirror-0              ONLINE       0    12     0
            multipath/newdisk4  ONLINE       0    27     0
            multipath/newdisk2  ONLINE       0    12     0
        spares
          multipath/newdisk     AVAIL

errors: No known data errors

Are there any dependencies for the patches picked from stable 9, as mentioned above?

Sorry for posting this in several areas of the forum; it's just to find the right fit.

--
Thanks and regards,
Sowmya L
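When repeating the path-pull test, it can help to reduce each zpool status run to just the vdevs whose error counters are non-zero, so successive runs are easy to compare. This is a rough sketch under the assumption that the config section has whitespace-separated NAME, STATE, READ, WRITE and CKSUM columns as in the status output above; vdevs_with_errors is a name made up for this example.

```sh
#!/bin/sh
# Print vdevs from "zpool status" text (on stdin) whose READ, WRITE or
# CKSUM counter is non-zero. vdevs_with_errors is a hypothetical helper.
vdevs_with_errors() {
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
         NF >= 5 && ($3 + $4 + $5) > 0 {
             print $1, "read=" $3, "write=" $4, "cksum=" $5
         }'
}

# Usage (pool name taken from the report):
#   zpool status mypool1 | vdevs_with_errors
```

Lines such as the header row or the spare entry are skipped because their second field is not a vdev state or they have fewer than five fields.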
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
On 07/18/13 10:25, Bob Bishop wrote:

Me too (over a long period, with various hardware). There is a general problem with energy-saving drives that controllers don't understand them. Typically the drive decides to go into some power-saving mode, the controller wants to do some operation, the drive takes too long to come ready, the controller decides the drive has gone away. You have to persuade the controller to wait longer for the drive to come ready, and/or persuade the drive to stay awake. This isn't necessarily easy, eg the controller's ready wait may not be programmable. (Or avoid such drives like the plague, life's too short).

Perhaps they are WD Green drives? In that case, other than quoting Bob's suggestion about avoiding them, there's something you can do:
a) turn off the drives' power-saving features (this is done through a DOS utility you can download);
b) try different controllers and/or different OS releases.

You'll find a lot on this problem if you search the web. There's also a report of mine you can search on this ML, regarding FreeBSD specifically.

HTH.

bye
av.
Re: Help with filing a [maybe] ZFS/mmap bug.
on 17/07/2013 23:47 George Hartzell said the following: How should I move forward with this? Could you please try to reproduce this problem using a kernel built with INVARIANTS options? -- Andriy Gapon
Re: experience with 9.2-PRERELEASE
On 7/18/2013 12:36 AM, Steven Hartland wrote: Is the SSD running at 6Gbps? If so have you tried limiting the speed to 3Gbps? I would imagine so. It has a 6Gbps interface and the Z87 board does also--so I can only imagine it's trying to go as fast as possible by default. I will fiddle with the BIOS and see if I can limit it and see if that prevents the timeout problems I was having after the 9.1-R install. Somebody else suggested changing the USB settings in the BIOS to try and overcome the problem when using the snap of 9.2-PRERELEASE as well so I will try that too. Ultimately if nothing likes the SSD I will try a regular SATA drive as last resort. The drive is an Intel 520 Series 240Gb model, FWIW. Thanks! -Jr
Re: experience with 9.2-PRERELEASE
On Wed, Jul 17, 2013 at 08:49:39PM -0700, John Reynolds wrote: On 7/17/2013 5:48 PM, Glen Barber wrote: Hi John, Do you have a SATA drive you can try with this hardware? It would be useful to know if that works, or same errors, etc. Glen I do, and that was my plan for tomorrow. A bit under the weather today. I will most definitely report back any findings. Thanks for your reply! John, in addition to suggestions/replies from others, can you also try the 10.0-CURRENT snapshot? In particular, if your problem continues with a SATA drive, I am curious if the problem still exists in head/. http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/ Thanks. Glen
Re: experience with 9.2-PRERELEASE
You can use hint.ahcich.X.sata_rev to limit the speed. See the ahci(4) man page for more details.

- Original Message - From: John Reynolds john...@reynoldsnet.org To: Steven Hartland kill...@multiplay.co.uk Cc: sta...@freebsd.org Sent: Thursday, July 18, 2013 4:48 PM Subject: Re: experience with 9.2-PRERELEASE

On 7/18/2013 12:36 AM, Steven Hartland wrote: Is the SSD running at 6Gbps? If so have you tried limiting the speed to 3Gbps?

I would imagine so. It has a 6Gbps interface and the Z87 board does also--so I can only imagine it's trying to go as fast as possible by default. I will fiddle with the BIOS and see if I can limit it and see if that prevents the timeout problems I was having after the 9.1-R install. Somebody else suggested changing the USB settings in the BIOS to try and overcome the problem when using the snap of 9.2-PRERELEASE as well so I will try that too. Ultimately if nothing likes the SSD I will try a regular SATA drive as last resort. The drive is an Intel 520 Series 240Gb model, FWIW.

Thanks!

-Jr
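To make the hint concrete: as far as I understand ahci(4), a non-zero sata_rev caps the negotiated SATA revision, with 1, 2 and 3 standing for 1.5, 3 and 6 Gbps. The sketch below just prints the corresponding hint line; sata_rev_hint is a hypothetical helper, and channel 0 is chosen only because the timeouts in this thread were reported on ahcich0.

```sh
#!/bin/sh
# Print the device hint that caps an AHCI channel at a given link speed.
# $1 = channel number, $2 = speed in Gbps (1.5, 3 or 6).
# sata_rev_hint is an illustrative helper, not part of FreeBSD.
sata_rev_hint() {
    case "$2" in
        1.5) rev=1 ;;
        3)   rev=2 ;;
        6)   rev=3 ;;
        *)   echo "unsupported speed: $2" >&2; return 1 ;;
    esac
    printf 'hint.ahcich.%s.sata_rev=%s\n' "$1" "$rev"
}

sata_rev_hint 0 3    # prints: hint.ahcich.0.sata_rev=2
```

The printed line would typically go into /boot/loader.conf or /boot/device.hints and take effect at the next boot; which file is preferable on a given system is a judgment call, not something stated in this thread.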
Re: experience with 9.2-PRERELEASE
John, in addition to suggestions/replies from others, can you also try the 10.0-CURRENT snapshot? In particular, if your problem continues with a SATA drive, I am curious if the problem still exists in head/. http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/ Thanks. Yes, I can try that. I will try a basic install with -current and see if a) the USB problem exists b) if I am getting the timeout errors with the SSD before I go off and tweak speed settings / USB settings. -jr
Re: experience with 9.2-PRERELEASE
On Thu, Jul 18, 2013 at 08:48:28AM -0700, John Reynolds wrote: On 7/18/2013 12:36 AM, Steven Hartland wrote: Is the SSD running at 6Gbps? If so have you tried limiting the speed to 3Gbps? I would imagine so. It has a 6Gbps interface and the Z87 board does also--so I can only imagine it's trying to go as fast as possible by default. I will fiddle with the BIOS and see if I can limit it and see if that prevents the timeout problems I was having after the 9.1-R install. Somebody else suggested changing the USB settings in the BIOS to try and overcome the problem when using the snap of 9.2-PRERELEASE as well so I will try that too. Ultimately if nothing likes the SSD I will try a regular SATA drive as last resort. The drive is an Intel 520 Series 240Gb model, FWIW. FWIW, I had one of these in my laptop for about a year, and can confirm that it should work fine, at least on 10.0-CURRENT. Glen
Re: Help with filing a [maybe] ZFS/mmap bug.
Andriy Gapon writes: on 17/07/2013 23:47 George Hartzell said the following: How should I move forward with this? Could you please try to reproduce this problem using a kernel built with INVARIANTS options?

I added the INVARIANT_SUPPORT and INVARIANTS options to the GENERIC kernel, rebuilt it, and installed it; running through my test case generated a lot of invalid flac files.

I'm not sure what the options are/were supposed to do, though. It looks like they generally lead to KASSERTs, which lead to abort()'s. Nothing in /var/log/messages or on the console.

g.
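For reference, a kernel with these two options can be described by a small config file layered on top of GENERIC rather than by editing GENERIC itself. The sketch below writes such a file to a path of your choosing; the function name, the ident string and the file name INVARIANTS-TEST are all made up here, and the build commands in the comments assume a source tree in /usr/src on amd64. As I understand it, a KASSERT that fails in such a kernel panics the machine rather than aborting a process, so a run that finishes quietly suggests no assertion fired.

```sh
#!/bin/sh
# Write a minimal kernel config that adds the INVARIANTS options to GENERIC.
# write_invariants_kernconf and the ident below are hypothetical names.
write_invariants_kernconf() {
    cat > "$1" <<'EOF'
include GENERIC
ident GENERIC-INVARIANTS
options INVARIANT_SUPPORT
options INVARIANTS
EOF
}

# Usage (as root, paths are assumptions):
#   write_invariants_kernconf /usr/src/sys/amd64/conf/INVARIANTS-TEST
#   cd /usr/src
#   make buildkernel KERNCONF=INVARIANTS-TEST
#   make installkernel KERNCONF=INVARIANTS-TEST
```

Keeping the debug options in a separate file makes it easy to switch back to a stock GENERIC kernel once testing is done.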
Re: Help with filing a [maybe] ZFS/mmap bug.
Richard Todd writes: George Hartzell hartz...@alerce.com writes: Hi All, I have what I think is a ZFS related bug. [...] [summary: Picard seems to trigger an mmap consistency bug in ZFS]. [...] Anyway, what I'd suggest is the following: see if my patch for py-mutagen disabling the mmap() in those two functions lets you run picard reliably.

Removing the mmap support from those two routines seems to avoid the issue.

If so, then the issue is triggered by one or both of those two routines; hack them to print out the exact offsets used on each call and use that to try and code up a simple C++ test case. [...]

Your test case doesn't use mmap, so I assume that you've offered it up as a hint, not as something that's nearly done. The shell script in particular seems useful.

In my case I'd want to find a particular set of file size, offset, and insertion size that triggers the problem and code up a C/C++ equivalent of the mmap calls that py-mutagen does. Right?

I'm hesitant about that. I believe (and will try to prove) that the problem does not occur deterministically for a particular track between different test runs. I'm worried that it's not as simple as "using mmap to insert 27 bytes into a 1024-byte file at pos 42 causes corruption", but rather that it depends on a more complex set of interactions.

My next step will be to see if a track that has trouble in one run has trouble in another. If not, then I'm not sure that a simple test will be successful.

g.
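The "does the same track fail twice?" check described above comes down to intersecting two lists of failing files. A minimal sketch follows, assuming each run leaves behind a text file with one failing track name per line; how those lists are produced is not specified in the thread (something like running `flac -t` over each output file is one possibility), and repeat_failures is a name invented for this example.

```sh
#!/bin/sh
# Print the tracks that appear in both failure lists.
# $1, $2 = files with one failing track name per line (any order).
# repeat_failures is a hypothetical helper.
repeat_failures() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    sort -u "$1" > "$a"      # comm needs sorted input
    sort -u "$2" > "$b"
    comm -12 "$a" "$b"       # lines common to both runs
    rm -f "$a" "$b"
}

# Usage (file names are placeholders):
#   repeat_failures run1-bad.txt run2-bad.txt
```

An empty result across several pairs of runs would support the suspicion that the corruption is not tied to particular file sizes and offsets.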
Re: Help with filing a [maybe] ZFS/mmap bug.
I know nobody loves me-too emails, but I'll try :)

There's rtorrent, which uses mmap. I had cases (related to reboots) where big files (or average-sized files in torrents with many files) ended up with broken checksums for no good reason. The author of rtorrent, not very politely, always assumes that it is the other filesystems that are really broken, since rtorrent has simple logic (using mmap)... This post is pretty interesting: http://libtorrent.rakshasa.no/ticket/483

In the end I just switched to transmission.

Anyway, there really could be some weird/rare bugs with ZFS and mmap. Or just ZFS. I hope this helps at least to narrow the direction for potential bug hunting.

2013/7/18 George Hartzell hartz...@alerce.com

[...] I'm worried that it's not as simple as using mmap to insert 27 bytes into a 1024 bytes file at pos 42 causes corruption but rather that it depends on a more complex set of interactions. [...]

--
Regards,
Alexander Yerenkow
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
On 18 Jul 2013, at 13:07, Andrea Venturoli m...@netfence.it wrote:

> Perhaps they are WD Green drives?

They're WD RE2-GP 1 TB drives (model WD1000FYPS), not sure if that's green or not.

> In that case, other than quoting Bob's suggestion about avoiding them, there's something you can do:
> a) turn off the drives' power-saving features (this is done through a DOS utility you can download);
> b) try different controllers and/or different OS releases.

I'm committed to FreeBSD, as the machine is already rolled out and in a data centre ;).

> You'll find a lot on this problem if you search the web. There's also a report of mine you can search on this ML, regarding FreeBSD specifically.

I'll see if I can find it. Thanks.

Joe
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
On 18 Jul 2013, at 08:33, Steven Hartland kill...@multiplay.co.uk wrote:

> What chassis is this?

Hey Steven,

It's a Supermicro CSE-813MTQ-350CB.

Cheers,
Joe
Re: Help with filing a [maybe] ZFS/mmap bug.
On Thu, Jul 18, 2013 at 11:40:51AM -0700, George Hartzell wrote:

> Removing the mmap support from those two routines seems to avoid the issue.

Aha.

> > If so, then the issue is triggered by one or both of those two routines; hack them to print out the exact offsets used on each call and use that to try and code up a simple C++ test case. [...]
>
> Your test case doesn't use mmap, I assume that you've offered it up as a hint, not as something that's nearly done. The shell script in particular seems useful.

Um, go look at gen4.cpp again. It uses mmap(). The insert_bytes and delete_bytes functions should work the same way as (the mmap-using path of) the functions of the same name in py-mutagen.

> In my case I'd want to find a particular set of file size, offset, and insertion size that triggers the problem and code up a c/c++ equiv. of the mmap calls that py-mutagen does. Right?

Yeah.
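For anyone following along, here is a minimal Python sketch of the kind of test being discussed: grow a file, shift its tail through a shared mapping, and compare the result on disk with what a plain in-memory insert would give. It is only loosely modeled on the mmap-using path described above and is not py-mutagen's actual code or gen4.cpp; the names insert_bytes_mmap and check_once and the fill pattern are made up for illustration.

```python
import mmap
import os
import tempfile


def insert_bytes_mmap(path, size, offset):
    """Insert `size` zero bytes at `offset` (hypothetical helper).

    The file is grown first, then the tail is shifted towards the end
    through a shared memory mapping.
    """
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        filesize = f.tell()
        movesize = filesize - offset
        # Grow the file so the mapping can cover the new length.
        f.write(b"\x00" * size)
        f.flush()
        m = mmap.mmap(f.fileno(), filesize + size)
        try:
            # memmove-style shift of the tail; overlapping ranges are fine.
            m.move(offset + size, offset, movesize)
            # Zero the gap so the expected result is well defined.
            m[offset:offset + size] = b"\x00" * size
            m.flush()
        finally:
            m.close()


def check_once(filesize, offset, size):
    """Return True if the file read back matches the expected contents."""
    original = bytes(i % 251 for i in range(filesize))
    expected = original[:offset] + b"\x00" * size + original[offset:]
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(original)
        insert_bytes_mmap(path, size, offset)
        with open(path, "rb") as f:
            return f.read() == expected
    finally:
        os.unlink(path)
```

Calling check_once(1024, 42, 27) exercises the "27 bytes into a 1024 byte file at pos 42" case mentioned earlier in the thread. If the problem really is non-deterministic, a single passing call proves little; looping over many (filesize, offset, size) triples on the affected dataset and recording which ones fail would be the next step.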
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
Hi--

On Jul 18, 2013, at 12:13 PM, Dr Josef Karthauser j...@karthauser.co.uk wrote:
> On 18 Jul 2013, at 13:07, Andrea Venturoli m...@netfence.it wrote:
> > Perhaps they are WD Green drives?
>
> They're WD RE2-GP 1 TB drives (model WD1000FYPS), not sure if that's green or not.

Yes, those are WDC's Green drives, although they are also the higher grade version as compared to standard desktop drives which are supposed to have firmware which plays nice with RAID (TLER, time-limited error recovery).

Updating the firmware and increasing the timeout before these spin down automagically is likely to help, but as Andrea noted, such drives do have quite a history of timeout problems due to excessive head parking and their power conservation attempts.

Regards,
--
-Chuck
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
On 18 Jul 2013, at 20:31, Charles Swiger cswi...@mac.com wrote:
> Hi--
>
> On Jul 18, 2013, at 12:13 PM, Dr Josef Karthauser j...@karthauser.co.uk wrote:
> > On 18 Jul 2013, at 13:07, Andrea Venturoli m...@netfence.it wrote:
> > > Perhaps they are WD Green drives?
> >
> > They're WD RE2-GP 1 TB drives (model WD1000FYPS), not sure if that's green or not.
>
> Yes, those are WDC's Green drives, although they are also the higher grade version as compared to standard desktop drives which are supposed to have firmware which plays nice with RAID (TLER, time-limited error recovery).
>
> Updating the firmware and increasing the timeout before these spin down automagically is likely to help, but as Andrea noted, such drives do have quite a history of timeout problems due to excessive head parking and their power conservation attempts.

We also wondered whether it was the motherboard, and so we've replaced it! Hope that that works! But, from what's being said here, it looks like that might not be the case. :/ Although, we've been up for 5 days now with no recurrences of the previous issue.

Joe
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
On 18 Jul 2013, at 20:31, Charles Swiger cswi...@mac.com wrote:
> On Jul 18, 2013, at 12:13 PM, Dr Josef Karthauser j...@karthauser.co.uk wrote:
> > On 18 Jul 2013, at 13:07, Andrea Venturoli m...@netfence.it wrote:
> > > Perhaps they are WD Green drives?
> >
> > They're WD RE2-GP 1 TB drives (model WD1000FYPS), not sure if that's green or not.
>
> Yes, those are WDC's Green drives, although they are also the higher grade version as compared to standard desktop drives which are supposed to have firmware which plays nice with RAID (TLER, time-limited error recovery).
>
> Updating the firmware and increasing the timeout before these spin down automagically is likely to help, but as Andrea noted, such drives do have quite a history of timeout problems due to excessive head parking and their power conservation attempts.

They're currently on firmware 02.01B01, btw. Not sure if that's the latest or not.

Joe
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
----- Original Message -----
From: Dr Josef Karthauser j...@karthauser.co.uk

> On 18 Jul 2013, at 08:33, Steven Hartland kill...@multiplay.co.uk wrote:
> > What chassis is this?
>
> Hey Steven, It's a Supermicro CSE-813MTQ-350CB.

We've seen issues on Supermicro chassis before which cause timeouts and, in extreme cases, device drops. If you can, try wiring the disks up directly to the MB via SATA cables, bypassing the hotswap midplane, and see if that helps.

Regards
Steve
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
On 07/18/13 21:13, Dr Josef Karthauser wrote:

> > b) try different controllers and/or different OS releases.
>
> I'm committed to FreeBSD, as the machine is already rolled out and in a data centre ;).

I said different OS releases, not different OS! I wouldn't say such a blasphemy :)

bye
av.
Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?
On 07/18/13 21:31, Charles Swiger wrote:

> Updating the firmware and increasing the timeout before these spin down automagically is likely to help, but as Andrea noted, such drives do have quite a history of timeout problems due to excessive head parking and their power conservation attempts.

Just for the record, I've been using them for several months without a hitch; it's just a matter of finding the correct settings/firmware/OS version/controller. This is to say you should be able to get them to work, although you might require some luck (or some sort of divination).

bye
av.
Re: experience with 9.2-PRERELEASE
On 7/18/2013 8:49 AM, Glen Barber wrote:
> On Wed, Jul 17, 2013 at 08:49:39PM -0700, John Reynolds wrote:
> > today. I will most definitely report back any findings. Thanks for your reply!
>
> John, in addition to suggestions/replies from others, can you also try the 10.0-CURRENT snapshot? In particular, if your problem continues with a SATA drive, I am curious if the problem still exists in head/.
>
> http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/
>
> Thanks.

Thanks to Glen and Steve and all those who replied to my initial posting. I am quite pleased, after having been figuratively gone from (but still running) FreeBSD for coming up on 2 years (hey--when it's so rock solid, it's easy when you don't have to ask questions :), that I can come back to these lists and still find lots of people willing to help! You guys (and gals) are awesome!

Anyway, it seems I have solved my problems with this new hardware. There were really two issues I bumped into. After installing 9.1-R I saw this timeout problem on the ahcich1 device; it would hang for about 5 minutes and then continue to boot. Then I tried 9.2-PRE, but there the keyboard was not recognized and I kept getting infinite

ugen0.2: Unknown . (disconnected)

errors at the install screen. One person said to fiddle in the BIOS with the USB settings. I went into the BIOS and disabled "Intel USB 3.0 Mode" support (this is a very new motherboard with USB 3.0). That fixed the issue! For those curious about -current, I also tried to install that and it experienced the same USB problems until I disabled this 3.0 mode.

Once I got past that and installed 9.2-PRE again, I let it time out again during boot. Then I looked at the dmesg output much more closely than I could have during the boot from DVD. It turns out that I only THOUGHT it was a timeout to the SSD. The Intel Series 520 SSD was NOT the issue. I was getting a timeout on ahcich1, which from the dmesg output was my DVD burner! The SSD was on ahcich5. The only difference here is that, by sheer happenstance, I connected the SSD to the SATA port owned by the Intel Z87 (Lynx Point) controller. The DVD burner (which is probably 5 years old) was hooked to ASRock's own ASMedia ASM1061 controller. Apparently this controller doesn't play well with my older DVD burner. I switched ports and put both on the Intel controller and BOOM! Success! No timeouts, no funny business. Nothing.

At this point I don't mind at all turning off the USB 3.0 mode, because I have yet to actually see a 3.0 device in the wild and I certainly won't own one anytime soon. :) So it's kind of a moot point. I'm sure the USB developers will see this problem and tackle it in time. I'm just thrilled to be booting on this new H/W!

Thanks again to all on the list that replied!

-Jr
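The mix-up described above (blaming the SSD for a timeout that belonged to the DVD burner's channel) is easy to repeat, so here is a small Python sketch that pairs each ahcichN channel in saved dmesg text with the device attached to it and lists the channels that logged a timeout. The sample text is invented to mirror this report, and the two line formats it matches ("<dev> at ahcichN bus ..." and "ahcichN: Timeout ...") are assumptions about what the boot messages look like; adjust the patterns to your actual output.

```python
import re

# Assumed formats; check them against your own dmesg before relying on this.
ATTACH_RE = re.compile(r"^(?P<dev>[a-z]+\d+) at (?P<chan>ahcich\d+) bus \d+")
TIMEOUT_RE = re.compile(r"^(?P<chan>ahcich\d+): Timeout")


def devices_by_channel(dmesg_text):
    """Map each AHCI channel name to the sorted device names attached to it."""
    mapping = {}
    for line in dmesg_text.splitlines():
        m = ATTACH_RE.match(line.strip())
        if m:
            mapping.setdefault(m.group("chan"), set()).add(m.group("dev"))
    return {chan: sorted(devs) for chan, devs in mapping.items()}


def timed_out_channels(dmesg_text):
    """Return the sorted channel names that reported at least one timeout."""
    found = set()
    for line in dmesg_text.splitlines():
        m = TIMEOUT_RE.match(line.strip())
        if m:
            found.add(m.group("chan"))
    return sorted(found)


# Invented sample mirroring the report: SSD on ahcich5, DVD burner on ahcich1.
SAMPLE = """\
ada0 at ahcich5 bus 0 scbus5 target 0 lun 0
cd0 at ahcich1 bus 0 scbus1 target 0 lun 0
ahcich1: Timeout on slot 0 port 0
"""

if __name__ == "__main__":
    by_chan = devices_by_channel(SAMPLE)
    for chan in timed_out_channels(SAMPLE):
        print(chan, "->", ", ".join(by_chan.get(chan, ["(no device seen)"])))
```

On a live system the text would come from /var/run/dmesg.boot or the dmesg command rather than from SAMPLE.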
Re: experience with 9.2-PRERELEASE
On Thu, Jul 18, 2013 at 01:28:10PM -0700, John Reynolds wrote:

> [...]
> I switched ports and put both on the Intel controller and BOOM! Success! No timeouts, no funny business. Nothing. [...] I'm sure the USB developers will see this problem and tackle it in time. I'm just thrilled to be booting on this new H/W! Thanks again to all on the list that replied!

Great to hear. Thank you for trying different suggestions, and reporting back the results. I'm glad things worked out (though, I'm still a bit worried about the xhci(4) in your case...).

Glen
Re: experience with 9.2-PRERELEASE
On Thu, Jul 18, 2013 at 3:28 PM, John Reynolds john...@reynoldsnet.org wrote:

> One person said to fiddle in the BIOS with the USB settings. I went into the BIOS and disabled Intel USB 3.0 Mode support (this is a very new motherboard with USB 3.0). That fixed the issue! For those curious about -current, I also tried to install that and it experienced the same USB problems until I disabled this 3.0 mode.

This is worth pursuing on freebsd-usb@ or by filing a PR. The only way it gets fixed is if the right people know about it.

--
Adam Vande More
9.2PRERELEASE ZFS panic in lzjb_compress
Hi,

Running 9.2-PRERELEASE #19 r253313 I got the following panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 22; apic id = 46
fault virtual address   = 0xff827ebca30c
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x81983055
stack pointer           = 0x28:0xffcf75bd60a0
frame pointer           = 0x28:0xffcf75bd68f0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (zio_write_issue_hig)
trap number             = 12
panic: page fault
cpuid = 22
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a/frame 0xffcf75bd5b30
kdb_backtrace() at kdb_backtrace+0x37/frame 0xffcf75bd5bf0
panic() at panic+0x1ce/frame 0xffcf75bd5cf0
trap_fatal() at trap_fatal+0x290/frame 0xffcf75bd5d50
trap_pfault() at trap_pfault+0x211/frame 0xffcf75bd5de0
trap() at trap+0x344/frame 0xffcf75bd5fe0
calltrap() at calltrap+0x8/frame 0xffcf75bd5fe0
--- trap 0xc, rip = 0x81983055, rsp = 0xffcf75bd60a0, rbp = 0xffcf75bd68f0 ---
lzjb_compress() at lzjb_compress+0x185/frame 0xffcf75bd68f0
zio_compress_data() at zio_compress_data+0x92/frame 0xffcf75bd6920
zio_write_bp_init() at zio_write_bp_init+0x24b/frame 0xffcf75bd6970
zio_execute() at zio_execute+0xc3/frame 0xffcf75bd69b0
taskqueue_run_locked() at taskqueue_run_locked+0x74/frame 0xffcf75bd6a00
taskqueue_thread_loop() at taskqueue_thread_loop+0x46/frame 0xffcf75bd6a20
fork_exit() at fork_exit+0x11f/frame 0xffcf75bd6a70
fork_trampoline() at fork_trampoline+0xe/frame 0xffcf75bd6a70
--- trap 0, rip = 0, rsp = 0xffcf75bd6b30, rbp = 0 ---

lzjb_compress+0x185 corresponds to line 85 in

80          cpy = src - offset;
81          if (cpy >= (uchar_t *)s_start && cpy != src &&
82              src[0] == cpy[0] && src[1] == cpy[1] && src[2] == cpy[2]) {
83                  *copymap |= copymask;
84                  for (mlen = MATCH_MIN; mlen < MATCH_MAX; mlen++)
85                          if (src[mlen] != cpy[mlen])
86                                  break;
87                  *dst++ = ((mlen - MATCH_MIN) << (NBBY - MATCH_BITS)) |
88                      (offset >> NBBY);
89                  *dst++ = (uchar_t)offset;

I think it's the first time I've seen this panic. It happened while doing a send/receive. I have two pools with lzjb compression; I don't know which of these pools caused the problem, but one of them was the source of the send/receive.

I only have a textdump but I'm happy to try to provide more information that could help anyone look into this.

Thanks
Olivier
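For readers who don't have lzjb.c open, here is a rough Python rendering of what the quoted lines 84 to 89 do: extend a match byte by byte, then pack the match length and back-offset into two output bytes. The constant values (MATCH_BITS = 6, MATCH_MIN = 3, NBBY = 8) and the decode step are written from memory of lzjb.c and have not been checked against r253313, so treat them as assumptions. This only illustrates the loop; it says nothing about why the read faulted.

```python
NBBY = 8                                          # bits per byte
MATCH_BITS = 6                                    # assumed value
MATCH_MIN = 3                                     # assumed value
MATCH_MAX = (1 << MATCH_BITS) + (MATCH_MIN - 1)   # 66 with the values above
OFFSET_MASK = (1 << (16 - MATCH_BITS)) - 1        # 10-bit back-offset


def match_length(buf, src, cpy):
    """Extend a match as in lines 84-86.

    The first MATCH_MIN bytes are assumed to have been compared already
    (lines 81-82). Where the C code would read past the buffer, Python
    raises IndexError instead.
    """
    mlen = MATCH_MIN
    while mlen < MATCH_MAX:
        if buf[src + mlen] != buf[cpy + mlen]:    # counterpart of line 85
            break
        mlen += 1
    return mlen


def encode_match(mlen, offset):
    """Pack (mlen, offset) into two bytes as in lines 87-89."""
    assert MATCH_MIN <= mlen <= MATCH_MAX and 0 < offset <= OFFSET_MASK
    first = ((mlen - MATCH_MIN) << (NBBY - MATCH_BITS)) | (offset >> NBBY)
    return bytes([first, offset & 0xFF])


def decode_match(pair):
    """Inverse of encode_match, as the decompressor is assumed to do it."""
    mlen = (pair[0] >> (NBBY - MATCH_BITS)) + MATCH_MIN
    offset = ((pair[0] << NBBY) | pair[1]) & OFFSET_MASK
    return mlen, offset
```

With buf = b"abcdefabcdefZ", a match starting at src = 6 against cpy = 0 extends to six bytes before the "Z" stops it, and the pair (6, 6) survives a round trip through encode_match and decode_match.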
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Thu, 04-Jul-2013 at 19:25:28 +0200, Konstantin Belousov wrote:

> On Thu, Jul 04, 2013 at 04:29:19PM +0200, Andre Albsmeier wrote:
> > OK, patch is applied. I will reboot the machine later and see what happens tomorrow in the morning. However, it might take a few days since the last 2 weeks all was fine.
> >
> > BTW, should this patch be used in general or is it just for debugging? My understanding is that it is something which could stay in the code...
>
> Patch is to improve debugging. I will probably commit it after the issue is closed. The argument against the commit is that the change imposes a small performance penalty due to the save and restore of %ebp (I doubt that this is measurable by any means). Also, arguably, such a change should be done for all functions in support.s, but bcopy() is the hot spot.

Thanks to this patch, we (you ;-) were able to track down the problem.

So how are we going to deal with this debugging patch itself? My suggestion would be to #ifdef it somehow, so that on one hand it will be available in future (and with bcopy being used a lot, the probability is high that it might help in other places), and on the other hand it won't steal cycles during normal use.

-Andre