Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Dr Josef Karthauser
Hi there,

I'm scratching my head. I've just migrated to a Supermicro chassis and at the
same time gone from FreeBSD 9.0 to 9.1-RELEASE.

The machine in question is running a ZFS mirror configuration on two ada
devices (with an 8 GB gmirror carved out for swap).

Since doing so I've been having strange dropouts on the drives; they just
disappear from the bus like so:

(ada2:ahcich2:0:0:0): removing device entry
(aprobe0:ahcich2:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich2:0:0:0): CAM status: ATA Status Error
(aprobe0:ahcich2:0:0:0): ATA status: d1 (BSY DRDY SERV ERR), error: 04 (ABRT )
(aprobe0:ahcich2:0:0:0): RES: d1 04 ff ff ff ff ff ff ff ff ff
(aprobe0:ahcich2:0:0:0): Error 5, Retries exhausted
(aprobe0:ahcich2:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich2:0:0:0): CAM status: ATA Status Error
(aprobe0:ahcich2:0:0:0): ATA status: d1 (BSY DRDY SERV ERR), error: 04 (ABRT )
(aprobe0:ahcich2:0:0:0): RES: d1 04 ff ff ff ff ff ff ff ff ff
(aprobe0:ahcich2:0:0:0): Error 5, Retries exhausted
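
For reference, the status and error bytes in the log above can be decoded by
hand. The following is only a sketch: the names BSY, DRDY, SERV, ERR and ABRT
are the ones the kernel printed above, while the names given to the remaining
bits are assumptions based on common ATA usage.

```python
# Bit tables for the ATA status and error registers, highest bit first.
# Only BSY, DRDY, SERV, ERR and ABRT are confirmed by the log above;
# the other names are assumed from common ATA naming.
STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "SERV"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
    (0x08, "MCR"), (0x04, "ABRT"), (0x02, "NM"), (0x01, "ILI"),
]


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table if value & mask]


if __name__ == "__main__":
    # 0xd1 = 0x80 + 0x40 + 0x10 + 0x01
    print("status d1:", " ".join(decode(0xD1, STATUS_BITS)))
    print("error  04:", " ".join(decode(0x04, ERROR_BITS)))
```

So the drive is reporting busy and error at the same time, with the NOP probe
command aborted.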


At first I thought it was a failing drive - one of the drives did this, and I
limped on a single drive for a week until I could get someone up to the rack to
plug a third drive in.  We resilvered the zpool onto the new device and ran
with the failed drive still plugged in (but not responding to a reset on the
ada bus with camcontrol) for a week or so.

Then, the new drive dropped out in exactly the same way, followed in short 
order by the remaining original drive!!!

After rebooting the machine, and observing all three drives probing and
available, I resilvered the gmirror and zpool again on the two devices
that I thought were reliable, but before the resilvering was completed the new
drive dropped out again.

I'm scratching my head now. I can't imagine that it's a wiring problem, as they 
are all on individual SATA buses and individually cabled.

SMART isn't reporting any drive issues either... :/

So, I'm wondering, is it a driver issue with 9.1-RELEASE? If I upgrade to
9-RELENG would I expect that to resolve the problem?  (Have there been any
ada bus issues reported since last December?)

The hardware in question is:

ahci0: Intel Cougar Point AHCI SATA controller port 
0xf050-0xf057,0xf040-0xf043,0xf030-0xf037,0xf020-0xf023,0xf000-0xf01f mem 
0xdfb02000-0xdfb027ff irq 19 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ahcich0: AHCI channel at channel 0 on ahci0
ahcich1: AHCI channel at channel 1 on ahci0
ahcich2: AHCI channel at channel 2 on ahci0
ahcich3: AHCI channel at channel 3 on ahci0
ahcich4: AHCI channel at channel 4 on ahci0
ahcich5: AHCI channel at channel 5 on ahci0
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: WDC WD1000FYPS-01ZKB0 02.01B01 ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad4
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: WDC WD1000FYPS-01ZKB0 02.01B01 ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad6
ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
ada2: WDC WD1000FYPS-01ZKB0 02.01B01 ATA-8 SATA 2.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad8


Any ideas would be greatly welcomed.

Thanks,
Joe

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Steven Hartland

What chassis is this?

- Original Message - 
From: Dr Josef Karthauser j...@karthauser.co.uk

To: freebsd...@freebsd.org
Cc: freebsd-stable@freebsd.org
Sent: Thursday, July 18, 2013 8:29 AM
Subject: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?


[...]



This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 


In the event of misdirection, illegible or incomplete transmission please 
telephone +44 845 868 1337
or return the E.mail to postmas...@multiplay.co.uk.



Re: experience with 9.2-PRERELEASE

2013-07-18 Thread Steven Hartland

Is the SSD running at 6Gbps? If so have you tried limiting the speed
to 3Gbps?

   Regards
   Steve
- Original Message - 
From: John Reynolds john...@reynoldsnet.org

To: sta...@freebsd.org
Sent: Thursday, July 18, 2013 1:45 AM
Subject: experience with 9.2-PRERELEASE


Hello all, I have some feedback for the recently prepared snapshot of 
9.2-RELEASE.


I've been trying like crazy to get the 9.x series code installed on a
brand new workstation I'm building. It consists of a brand new ASRock
motherboard and Haswell i7-4770K processor, Z87 chipset. I tried at
first to install 9.1-RELEASE. It worked (after I learned a bit about the
new installer--I have used FreeBSD forever but have been out of it
in terms of this list and other goings-on for about a year due to
workload), but upon booting I was getting timeout errors


ahcich0: Timeout on slot xx port 0
ahcich0: is 0 cs  ss e000 rs e000 tfd 40 serr 
 cmd 0004df17


to the SSD that I used as the primary hard drive. So, then I figured I 
would try a more recent snapshot hoping that something had been spotted 
and fixed already. I got the 9.2-PRERELEASE amd64 snapshot and tried to 
install it. However, I couldn't even get past the first screen of the 
install because of these messages:


ugen0.2: (Unknown) at usbus0 (disconnected)
uhub_reattach_port: could not allocate new device

and the keyboard was non-functional. It just sat there spewing these 
errors about 1 per second.


So, even though I'm having some sort of ahci timeout issue with this
hardware and 9.1-RELEASE, it certainly appears that something has
zig-zagged in the USB stack in this 9.2-PRERELEASE snapshot because I
can't even get through the install program. :(. Same hardware, same
everything.


I also tried the 9.2-PRERELEASE i386 snapshot for giggles. Same result.

The final blow to my sanity was when I rolled back and tried to install
8.4-RELEASE and sysinstall couldn't make the disk devices after I hit
'W' in the disk label editor to commit my changes. So, I'm wondering
generally: are people having problems with SSDs in FreeBSD? In this
hardware I have not tried using a traditional SATA disk yet.


But just thought I would report that 9.2-PRERELEASE can't even get 
through install on this H/W where 9.1 could.


-Jr



Re: [SOLVED] Re: Shutdown problem with an USB memory stick as ZFS cache device

2013-07-18 Thread Maurizio Vairani

On 17/07/2013 17:29, Julian H. Stacey wrote:

Maurizio Vairani wrote:

On 17/07/2013 11:50, Ronald Klop wrote:

On Wed, 17 Jul 2013 10:27:09 +0200, Maurizio Vairani
maurizio.vair...@cloverinformatica.it  wrote:


Hi all,


on a Compaq Presario laptop I have just installed the latest stable


# uname -a

FreeBSD presario 9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #0: Tue Jul 16
16:32:39 CEST 2013 root@presario:/usr/obj/usr/src/sys/GENERIC  amd64


To speed up compilation I have added a SanDisk memory stick to the pool,
tank0, as a cache device with the command:


# zpool add tank0 cache /dev/da0


But when I shut down the laptop, the process halts at this screen:


http://www.dump-it.fr/freebsd-screen-shot/2f9169f18c7c77e52e873580f9c2d4bf.jpg.html



and I need to press the power button for more than 4 seconds to
switch off the laptop.

The problem is always reproducible.

Does sysctl hw.usb.no_shutdown_wait=1 help?

Ronald.

Thank you Ronald, it works!

In /boot/loader.conf I added the line
hw.usb.no_shutdown_wait=1

Maurizio

I wonder (from ignorance as I don't use ZFS yet),
if that merely masks the symptom or cures the fault?

Presumably one should use a ZFS command to disassociate whatever
might have the cache open ?  (in case something might need to be
written out from cache, if it was a writeable cache ?)

I too had a USB shutdown problem (non ZFS, now solved)  several people
made useful comments on shutdown scripts etc, so I'm cross referencing:

http://lists.freebsd.org/pipermail/freebsd-mobile/2013-July/012803.html

Cheers,
Julian

Probably it masks the symptom. Andriy Gapon hypothesizes a bug in the
ZFS cleanup code:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-July/017857.html

Of course one can use a startup script with the command:
zpool add tank0 cache /dev/da0
and a shutdown script with:
zpool remove tank0 /dev/da0
but this masks the symptom too.

I prefer Ronald's solution because:
- it is simpler: it adds only one line (hw.usb.no_shutdown_wait=1) to one
file (/boot/loader.conf).
- it is faster: the zpool add/remove commands take time, and
"hw.usb.no_shutdown_wait=1" in /boot/loader.conf speeds up the shutdown
process.
- it is cleaner: the zpool add/remove command pair will fill up the tank0
pool history.


Regards
Maurizio


Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Bob Bishop
Hi,

On 18 Jul 2013, at 08:29, Dr Josef Karthauser wrote:

 [...]

Me too (over a long period, with various hardware).

There is a general problem with energy-saving drives: controllers don't
understand them. Typically the drive decides to go into some power-saving mode,
the controller wants to do some operation, the drive takes too long to come
ready, and the controller decides the drive has gone away.

You have to persuade the controller to wait longer for the drive to come ready,
and/or persuade the drive to stay awake. This isn't necessarily easy, e.g. the
controller's ready wait may not be programmable.

(Or avoid such drives like the plague, life's too short).

--
Bob Bishop
r...@gid.co.uk






Seeing data corruption with g_multipath utility

2013-07-18 Thread Sowmya L
Hi Steven,

Read/write errors are recorded when an active path of a geom_multipath
device is pulled while I/O is running on a dataset created in the pool.
The I/O on the dataset is generated with dd.

FreeBSD version: 9.0
Patches imported from stable/9: r229303, r234916

zpool status:

  pool: poola
 state: ONLINE
  scan: none requested
config:

        NAME                    STATE     READ WRITE CKSUM
        poola                   ONLINE       0     0     0
          mirror-0              ONLINE       0     0     0
            multipath/newdisk4  ONLINE       0     0     0
            multipath/newdisk2  ONLINE       0     0     0

errors: No known data errors

gmultipath status:

              Name   Status  Components
multipath/newdisk2  OPTIMAL  da7 (ACTIVE)
                             da2 (PASSIVE)
multipath/newdisk1  OPTIMAL  da6 (ACTIVE)
                             da1 (PASSIVE)
multipath/newdisk4  OPTIMAL  da3 (ACTIVE)
                             da4 (PASSIVE)
 multipath/newdisk  OPTIMAL  da0 (ACTIVE)
                             da5 (PASSIVE)
multipath/newdisk3  OPTIMAL  da8 (ACTIVE)
                             da9 (PASSIVE)

zpool status after pulling the active path of the g_multipath device:

  pool: mypool1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
  scan: resilvered 27.2M in 0h0m with 0 errors on Thu Jul  4 19:47:44 2013
config:

        NAME                    STATE     READ WRITE CKSUM
        mypool1                 ONLINE       0     0     0
          mirror-0              ONLINE       0    12     0
            multipath/newdisk4  ONLINE       0    27     0
            multipath/newdisk2  ONLINE       0    12     0
        spares
          multipath/newdisk     AVAIL

errors: No known data errors

Are there any dependencies for the patches picked from stable/9 as
mentioned above?
Sorry for posting this in several places; I'm just trying to find the right
fit.

-- 
Thanks & Regards,
Sowmya L


Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Andrea Venturoli

On 07/18/13 10:25, Bob Bishop wrote:


Me too (over a long period, with various hardware).

There is a general problem with energy-saving drives: controllers don't
understand them. Typically the drive decides to go into some power-saving mode,
the controller wants to do some operation, the drive takes too long to come
ready, and the controller decides the drive has gone away.

You have to persuade the controller to wait longer for the drive to come ready,
and/or persuade the drive to stay awake. This isn't necessarily easy, e.g. the
controller's ready wait may not be programmable.

(Or avoid such drives like the plague, life's too short).


Perhaps they are WD Green drives?

In that case, other than quoting Bob's suggestion about avoiding them, 
there's something you can do:
a) turn off the drives' power-saving features (this is done through a 
DOS utility you can download);

b) try different controllers and/or different OS releases.

You'll find a lot on this problem if you search the web.
There's also a report of mine you can search on this ML, regarding 
FreeBSD specifically.


HTH.

 bye
av.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-18 Thread Andriy Gapon
on 17/07/2013 23:47 George Hartzell said the following:
 How should I move forward with this?

Could you please try to reproduce this problem using a kernel built with
INVARIANTS options?
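
For anyone following along, a kernel with these checks enabled can be described
with a small custom configuration file. This is only a sketch: the file name
and ident DEBUGZFS are made up for illustration, and the path assumes an amd64
source tree under /usr/src.

```
# /usr/src/sys/amd64/conf/DEBUGZFS   (hypothetical name and path)
include         GENERIC
ident           DEBUGZFS

# Enable the kernel's internal consistency checks.
options         INVARIANTS
options         INVARIANT_SUPPORT
```

It would then be built and installed in the usual way, e.g. with
"make buildkernel KERNCONF=DEBUGZFS" followed by
"make installkernel KERNCONF=DEBUGZFS" from /usr/src. With INVARIANTS enabled,
a failed KASSERT panics the kernel rather than being compiled out.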

-- 
Andriy Gapon


Re: experience with 9.2-PRERELEASE

2013-07-18 Thread John Reynolds

On 7/18/2013 12:36 AM, Steven Hartland wrote:

Is the SSD running at 6Gbps? If so have you tried limiting the speed
to 3Gbps?

I would imagine so. It has a 6Gbps interface and the Z87 board does
also--so I can only imagine it's trying to go as fast as possible by 
default. I will fiddle with the BIOS and see if I can limit it and see 
if that prevents the timeout problems I was having after the 9.1-R 
install. Somebody else suggested changing the USB settings in the BIOS 
to try and overcome the problem when using the snap of 9.2-PRERELEASE as 
well so I will try that too. Ultimately if nothing likes the SSD I will 
try a regular SATA drive as last resort.


The drive is an Intel 520 Series 240GB model, FWIW.

Thanks!

-Jr



Re: experience with 9.2-PRERELEASE

2013-07-18 Thread Glen Barber
On Wed, Jul 17, 2013 at 08:49:39PM -0700, John Reynolds wrote:
 On 7/17/2013 5:48 PM, Glen Barber wrote:
 Hi John,
 
 
 Do you have a SATA drive you can try with this hardware?  It would be
 useful to know if that works, or same errors, etc.
 
 Glen
 
 I do, and that was my plan for tomorrow. A bit under the weather
 today. I will most definitely report back any findings. Thanks for
 your reply!
 

John, in addition to suggestions/replies from others, can you also try
the 10.0-CURRENT snapshot?  In particular, if your problem continues
with a SATA drive, I am curious if the problem still exists in head/.

http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/

Thanks.

Glen





Re: experience with 9.2-PRERELEASE

2013-07-18 Thread Steven Hartland

You can use hint.ahcich.X.sata_rev to limit the speed.

See the ahci(4) man page for more details.
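
As a sketch of what that could look like, assuming the SSD sits on channel 0
(the timeouts reported earlier were on ahcich0) and that the hint is set from
/boot/loader.conf:

```
# /boot/loader.conf
# Cap AHCI channel 0 at SATA revision 2 (3Gbps).
# Per ahci(4): 1 = 1.5Gbps, 2 = 3Gbps, 3 = 6Gbps.
hint.ahcich.0.sata_rev="2"
```

After a reboot the negotiated speed should show up in the boot messages for
the disk, in the same place where 300.000MB/s is reported for a 3Gbps link.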

- Original Message - 
From: John Reynolds john...@reynoldsnet.org

To: Steven Hartland kill...@multiplay.co.uk
Cc: sta...@freebsd.org
Sent: Thursday, July 18, 2013 4:48 PM
Subject: Re: experience with 9.2-PRERELEASE



[...]


Re: experience with 9.2-PRERELEASE

2013-07-18 Thread John Reynolds

John, in addition to suggestions/replies from others, can you also try
the 10.0-CURRENT snapshot?  In particular, if your problem continues
with a SATA drive, I am curious if the problem still exists in head/.

 http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/

Thanks.

Yes, I can try that. I will try a basic install with -current and see if 
a) the USB problem exists b) if I am getting the timeout errors with 
the SSD before I go off and tweak speed settings / USB settings.


-jr



Re: experience with 9.2-PRERELEASE

2013-07-18 Thread Glen Barber
On Thu, Jul 18, 2013 at 08:48:28AM -0700, John Reynolds wrote:
 On 7/18/2013 12:36 AM, Steven Hartland wrote:
 Is the SSD running at 6Gbps? If so have you tried limiting the speed
 to 3Gbps?
 I would imagine so. It has a 6Gbps interface and the Z87 board does
 also--so I can only imagine it's trying to go as fast as possible by
 default. I will fiddle with the BIOS and see if I can limit it and
 see if that prevents the timeout problems I was having after the
 9.1-R install. Somebody else suggested changing the USB settings in
 the BIOS to try and overcome the problem when using the snap of
 9.2-PRERELEASE as well so I will try that too. Ultimately if nothing
 likes the SSD I will try a regular SATA drive as last resort.
 
 The drive is an Intel 520 Series 240Gb model, FWIW.
 

FWIW, I had one of these in my laptop for about a year, and can confirm
that it should work fine, at least on 10.0-CURRENT.

Glen





Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-18 Thread George Hartzell
Andriy Gapon writes:
  on 17/07/2013 23:47 George Hartzell said the following:
   How should I move forward with this?
  
  Could you please try to reproduce this problem using a kernel built with
  INVARIANTS options?

I added the INVARIANT_SUPPORT and INVARIANTS options to the GENERIC
kernel, rebuilt it, and installed it; running through my test case still
generated a lot of invalid flac files.  I'm not sure what the options
are/were supposed to do though, it looks like they generally lead to
KASSERTs, which lead to abort()'s.  Nothing in /var/log/messages or on
the console.

g.



Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-18 Thread George Hartzell
Richard Todd writes:
  George Hartzell hartz...@alerce.com writes:
  
   Hi All,
  
   I have what I think is a ZFS related bug.
   [...]
 
  [summary: Picard seems to trigger an mmap consistency bug in ZFS].
  
  [...]
  Anyway, what I'd suggest is the following: see if my patch for py-mutagen
  disabling the mmap() in those two functions lets you run picard reliably.

Removing the mmap support from those two routines seems to avoid the
issue.

  If so, then the issue is triggered by one or both of those two routines;
  hack them to print out the exact offsets used on each call and use that to 
  try and code up a simple C++ test case.  
  [...]

Your test case doesn't use mmap, I assume that you've offered it up as
a hint, not as something that's nearly done.  The shell script in
particular seems useful.

In my case I'd want to find a particular set of file size, offset, and
insertion size that triggers the problem and code up a c/c++ equiv. of
the mmap calls that py-mutagen does.  Right?

I'm hesitant about that.  I believe (and will try to prove) that the
problem does not occur deterministically for a particular track
between different test runs.  I'm worried that it's not as simple as
"using mmap to insert 27 bytes into a 1024-byte file at pos 42 causes
corruption" but rather that it depends on a more complex set of
interactions.

My next step will be to see if a track that has trouble in one run has
trouble in another.  If not, then I'm not sure that a simple test will
be successful.
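
For illustration, a stand-alone check of the "insert N bytes at an offset via
mmap" pattern might look like the sketch below. It is written from scratch and
is not py-mutagen's actual code; the function names and the default numbers
(1024, 42, 27, taken from the example above) are placeholders.

```python
import mmap
import os
import tempfile


def insert_gap_mmap(path, offset, size):
    """Open a gap of `size` zero bytes at `offset`, shifting the tail via mmap."""
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        old_len = f.tell()
        # Grow the file first so the mapping can cover the new length.
        f.write(b"\x00" * size)
        f.flush()
        with mmap.mmap(f.fileno(), old_len + size) as m:
            # Move everything from `offset` onward up by `size` bytes.
            m.move(offset + size, offset, old_len - offset)
            # Fill the gap with zeros so the result is fully defined.
            m[offset:offset + size] = b"\x00" * size
            m.flush()


def run_trial(file_len=1024, offset=42, size=27, directory=None):
    """Do one insert on a scratch file and compare with the expected bytes."""
    original = os.urandom(file_len)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(original)
        insert_gap_mmap(path, offset, size)
        # Read back through the ordinary read() path, not the mapping.
        with open(path, "rb") as f:
            result = f.read()
    finally:
        os.unlink(path)
    expected = original[:offset] + b"\x00" * size + original[offset:]
    return result == expected


if __name__ == "__main__":
    print("consistent" if run_trial() else "MISMATCH")
```

Pointing `directory` at a ZFS dataset and calling run_trial() in a loop with
varying sizes would be one way to look for a mismatch, although if the problem
really is non-deterministic a loop like this may well never trigger it.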

g.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-18 Thread Alexander Yerenkow
I know nobody loves me-too emails, but I'll try :)
There's rtorrent, which uses mmap. I have had cases (related to reboots)
where big files (or average-sized files in many-file torrents) ended up
with a broken checksum without any good reason.
The author of rtorrent, not very politely, always assumes that it is the
other filesystems that are broken, while rtorrent has simple logic (using
mmap)...

This post is pretty interesting:
http://libtorrent.rakshasa.no/ticket/483

In the end I just switched to transmission.

Anyway, there really could be some weird/rare bugs with ZFS and mmap. Or
just ZFS.
I hope this helps at least to narrow down the direction for potential bug
hunting.



2013/7/18 George Hartzell hartz...@alerce.com

 Richard Todd writes:
   George Hartzell hartz...@alerce.com writes:
  
Hi All,
   
I have what I think is a ZFS related bug.
[...]
  
   [summary: Picard seems to trigger an mmap consistency bug in ZFS].
  
   [...]
   Anyway, what I'd suggest is the following: see if my patch for
 py-mutagen
   disabling the mmap() in those two functions lets you run picard
 reliably.

 Removing the mmap support from those two routines seems to avoid the
 issue.

   If so, then the issue is triggered by one or both of those two routines;
   hack them to print out the exact offsets used on each call and use that
 to
   try and code up a simple C++ test case.
   [...]

 Your test case doesn't use mmap, I assume that you've offered it up as
 a hint, not as something that's nearly done.  The shell script in
 particular seems useful.

 In my case I'd want to find a particular set of file size, offset, and
 insertion size that triggers the problem and code up a c/c++ equiv. of
 the mmap calls that py-mutagen does.  Right?

 I'm hesitant about that.  I believe (and will try to prove) that the
 problem does not occur deterministically for a particular track
 between different test runs.  I'm worried that it's not as simple as
 using mmap to insert 27 bytes into a 1024 bytes file at pos 42 causes
 corruption but rather that it depends on a more complex set of
 interactions.

 My next step will be to see if a track that has trouble in one run has
 trouble in another.  If not, then I'm not sure that a simple test will
 be successful.

 g.




-- 
Regards,
Alexander Yerenkow


Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Dr Josef Karthauser

On 18 Jul 2013, at 13:07, Andrea Venturoli m...@netfence.it wrote:

 Perhaps they are WD Green drives?

They're WD RE2-GP 1 TB drives (model WD1000FYPS), not sure if that's green or 
not.

 In that case, other than quoting Bob's suggestion about avoiding them, 
 there's something you can do:
 a) turn off the drives' power-saving features (this is done through a DOS 
 utility you can download);
 b) try different controllers and/or different OS releases.

I'm committed to FreeBSD, as the machine is already rolled out and in a data 
centre ;).

 You'll find a lot on this problem if you search the web.
 There's also a report of mine you can search on this ML, regarding FreeBSD 
 specifically.

I'll see if I can find it. Thanks.

Joe



Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Dr Josef Karthauser
On 18 Jul 2013, at 08:33, Steven Hartland kill...@multiplay.co.uk wrote:

 What chassis is this?

Hey Steven,

It's a Supermicro CSE-813MTQ-350CB.

Cheers,
Joe



Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-18 Thread Richard Todd
On Thu, Jul 18, 2013 at 11:40:51AM -0700, George Hartzell wrote:
 Removing the mmap support from those two routines seems to avoid the
 issue.

Aha. 

   If so, then the issue is triggered by one or both of those two routines;
   hack them to print out the exact offsets used on each call and use that to 
   try and code up a simple C++ test case.  
   [...]
 
 Your test case doesn't use mmap, I assume that you've offered it up as
 a hint, not as something that's nearly done.  The shell script in
 particular seems useful.

Um, go look at gen4.cpp again.  It uses mmap().  The insert_bytes and
delete_bytes functions should work the same way as the (mmap-using path of)
the functions of the same name in py-mutagen. 
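
For readers without gen4.cpp or the py-mutagen source to hand, here is a minimal Python sketch of the general technique being discussed: grow the file, map it, and shift the tail with a memmove inside the mapping. It is not the py-mutagen code and not gen4.cpp; the function body, the zero-filled gap, and the demo file contents are all assumptions made for illustration.

```python
import mmap
import os
import tempfile


def insert_bytes(path, size, offset):
    """Open a gap of `size` zero bytes at `offset`, moving the tail via mmap."""
    with open(path, "r+b") as fobj:
        fobj.seek(0, os.SEEK_END)
        filesize = fobj.tell()
        if size < 1 or offset < 0 or offset > filesize:
            raise ValueError("bad size or offset")
        movesize = filesize - offset
        # Grow the file first so the mapping can cover the new length.
        fobj.write(b"\x00" * size)
        fobj.flush()
        with mmap.mmap(fobj.fileno(), filesize + size) as mapped:
            # Shift the tail right by `size` bytes (overlap-safe move).
            mapped.move(offset + size, offset, movesize)
            # Clear the gap that the caller will fill in later.
            mapped[offset:offset + size] = b"\x00" * size
            mapped.flush()


# Demo on a throwaway file with made-up contents.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"HEADERpayload")
os.close(fd)
insert_bytes(demo_path, 4, 6)
with open(demo_path, "rb") as fobj:
    demo_result = fobj.read()
os.remove(demo_path)
print(demo_result)
```

A test case along these lines would log the (file size, offset, size) triple on every call, which is the data the suggestion above asks for.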


 In my case I'd want to find a particular set of file size, offset, and
 insertion size that triggers the problem and code up a c/c++ equiv. of
 the mmap calls that py-mutagen does.  Right?

Yeah. 




Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Charles Swiger
Hi--

On Jul 18, 2013, at 12:13 PM, Dr Josef Karthauser j...@karthauser.co.uk wrote:
 On 18 Jul 2013, at 13:07, Andrea Venturoli m...@netfence.it wrote:
 
 Perhaps they are WD Green drives?
 
 They're WD RE2-GP 1 TB drives (model WD1000FYPS) , not sure if that's green 
 or not.

Yes, those are WDC's Green drives, although they are the higher-grade 
version compared to standard desktop drives, and are supposed to have 
firmware which plays nicely with RAID (TLER, time-limited error recovery).

Updating the firmware and increasing the timeout before these spin down 
automagically is likely to help, but as Andrea noted, such drives do have quite 
a history of timeout problems due to excessive head parking and their power 
conservation attempts.

Regards,
-- 
-Chuck



Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Dr Josef Karthauser

On 18 Jul 2013, at 20:31, Charles Swiger cswi...@mac.com wrote:

 Hi--
 
 On Jul 18, 2013, at 12:13 PM, Dr Josef Karthauser j...@karthauser.co.uk 
 wrote:
 On 18 Jul 2013, at 13:07, Andrea Venturoli m...@netfence.it wrote:
 
 Perhaps they are WD Green drives?
 
 They're WD RE2-GP 1 TB drives (model WD1000FYPS) , not sure if that's green 
 or not.
 
 Yes, those are WDC's Green drives, although they are also the higher grade 
 version as compared to standard desktop drives which are supposed to have 
 firmware which plays nice with RAID (TLER, time-limited error recovery).
 
 Updating the firmware and increasing the timeout before these spin down 
 automagically is likely to help, but as Andrea noted, such drives do have 
 quite a history of timeout problems due to excessive head parking and their 
 power conservation attempts.

We also wondered whether it was the motherboard, and so we've replaced it! Hope 
that that works!

But, from what's being said here, it looks like that might not be the case. :/ 
Although, we've been up for 5 days now with no recurrences of the previous 
issue.

Joe


Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Dr Josef Karthauser
On 18 Jul 2013, at 20:31, Charles Swiger cswi...@mac.com wrote:
 On Jul 18, 2013, at 12:13 PM, Dr Josef Karthauser j...@karthauser.co.uk 
 wrote:
 On 18 Jul 2013, at 13:07, Andrea Venturoli m...@netfence.it wrote:
 
 Perhaps they are WD Green drives?
 
 They're WD RE2-GP 1 TB drives (model WD1000FYPS) , not sure if that's green 
 or not.
 
 Yes, those are WDC's Green drives, although they are also the higher grade 
 version as compared to standard desktop drives which are supposed to have 
 firmware which plays nice with RAID (TLER, time-limited error recovery).
 
 Updating the firmware and increasing the timeout before these spin down 
 automagically is likely to help, but as Andrea noted, such drives do have 
 quite a history of timeout problems due to excessive head parking and their 
 power conservation attempts.

They're currently on firmware 02.01B01, btw. Not sure if that's the latest or 
not.

Joe



Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Steven Hartland
- Original Message - 
From: Dr Josef Karthauser j...@karthauser.co.uk

On 18 Jul 2013, at 08:33, Steven Hartland kill...@multiplay.co.uk wrote:


What chassis is this?


Hey Steven,

It's a Supermicro CSE-813MTQ-350CB.


We've seen issues on Supermicro chassis before which cause
timeouts and, in extreme cases, device drops. If you can, try
wiring the disks up directly to the MB via SATA cables, bypassing
the hotswap midplane, and see if that helps.

   Regards
   Steve




Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Andrea Venturoli

On 07/18/13 21:13, Dr Josef Karthauser wrote:


b) try different controllers and/or different OS releases.


I'm committed to FreeBSD, as the machine is already rolled out and in a data 
centre ;).


I said different OS releases, not a different OS!
I wouldn't utter such a blasphemy :)

 bye
av.


Re: Drive failures with ada on FreeBSD-9.1, driver bug or wiring issue?

2013-07-18 Thread Andrea Venturoli

On 07/18/13 21:31, Charles Swiger wrote:



 Updating the firmware and increasing the timeout before these spin down
 automagically is likely to help, but as Andrea noted, such drives do
 have quite a history of timeout problems due to excessive head parking
 and their power conservation attempts.

Just for the record, I've been using them for several months without a 
hitch; it's just a matter of finding the correct settings/firmware/OS 
version/controller.
This is to say you should be able to get them to work, although you might 
require some luck (or some sort of divination).


 bye
av.


Re: experience with 9.2-PRERELEASE

2013-07-18 Thread John Reynolds

On 7/18/2013 8:49 AM, Glen Barber wrote:

On Wed, Jul 17, 2013 at 08:49:39PM -0700, John Reynolds wrote:

today. I will most definitely report back any findings. Thanks for
your reply!


John, in addition to suggestions/replies from others, can you also try
the 10.0-CURRENT snapshot?  In particular, if your problem continues
with a SATA drive, I am curious if the problem still exists in head/.

 http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/

Thanks.


Thanks to Glen and Steve and all those who replied to my initial 
posting. I am quite pleased after having been figuratively gone from 
(but still running) FreeBSD for coming up on 2 years (hey--when it's so 
rock solid, it's easy when you don't have to ask questions :) that I can 
come back to these lists and still find lots of people willing to help! 
You guys (and gals) are awesome!


Anyway, it seems I have solved my problems with this new hardware. There 
were really two issues I bumped into. After installing 9.1-R I saw this 
timeout problem on the ahcich1 device and it would hang for about 5 
minutes then continue to boot. Then I tried 9.2-PRE but there the 
keyboard was not recognized and I kept getting infinite ugen0.2: 
Unknown . (disconnected) errors at the install screen. One person 
said to fiddle in the BIOS with the USB settings. I went into the BIOS 
and disabled Intel USB 3.0 Mode support (this is a very new 
motherboard with USB 3.0). That fixed the issue! For those curious about 
-current, I also tried to install that and it experienced the same USB 
problems until I disabled this 3.0 mode.


Once I got past that and installed 9.2-PRE again, I let it timeout again 
during boot. Then I looked at the dmesg output much more closely than I 
could have during the boot from DVD. It turns out that I only THOUGHT it 
was a timeout to the SSD. The Intel Series 520 SSD was NOT the issue. I 
was getting a timeout on ahcich1 which from the dmesg output was my DVD 
burner! The SSD was on ahcich5. The only difference here is that by 
sheer happenstance, I connected the SSD to the SATA port owned by the 
Intel Z87 (Lynx Point) controller. The DVD burner (which is probably 5 
years old) was hooked to ASRock's own ASMedia ASM1061 controller. 
Apparently this controller doesn't play well with my older DVD burner. I 
switched ports and put both on the Intel controller and BOOM! Success! 
No timeouts, no funny business. Nothing. At this point I don't mind at 
all turning off the USB 3.0 mode because I have yet to actually see a 
3.0 device in the wild and I certainly won't own one anytime soon. :) 
So it's kind of a moot point. I'm sure the USB developers will see 
this problem and tackle it in time. I'm just thrilled to be booting on 
this new H/W!
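
As a hedged illustration of the dmesg check described above, the following Python sketch maps AHCI channels to the devices attached to them, so a timeout on a channel can be attributed to the right device. The sample text is hypothetical: the channels ahcich5 and ahcich1 come from this message, but the device names ada0 and cd0 and the exact lines are assumptions.

```python
import re

# Matches CAM attach lines such as "cd0 at ahcich1 bus 0 scbus1 target 0 lun 0".
ATTACH_RE = re.compile(r"^([a-z]+\d+) at (ahcich\d+) bus \d+")


def channel_map(dmesg_text):
    """Return {channel: [device, ...]} for every attach line found."""
    mapping = {}
    for line in dmesg_text.splitlines():
        match = ATTACH_RE.match(line.strip())
        if match:
            device, channel = match.groups()
            devices = mapping.setdefault(channel, [])
            if device not in devices:
                devices.append(device)
    return mapping


# Hypothetical dmesg excerpt; not the poster's actual output.
SAMPLE = """\
ada0 at ahcich5 bus 0 scbus5 target 0 lun 0
cd0 at ahcich1 bus 0 scbus1 target 0 lun 0
ahcich1: Timeout on slot 0 port 0
"""

print(channel_map(SAMPLE))
```

With a mapping like this in hand, a timeout reported against ahcich1 points at the optical drive rather than the SSD.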


Thanks again to all on the list that replied!

-Jr



Re: experience with 9.2-PRERELEASE

2013-07-18 Thread Glen Barber
On Thu, Jul 18, 2013 at 01:28:10PM -0700, John Reynolds wrote:
 On 7/18/2013 8:49 AM, Glen Barber wrote:
 On Wed, Jul 17, 2013 at 08:49:39PM -0700, John Reynolds wrote:
 today. I will most definitely report back any findings. Thanks for
 your reply!
 
 John, in addition to suggestions/replies from others, can you also try
 the 10.0-CURRENT snapshot?  In particular, if your problem continues
 with a SATA drive, I am curious if the problem still exists in head/.
 
  http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/
 
 Thanks.
 
 Thanks to Glen and Steve and all those who replied to my initial
 posting. I am quite pleased after having been figuratively gone
 from (but still running) FreeBSD for coming up on 2 years (hey--when
 it's so rock solid, it's easy when you don't have to ask questions
 :) that I can come back to these lists and still find lots of people
 willing to help! You guys (and gals) are awesome!
 
 Anyway, it seems I have solved my problems with this new hardware.
 There were really two issues I bumped into. After installing 9.1-R I
 saw this timeout problem on the ahcich1 device and it would hang for
 about 5 minutes then continue to boot. Then I tried 9.2-PRE but
 there the keyboard was not recognized and I kept getting infinite
 ugen0.2: Unknown . (disconnected) errors at the install
 screen. One person said to fiddle in the BIOS with the USB settings.
 I went into the BIOS and disabled Intel USB 3.0 Mode support (this
 is a very new motherboard with USB 3.0). That fixed the issue! For
 those curious about -current, I also tried to install that and it
 experienced the same USB problems until I disabled this 3.0 mode.
 
 Once I got past that and installed 9.2-PRE again, I let it timeout
 again during boot. Then I looked at the dmesg output much more
 closely than I could have during the boot from DVD. It turns out
 that I only THOUGHT it was a timeout to the SSD. The Intel Series
 520 SSD was NOT the issue. I was getting a timeout on ahcich1 which
 from the dmesg output was my DVD burner! The SSD was on ahcich5. The
 only difference here is that by sheer happenstance, I connected the
 SSD to the SATA port owned by the Intel Z87 (Lynx Point) controller.
 The DVD burner (which is probably 5 years old) was hooked to
 ASRock's own ASMedia ASM1061 controller. Apparently this
 controller doesn't play well with my older DVD burner. I switched
 ports and put both on the Intel controller and BOOM! Success! No
 timeouts, no funny business. Nothing. At this point I don't mind at
 all turning off the USB 3.0 mode because I have yet to actually see
 a 3.0 device in the wild and I certainly won't own one anytime soon.
 :) So it's kind of a moot point. I'm sure the USB developers
 will see this problem and tackle it in time. I'm just thrilled to be
 booting on this new H/W!
 
 Thanks again to all on the list that replied!
 

Great to hear.  Thank you for trying different suggestions, and
reporting back the results.  I'm glad things worked out (though, I'm
still a bit worried about the xhci(4) in your case...).

Glen





Re: experience with 9.2-PRERELEASE

2013-07-18 Thread Adam Vande More
On Thu, Jul 18, 2013 at 3:28 PM, John Reynolds john...@reynoldsnet.org wrote:

 One person said to fiddle in the BIOS with the USB settings. I went into
 the BIOS and disabled Intel USB 3.0 Mode support (this is a very new
 motherboard with USB 3.0). That fixed the issue! For those curious about
 -current, I also tried to install that and it experienced the same USB
 problems until I disabled this 3.0 mode.


This is worth pursuing on freebsd-usb@ or by filing a PR.  The only way it
gets fixed is if the right people know about it.

-- 
Adam Vande More


9.2PRERELEASE ZFS panic in lzjb_compress

2013-07-18 Thread olivier
Hi,
Running 9.2-PRERELEASE #19 r253313 I got the following panic

Fatal trap 12: page fault while in kernel mode
cpuid = 22; apic id = 46
fault virtual address   = 0xff827ebca30c
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x81983055
stack pointer   = 0x28:0xffcf75bd60a0
frame pointer   = 0x28:0xffcf75bd68f0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 0 (zio_write_issue_hig)
trap number = 12
panic: page fault
cpuid = 22
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a/frame 0xffcf75bd5b30
kdb_backtrace() at kdb_backtrace+0x37/frame 0xffcf75bd5bf0
panic() at panic+0x1ce/frame 0xffcf75bd5cf0
trap_fatal() at trap_fatal+0x290/frame 0xffcf75bd5d50
trap_pfault() at trap_pfault+0x211/frame 0xffcf75bd5de0
trap() at trap+0x344/frame 0xffcf75bd5fe0
calltrap() at calltrap+0x8/frame 0xffcf75bd5fe0
--- trap 0xc, rip = 0x81983055, rsp = 0xffcf75bd60a0, rbp = 0xffcf75bd68f0 ---
lzjb_compress() at lzjb_compress+0x185/frame 0xffcf75bd68f0
zio_compress_data() at zio_compress_data+0x92/frame 0xffcf75bd6920
zio_write_bp_init() at zio_write_bp_init+0x24b/frame 0xffcf75bd6970
zio_execute() at zio_execute+0xc3/frame 0xffcf75bd69b0
taskqueue_run_locked() at taskqueue_run_locked+0x74/frame 0xffcf75bd6a00
taskqueue_thread_loop() at taskqueue_thread_loop+0x46/frame 0xffcf75bd6a20
fork_exit() at fork_exit+0x11f/frame 0xffcf75bd6a70
fork_trampoline() at fork_trampoline+0xe/frame 0xffcf75bd6a70
--- trap 0, rip = 0, rsp = 0xffcf75bd6b30, rbp = 0 ---

lzjb_compress+0x185 corresponds to line 85 in
80         cpy = src - offset;
81         if (cpy >= (uchar_t *)s_start && cpy != src &&
82             src[0] == cpy[0] && src[1] == cpy[1] && src[2] == cpy[2]) {
83                 *copymap |= copymask;
84                 for (mlen = MATCH_MIN; mlen < MATCH_MAX; mlen++)
85                         if (src[mlen] != cpy[mlen])
86                                 break;
87                 *dst++ = ((mlen - MATCH_MIN) << (NBBY - MATCH_BITS)) |
88                     (offset >> NBBY);
89                 *dst++ = (uchar_t)offset;

I think it's the first time I've seen this panic. It happened while doing a
send/receive. I have two pools with lzjb compression; I don't know which of
these pools caused the problem, but one of them was the source of the
send/receive.
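
For anyone reading the listing above without the source open, here is a small Python sketch of what lines 87-89 do with the result of the match loop: the match length and back-reference offset are packed into two bytes. The constant values are assumptions taken from memory of lzjb.c (MATCH_BITS 6, MATCH_MIN 3, NBBY 8) and should be checked against the tree; the decode half mirrors what the decompressor is expected to do. It says nothing about the cause of the panic.

```python
# Assumed constants; verify against lzjb.c in the source tree.
NBBY = 8
MATCH_BITS = 6
MATCH_MIN = 3
MATCH_MAX = (1 << MATCH_BITS) + (MATCH_MIN - 1)   # 66
OFFSET_MASK = (1 << (16 - MATCH_BITS)) - 1        # 1023


def encode_match(mlen, offset):
    """Pack a match length and offset into two bytes (lines 87-89)."""
    if not MATCH_MIN <= mlen <= MATCH_MAX:
        raise ValueError("match length out of range")
    if not 0 < offset <= OFFSET_MASK:
        raise ValueError("offset out of range")
    first = ((mlen - MATCH_MIN) << (NBBY - MATCH_BITS)) | (offset >> NBBY)
    second = offset & 0xFF  # same effect as the (uchar_t) cast
    return first, second


def decode_match(first, second):
    """Recover (mlen, offset) from the two packed bytes."""
    mlen = (first >> (NBBY - MATCH_BITS)) + MATCH_MIN
    offset = ((first << NBBY) | second) & OFFSET_MASK
    return mlen, offset


print(encode_match(10, 300))
print(decode_match(*encode_match(10, 300)))
```

The faulting statement on line 85 is the one that reads src[mlen] and cpy[mlen] while the match is being extended, before any of this packing happens.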

I only have a textdump but I'm happy to try to provide more information
that could help anyone look into this.
Thanks
Olivier


Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-18 Thread Andre Albsmeier
On Thu, 04-Jul-2013 at 19:25:28 +0200, Konstantin Belousov wrote:
 On Thu, Jul 04, 2013 at 04:29:19PM +0200, Andre Albsmeier wrote:
  OK, patch is applied. I will reboot the machine later
  and see what happens tomorrow in the morning. However,
  it might take a few days since the last 2 weeks all was
  fine.
  
  BTW, should this patch be used in general or is it just
  for debugging? My understanding is that it is something
  which could stay in the code...
 
 Patch is to improve debugging.
 
 I will probably commit it after the issue is closed.  The argument against
 the commit is that the change imposes a small performance penalty
 due to the save and restore of %ebp (I doubt that this is measurable
 by any means).  Also, arguably, such a change should be done for all
 functions in support.s, but bcopy() is the hot spot.

Thanks to this patch, we (you ;-) were able to track down the problem.
So how are we going to deal with this debugging patch itself?
My suggestion would be to #ifdef it somehow, so on one hand it will
be available in future (and with bcopy being used a lot, the probability
is high that it will help in other places), and on the other hand it
won't steal cycles during normal use.

-Andre