Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-20 Thread Travis Tabbal
The latter; we run these VMs over NFS anyway and had ESXi boxes under test already. We were already separating data exports from VM exports. We use an in-house-developed configuration management/bare-metal system which allows us to install new machines pretty easily. In this case we just

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-19 Thread Jeroen Roodhart
How did your migration to ESXi go? Are you using it on the same hardware or did you just switch that server to an NFS server and run the VMs on another box? The latter; we run these VMs over NFS anyway and had ESXi boxes under test already. We were already separating data exports from VM

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-12 Thread Jeroen Roodhart
I'm running nv126 XvM right now. I haven't tried it without XvM. Without XvM we do not see these issues. We're running the VMs through NFS now (using ESXi)...

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-12 Thread Travis Tabbal
I'm running nv126 XvM right now. I haven't tried it without XvM. Without XvM we do not see these issues. We're running the VMs through NFS now (using ESXi)... Interesting. It sounds like it might be an XvM specific bug. I'm glad I mentioned that in my bug report to Sun. Hopefully they

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-12 Thread James C. McPherson
Travis Tabbal wrote: I'm running nv126 XvM right now. I haven't tried it without XvM. Without XvM we do not see these issues. We're running the VMs through NFS now (using ESXi)... Interesting. It sounds like it might be an XvM specific bug. I'm glad I mentioned that in my bug report to Sun.

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-03 Thread Jeroen Roodhart
We see the same issue on a x4540 Thor system with 500G disks: lots of: ... Nov 3 16:41:46 uva.nl scsi: [ID 107833 kern.warning] WARNING: /p...@3c,0/pci10de,3...@f/pci1000,1...@0 (mpt5): Nov 3 16:41:46 encore.science.uva.nl Disconnected command timeout for Target 7 ... This system is

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-31 Thread Travis Tabbal
I am also running 2 of the Supermicro cards. I just upgraded to b126 and it seems improved. I am running a large file copy locally. I get these warnings in the dmesg log. When I do, I/O seems to stall for about 60sec. It comes back up fine, but it's very annoying. Any hints? I have 4 disks per

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-26 Thread David Turnbull
I'm having similar issues, with two AOC-USAS-L8i Supermicro 1068e cards (mpt2 and mpt3), running 1.26.00.00-IT firmware. It seems to only affect a specific revision of disk. (???) sd67 Soft Errors: 0 Hard Errors: 127 Transport Errors: 3416 Vendor: ATA Product: WDC WD10EACS-00D Revision: 1A01
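
The per-device error counters quoted above are in the format produced by iostat -E. A minimal sketch for collecting the same counters and following the mpt warnings as they arrive; these are standard Solaris/OpenSolaris commands, nothing specific to this poster's setup:

  # Per-device soft/hard/transport error counters (instance names like sd67)
  iostat -E

  # Follow the mpt timeout/reset warnings in the system log
  tail -f /var/adm/messages | grep -i mpt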

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-25 Thread Adam Cheal
So, while we are working on resolving this issue with Sun, let me approach this from another perspective: what kind of controller/drive ratio would be the minimum recommended to support a functional OpenSolaris-based archival solution? Given the following: - the vast majority of IO to the

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Markus Kovero
On behalf of Richard Elling [richard.ell...@gmail.com] Sent: 24 October 2009 7:36 To: Adam Cheal Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] SNV_125 MPT warning in logfile ok, see below... On Oct 23, 2009, at 8:14 PM, Adam Cheal wrote: Here is an example of the pool

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Adam Cheal
The iostat I posted previously was from a system on which we had already tuned zfs:zfs_vdev_max_pending down to 10 (visible as the max of about 10 in actv per disk). I reset this value in /etc/system to 7, rebooted, and started a scrub. iostat output showed busier disks (%b is higher,
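
For reference, the two usual ways of applying the queue-depth change discussed above; a minimal sketch using the tunable name from these builds, zfs_vdev_max_pending, and the value 7 from this post:

  # Persistent: add to /etc/system and reboot
  set zfs:zfs_vdev_max_pending = 7

  # Immediate, on the running kernel (0t7 = decimal 7)
  echo zfs_vdev_max_pending/W0t7 | mdb -kw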

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Markus Kovero
From: zfs-discuss-boun...@opensolaris.org [zfs-discuss-boun...@opensolaris.org] on behalf of Adam Cheal [ach...@pnimedia.com] Sent: 24 October 2009 12:49 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] SNV_125 MPT

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 4:49 AM, Adam Cheal ach...@pnimedia.com wrote: The iostat I posted previously was from a system we had already tuned the zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10 in actv per disk). I reset this value in /etc/system to 7, rebooted,

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 11:20 AM, Tim Cook t...@cook.ms wrote: On Sat, Oct 24, 2009 at 4:49 AM, Adam Cheal ach...@pnimedia.com wrote: The iostat I posted previously was from a system we had already tuned the zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10 in

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Richard Elling
more below... On Oct 24, 2009, at 2:49 AM, Adam Cheal wrote: The iostat I posted previously was from a system we had already tuned the zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10 in actv per disk). I reset this value in /etc/system to 7, rebooted, and

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Carson Gaspar
On 10/24/09 9:43 AM, Richard Elling wrote: OK, here we see 4 I/Os pending outside of the host. The host has sent them on and is waiting for them to return. This means they are getting dropped either at the disk or somewhere between the disk and the controller. When this happens, the sd driver
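
To watch the queue state being analysed here, the extended iostat view shows, per device, how many commands are still queued in the host (wait) versus accepted by the device (actv); a sketch, with the 5-second interval chosen arbitrarily:

  # Extended statistics, descriptive device names, skip idle devices
  iostat -xnz 5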

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 12:30 PM, Carson Gaspar car...@taltos.org wrote: I saw this with my WD 500GB SATA disks (HDS725050KLA360) and LSI firmware 1.28.02.00 in IT mode, but I (almost?) always had exactly 1 stuck I/O. Note that my disks were one per channel, no expanders. I have _not_ seen it

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Adam Cheal
The controller connects to two disk shelves (expanders), one per port on the card. If you look back in the thread, you'll see our zpool config has one vdev per shelf. All of the disks are Western Digital (model WD1002FBYS-18A6B0) 1TB 7.2K, firmware rev. 03.00C06. Without actually matching up

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Bruno Sousa
Hi Cindy, I have a couple of questions about this issue: 1. I have exactly the same LSI controller in another server running OpenSolaris snv_101b, and so far no errors like these have been seen on that system. 2. Up to snv_118 I hadn't seen any problems; they only appear now with snv_125

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Bruno Sousa
Hi Adam, How many disks and zpool/zfs's do you have behind that LSI? I have a system with 22 disks and 4 zpools with around 30 zfs's and so far it works like a charm, even during heavy load. The OpenSolaris release is snv_101b. Bruno Adam Cheal wrote: Cindy: How can I view the bug report

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Our config is: OpenSolaris snv_118 x64 1 x LSISAS3801E controller 2 x 23-disk JBOD (fully populated, 1TB 7.2k SATA drives) Each of the two external ports on the LSI connects to a 23-disk JBOD. ZFS-wise we use 1 zpool with 2 x 22-disk raidz2 vdevs (1 vdev per JBOD). Each zpool has one ZFS
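
For context, a pool with that shape would be created roughly as follows; this is a sketch only, using bash brace expansion and placeholder cXtYd0 device names rather than the poster's actual targets:

  # One pool, two 22-disk raidz2 top-level vdevs (one per JBOD shelf)
  zpool create pool002 \
      raidz2 c2t{0..21}d0 \
      raidz2 c3t{0..21}d0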

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Jeremy f
What bug# is this under? I'm having what I believe is the same problem. Is it possible to just take the mpt driver from a prior build in the meantime? The output below is from the load the zpool scrub creates. This is on a Dell T7400 workstation with an OEMed LSI 1068E. I updated the firmware to the
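
On the question of dropping in an older mpt driver: if an older build's root is reachable (for example a mounted earlier boot environment), the swap would look roughly like the outline below. This is an assumption-heavy sketch, not something tested in this thread; the paths assume an x86 build and /mnt/oldbe is a placeholder for wherever the older build is mounted:

  # Copy the older mpt binaries over the current ones (32- and 64-bit)
  cp /mnt/oldbe/kernel/drv/mpt /kernel/drv/mpt
  cp /mnt/oldbe/kernel/drv/amd64/mpt /kernel/drv/amd64/mpt

  # Rebuild the boot archive and reboot to pick up the replaced driver
  bootadm update-archive
  init 6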

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Jeremy f
Sorry, running snv_123, Indiana. On Fri, Oct 23, 2009 at 11:16 AM, Jeremy f rysh...@gmail.com wrote: What bug# is this under? I'm having what I believe is the same problem. Is it possible to just take the mpt driver from a prior build in the meantime? The output below is from the load the zpool

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Just submitted the bug yesterday, on the advice of James, so I don't have a number to refer you to yet...the change request number is 6894775 if that helps or is directly related to the future bug ID. From what I've seen/read, this problem has been around for a while but only rears its ugly head

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread James C. McPherson
Adam Cheal wrote: Just submitted the bug yesterday, on the advice of James, so I don't have a number to refer you to yet...the change request number is 6894775 if that helps or is directly related to the future bug ID. From what I've seen/read, this problem has been around for a while but only

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Bruno Sousa
Could the reason Sun's x4540 Thumper has 6 LSI controllers be some sort of hidden problem found by Sun where the HBA resets, and due to time-to-market pressure the quick and dirty solution was to spread the load over multiple HBAs instead of a software fix? Just my 2 cents.. Bruno Adam Cheal wrote: Just

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Bruno Sousa
Hi Cindy, Thank you for the update, but it seems like I can't see any information specific to that bug. I can only see bugs 6702538 and 6615564, but according to their history, they were fixed quite some time ago. Can you by any chance present the information about bug 6694909?

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Richard Elling
On Oct 23, 2009, at 1:48 PM, Bruno Sousa wrote: Could the reason Sun's x4540 Thumper has 6 LSI controllers be some sort of hidden problem found by Sun where the HBA resets, and due to time-to-market pressure the quick and dirty solution was to spread the load over multiple HBAs instead of a software fix?

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 3:48 PM, Bruno Sousa bso...@epinfante.com wrote: Could the reason Sun's x4540 Thumper has 6 LSI controllers be some sort of hidden problem found by Sun where the HBA resets, and due to time-to-market pressure the quick and dirty solution was to spread the load over multiple HBAs

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
I don't think there was any intention on Sun's part to ignore the problem...obviously their target market wants a performance-oriented box and the x4540 delivers that. Each 1068E controller chip supports 8 SAS PHY channels = 1 channel per drive = no contention for channels. The x4540 is a

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 6:32 PM, Adam Cheal ach...@pnimedia.com wrote: I don't think there was any intention on Sun's part to ignore the problem...obviously their target market wants a performance-oriented box and the x4540 delivers that. Each 1068E controller chip supports 8 SAS PHY channels

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
LSI's sales literature on that card specs 128 devices which I take with a few hearty grains of salt. I agree that with all 46 drives pumping out streamed data, the controller would be overworked BUT the drives will only deliver data as fast as the OS tells them to. Just because the speedometer

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Richard Elling
On Oct 23, 2009, at 4:46 PM, Tim Cook wrote: On Fri, Oct 23, 2009 at 6:32 PM, Adam Cheal ach...@pnimedia.com wrote: I don't think there was any intention on Sun's part to ignore the problem...obviously their target market wants a performance-oriented box and the x4540 delivers that. Each

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 7:17 PM, Adam Cheal ach...@pnimedia.com wrote: LSI's sales literature on that card specs 128 devices which I take with a few hearty grains of salt. I agree that with all 46 drives pumping out streamed data, the controller would be overworked BUT the drives will only

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 7:17 PM, Richard Elling richard.ell...@gmail.com wrote: Tim has a valid point. By default, ZFS will queue 35 commands per disk. For 46 disks that is 1,610 concurrent I/Os. Historically, it has proven to be relatively easy to crater performance or cause problems with

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
And therein lies the issue. The excessive load that causes the IO issues is almost always generated locally from a scrub or a local recursive ls used to warm up the SSD-based zpool cache with metadata. The regular network IO to the box is minimal and is very read-centric; once we load the box
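
The warm-up pass described here is essentially a throwaway recursive listing; a minimal sketch, with /pool002 standing in for the real mountpoint:

  # Walk the filesystem so its metadata lands in the ARC/L2ARC
  ls -lR /pool002 > /dev/null 2>&1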

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Richard Elling
On Oct 23, 2009, at 5:32 PM, Tim Cook wrote: On Fri, Oct 23, 2009 at 7:17 PM, Richard Elling richard.ell...@gmail.com wrote: Tim has a valid point. By default, ZFS will queue 35 commands per disk. For 46 disks that is 1,610 concurrent I/Os. Historically, it has proven to be relatively
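
To confirm the default of 35 outstanding commands per disk on a running system before tuning it, the tunable can be read back with mdb; a sketch (the /D format prints the value in decimal):

  # Read the current per-vdev queue depth from the live kernel
  echo zfs_vdev_max_pending/D | mdb -k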

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Here is an example of the pool config we use: # zpool status pool: pool002 state: ONLINE scrub: scrub stopped after 0h1m with 0 errors on Fri Oct 23 23:07:52 2009 config: NAME STATE READ WRITE CKSUM pool002 ONLINE 0 0 0 raidz2 ONLINE

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Richard Elling
ok, see below... On Oct 23, 2009, at 8:14 PM, Adam Cheal wrote: Here is an example of the pool config we use: # zpool status pool: pool002 state: ONLINE scrub: scrub stopped after 0h1m with 0 errors on Fri Oct 23 23:07:52 2009 config: NAME STATE READ WRITE CKSUM

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Cindy Swearingen
Hi Bruno, I see some bugs associated with these messages (6694909) that point to an LSI firmware upgrade that causes these harmless errors to display. According to the 6694909 comments, this issue is documented in the release notes. As they are harmless, I wouldn't worry about them. Maybe

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
Cindy: How can I view the bug report you referenced? Standard methods show me the bug number is valid (6694909) but no content or notes. We are having similar messages appear with snv_118 with a busy LSI controller, especially during scrubbing, and I'd be interested to see what they mentioned

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread James C. McPherson
Adam Cheal wrote: Cindy: How can I view the bug report you referenced? Standard methods show me the bug number is valid (6694909) but no content or notes. We are having similar messages appear with snv_118 with a busy LSI controller, especially during scrubbing, and I'd be interested to see what

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
James: We are running Phase 16 on our LSISAS3801E's, and have also tried the recently released Phase 17 but it didn't help. All firmware NVRAM settings are default. Basically, when we put the disks behind this controller under load (e.g. scrubbing, recursive ls on large ZFS filesystem) we get

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread James C. McPherson
Adam Cheal wrote: James: We are running Phase 16 on our LSISAS3801E's, and have also tried the recently released Phase 17 but it didn't help. All firmware NVRAM settings are default. Basically, when we put the disks behind this controller under load (e.g. scrubbing, recursive ls on large ZFS

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
I've filed the bug, but was unable to include the prtconf -v output as the comments field only accepted 15000 chars total. Let me know if there is anything else I can provide/do to help figure this problem out as it is essentially preventing us from doing any kind of heavy IO to these pools,

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Carson Gaspar
On 10/22/09 4:07 PM, James C. McPherson wrote: Adam Cheal wrote: It seems to be timing out accessing a disk, retrying, giving up and then doing a bus reset? ... ugh. New bug time - bugs.opensolaris.org, please select Solaris / kernel / driver-mpt. In addition to the error messages and